Server Admin Log: Difference between revisions
imported>Stashbot (XioNoX: configure OSPF between cr2-drmrs and cr2-eqdfw)
imported>Stashbot (jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye)
(426 intermediate revisions by 3 users not shown)
== 2023-06-09 ==
* 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
* 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
* 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
* 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
* 20:38 btullis@cumin1001: END (ERROR) - Cookbook sre.aqs.roll-restart-reboot (exit_code=97) rolling restart_daemons on A:aqs
* 20:23 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs
* 17:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
* 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1016.eqiad.wmnet with OS buster
* 17:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49398 and previous config saved to /var/cache/conftool/dbconfig/20230609-173202-ladsgroup.json
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49397 and previous config saved to /var/cache/conftool/dbconfig/20230609-171656-ladsgroup.json
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49396 and previous config saved to /var/cache/conftool/dbconfig/20230609-170150-ladsgroup.json
* 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49395 and previous config saved to /var/cache/conftool/dbconfig/20230609-164644-ladsgroup.json
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49394 and previous config saved to /var/cache/conftool/dbconfig/20230609-163007-ladsgroup.json
* 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 16:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49393 and previous config saved to /var/cache/conftool/dbconfig/20230609-162946-ladsgroup.json
* 16:20 urandom: powercycling restbase1028
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49392 and previous config saved to /var/cache/conftool/dbconfig/20230609-161440-ladsgroup.json
* 16:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['snapshot1016']
* 16:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['snapshot1016']
* 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49391 and previous config saved to /var/cache/conftool/dbconfig/20230609-155934-ladsgroup.json
* 15:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49390 and previous config saved to /var/cache/conftool/dbconfig/20230609-154428-ladsgroup.json
* 15:30 andrewbogott: wikitech-static: deleted everything in /srv/mediawiki/images/wikitech/archive for [[phab:T338520|T338520]]
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49388 and previous config saved to /var/cache/conftool/dbconfig/20230609-152845-ladsgroup.json
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49387 and previous config saved to /var/cache/conftool/dbconfig/20230609-152824-ladsgroup.json
* 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
* 15:22 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
* 15:17 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49386 and previous config saved to /var/cache/conftool/dbconfig/20230609-151318-ladsgroup.json
* 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49385 and previous config saved to /var/cache/conftool/dbconfig/20230609-145812-ladsgroup.json
* 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49384 and previous config saved to /var/cache/conftool/dbconfig/20230609-144305-ladsgroup.json
* 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49383 and previous config saved to /var/cache/conftool/dbconfig/20230609-142731-ladsgroup.json
* 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 14:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 14:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49382 and previous config saved to /var/cache/conftool/dbconfig/20230609-142655-ladsgroup.json
* 14:14 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
* 14:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49381 and previous config saved to /var/cache/conftool/dbconfig/20230609-141149-ladsgroup.json
* 13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49380 and previous config saved to /var/cache/conftool/dbconfig/20230609-135643-ladsgroup.json
* 13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49379 and previous config saved to /var/cache/conftool/dbconfig/20230609-134137-ladsgroup.json
* 13:29 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
* 13:29 sukhe: start pybal on lvs2013
* 13:28 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
* 13:25 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49378 and previous config saved to /var/cache/conftool/dbconfig/20230609-132541-ladsgroup.json
* 13:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 13:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49377 and previous config saved to /var/cache/conftool/dbconfig/20230609-132520-ladsgroup.json
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49376 and previous config saved to /var/cache/conftool/dbconfig/20230609-131014-ladsgroup.json
* 13:07 sukhe: stop pybal on lvs2013 to test lvs2014
* 13:02 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs2014
* 13:02 sukhe: sudo cumin 'A:lvs and A:codfw' 'enable-puppet "CR 928818"'
* 13:01 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
* 12:59 sukhe: sudo cumin 'A:lvs and A:codfw' 'disable-puppet "CR 928818"'
* 12:57 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
* 12:57 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
* 12:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
* 12:55 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
* 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49373 and previous config saved to /var/cache/conftool/dbconfig/20230609-125508-ladsgroup.json
* 12:50 krinkle@deploy1002: Finished scap: {{Gerrit|I385d28d2edacb37}} (duration: 06m 59s)
* 12:43 krinkle@deploy1002: Started scap: {{Gerrit|I385d28d2edacb37}}
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49371 and previous config saved to /var/cache/conftool/dbconfig/20230609-124002-ladsgroup.json
* 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
* 12:29 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
* 12:27 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49370 and previous config saved to /var/cache/conftool/dbconfig/20230609-122303-ladsgroup.json
* 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49369 and previous config saved to /var/cache/conftool/dbconfig/20230609-122243-ladsgroup.json
* 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
* 12:15 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
* 12:13 aborrero@cumin2002: START - Cookbook sre.dns.netbox
* 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49368 and previous config saved to /var/cache/conftool/dbconfig/20230609-120737-ladsgroup.json
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 778 hosts
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49367 and previous config saved to /var/cache/conftool/dbconfig/20230609-115230-ladsgroup.json
* 11:52 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 778 hosts
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 1262 hosts
* 11:49 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 1262 hosts
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49366 and previous config saved to /var/cache/conftool/dbconfig/20230609-113724-ladsgroup.json
* 11:27 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49365 and previous config saved to /var/cache/conftool/dbconfig/20230609-112250-ladsgroup.json
* 11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49364 and previous config saved to /var/cache/conftool/dbconfig/20230609-112229-ladsgroup.json
* 11:20 sukhe: pcc-db1001: sudo systemctl start pcc_facts_processor.service
* 11:14 sukhe: sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49363 and previous config saved to /var/cache/conftool/dbconfig/20230609-110723-ladsgroup.json
* 11:02 sukhe: homer "cr*-codfw*" commit "Gerrit: 928113 add new LVS host lvs2014
* 10:53 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2014.codfw.wmnet with OS bullseye
* 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49362 and previous config saved to /var/cache/conftool/dbconfig/20230609-105217-ladsgroup.json
* 10:40 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
* 10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49361 and previous config saved to /var/cache/conftool/dbconfig/20230609-103711-ladsgroup.json
* 10:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
* 10:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49360 and previous config saved to /var/cache/conftool/dbconfig/20230609-102217-ladsgroup.json
* 10:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 10:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49359 and previous config saved to /var/cache/conftool/dbconfig/20230609-102156-ladsgroup.json
* 10:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
* 10:12 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 10:12 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 10:09 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 10:08 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49358 and previous config saved to /var/cache/conftool/dbconfig/20230609-100650-ladsgroup.json
* 09:57 elukey: increase <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change topic partitions (3->5) on kafka main clusters - [[phab:T338357|T338357]]
* 09:56 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:54 moritzm: installing jupyter-core security updates on bullseye
* 09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49357 and previous config saved to /var/cache/conftool/dbconfig/20230609-095144-ladsgroup.json
* 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49356 and previous config saved to /var/cache/conftool/dbconfig/20230609-093638-ladsgroup.json
* 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49355 and previous config saved to /var/cache/conftool/dbconfig/20230609-092141-ladsgroup.json
* 09:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 09:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49354 and previous config saved to /var/cache/conftool/dbconfig/20230609-090829-ladsgroup.json
* 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49353 and previous config saved to /var/cache/conftool/dbconfig/20230609-085322-ladsgroup.json
* 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49352 and previous config saved to /var/cache/conftool/dbconfig/20230609-083816-ladsgroup.json
* 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49351 and previous config saved to /var/cache/conftool/dbconfig/20230609-082310-ladsgroup.json
* 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49350 and previous config saved to /var/cache/conftool/dbconfig/20230609-080708-ladsgroup.json
* 08:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49349 and previous config saved to /var/cache/conftool/dbconfig/20230609-080637-ladsgroup.json
* 07:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49348 and previous config saved to /var/cache/conftool/dbconfig/20230609-075130-ladsgroup.json
* 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49347 and previous config saved to /var/cache/conftool/dbconfig/20230609-073624-ladsgroup.json
* 07:33 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1492.eqiad.wmnet
* 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49346 and previous config saved to /var/cache/conftool/dbconfig/20230609-072118-ladsgroup.json
* 07:19 moritzm: powercycling restbase2018 (kernel hung following what looks like I/O errors)
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49345 and previous config saved to /var/cache/conftool/dbconfig/20230609-070520-ladsgroup.json
* 07:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49344 and previous config saved to /var/cache/conftool/dbconfig/20230609-070459-ladsgroup.json
* 06:50 moritzm: installing wireshark security updates
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49343 and previous config saved to /var/cache/conftool/dbconfig/20230609-064953-ladsgroup.json
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster2005.codfw.wmnet
* 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster2005.codfw.wmnet
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster1005.eqiad.wmnet
* 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster1005.eqiad.wmnet
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: prometheus3001.esams.wmnet
* 06:48 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: prometheus3001.esams.wmnet
* 06:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
* 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49342 and previous config saved to /var/cache/conftool/dbconfig/20230609-063447-ladsgroup.json
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49341 and previous config saved to /var/cache/conftool/dbconfig/20230609-061941-ladsgroup.json
* 06:06 eileen: config {{Gerrit|97c57848}} -> {{Gerrit|6f4a9d19}} restart jobs
* 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49340 and previous config saved to /var/cache/conftool/dbconfig/20230609-060438-ladsgroup.json
* 06:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 06:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 05:53 eileen: civicrm upgraded from {{Gerrit|158896cc}} to {{Gerrit|5bbed553}}
* 05:52 eileen: config revision changed from {{Gerrit|8b71fa7a}} to {{Gerrit|97c57848}}
* 05:50 moritzm: installing cpio security updates
* 05:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 05:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 05:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 05:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 05:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 05:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49339 and previous config saved to /var/cache/conftool/dbconfig/20230609-052315-ladsgroup.json
* 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49338 and previous config saved to /var/cache/conftool/dbconfig/20230609-050809-ladsgroup.json
* 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49337 and previous config saved to /var/cache/conftool/dbconfig/20230609-045302-ladsgroup.json
* 04:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49336 and previous config saved to /var/cache/conftool/dbconfig/20230609-043756-ladsgroup.json
* 04:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49335 and previous config saved to /var/cache/conftool/dbconfig/20230609-042306-ladsgroup.json
* 04:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 04:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 04:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49334 and previous config saved to /var/cache/conftool/dbconfig/20230609-042246-ladsgroup.json
* 04:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49333 and previous config saved to /var/cache/conftool/dbconfig/20230609-040739-ladsgroup.json
* 03:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49332 and previous config saved to /var/cache/conftool/dbconfig/20230609-035233-ladsgroup.json
* 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49331 and previous config saved to /var/cache/conftool/dbconfig/20230609-033727-ladsgroup.json
* 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49330 and previous config saved to /var/cache/conftool/dbconfig/20230609-032127-ladsgroup.json
* 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 03:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49329 and previous config saved to /var/cache/conftool/dbconfig/20230609-032106-ladsgroup.json
* 03:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49328 and previous config saved to /var/cache/conftool/dbconfig/20230609-030600-ladsgroup.json
* 02:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49327 and previous config saved to /var/cache/conftool/dbconfig/20230609-025054-ladsgroup.json
* 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49326 and previous config saved to /var/cache/conftool/dbconfig/20230609-023548-ladsgroup.json
* 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49325 and previous config saved to /var/cache/conftool/dbconfig/20230609-022054-ladsgroup.json
* 02:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 02:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49324 and previous config saved to /var/cache/conftool/dbconfig/20230609-022034-ladsgroup.json
* 02:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1002.eqiad.wmnet with OS bullseye
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49323 and previous config saved to /var/cache/conftool/dbconfig/20230609-020528-ladsgroup.json
* 02:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
* 02:01 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
* 02:00 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
* 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49322 and previous config saved to /var/cache/conftool/dbconfig/20230609-015021-ladsgroup.json | |||
* 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye | |||
* 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye | |||
* 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49321 and previous config saved to /var/cache/conftool/dbconfig/20230609-013515-ladsgroup.json | |||
* 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS bullseye | |||
* 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" | |||
* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49320 and previous config saved to /var/cache/conftool/dbconfig/20230609-011945-ladsgroup.json | |||
* 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance | |||
* 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance | |||
* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49319 and previous config saved to /var/cache/conftool/dbconfig/20230609-011924-ladsgroup.json | |||
* 01:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" | |||
* 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49318 and previous config saved to /var/cache/conftool/dbconfig/20230609-010418-ladsgroup.json | |||
* 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye | |||
* 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye | |||
* 00:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage | |||
* 00:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye | |||
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49317 and previous config saved to /var/cache/conftool/dbconfig/20230609-004912-ladsgroup.json | |||
* 00:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage | |||
* 00:47 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye | |||
* 00:34 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS bullseye | |||
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49316 and previous config saved to /var/cache/conftool/dbconfig/20230609-003406-ladsgroup.json | |||
* 00:31 eileen: civicrm upgraded from {{Gerrit|6f64e77d}} to {{Gerrit|158896cc}} | |||
* 00:25 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['pki-root1002'] | |||
* 00:25 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002'] | |||
* 00:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['pki-root1002'] | |||
* 00:24 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002'] | |||
* 00:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED | |||
* 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49315 and previous config saved to /var/cache/conftool/dbconfig/20230609-001821-ladsgroup.json | |||
* 00:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance | |||
* 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance | |||
* 00:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance | |||
* 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance | |||
* 00:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49314 and previous config saved to /var/cache/conftool/dbconfig/20230609-001732-ladsgroup.json | |||
* 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49313 and previous config saved to /var/cache/conftool/dbconfig/20230609-000226-ladsgroup.json | |||
== 2023-06-08 ==
* 23:55 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye | |||
* 23:54 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye | |||
* 23:54 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye | |||
* 23:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye | |||
* 23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49312 and previous config saved to /var/cache/conftool/dbconfig/20230608-234720-ladsgroup.json | |||
* 23:42 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED | |||
* 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | |||
* 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for pki-root - pt1979@cumin2002" | |||
* 23:40 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add | |||
== 2023-06-07 ==
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49219 and previous config saved to /var/cache/conftool/dbconfig/20230607-235624-ladsgroup.json | |||
* 23:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance | |||
* 23:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance | |||
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49218 and previous config saved to /var/cache/conftool/dbconfig/20230607-235603-ladsgroup.json | |||
* 23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P49217 and previous config saved to /var/cache/conftool/dbconfig/20230607-234522-ladsgroup.json | |||
* 23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49216 and previous config saved to /var/cache/conftool/dbconfig/20230607-234057-ladsgroup.json | |||
* 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49215 and previous config saved to /var/cache/conftool/dbconfig/20230607-233016-ladsgroup.json | |||
* 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49214 and previous config saved to /var/cache/conftool/dbconfig/20230607-232551-ladsgroup.json | |||
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49213 and previous config saved to /var/cache/conftool/dbconfig/20230607-232223-ladsgroup.json | |||
* 23:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance | |||
* 23:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance | |||
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org | |||
== 2023-06-06 ==
* 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48961 and previous config saved to /var/cache/conftool/dbconfig/20230606-235248-ladsgroup.json | |||
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48960 and previous config saved to /var/cache/conftool/dbconfig/20230606-234810-ladsgroup.json | |||
* 23:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-a1-codfw.mgmt.codfw.wmnet | |||
* 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48959 and previous config saved to /var/cache/conftool/dbconfig/20230606-233742-ladsgroup.json | |||
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48958 and previous config saved to /var/cache/conftool/dbconfig/20230606-233304-ladsgroup.json | |||
* 23:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye | |||
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48955 and previous config saved to /var/cache/conftool/dbconfig/20230606-232235-ladsgroup.json | |||
* 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | |||
* 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox | |||
== 2023-06-05 ==
* 23:53 ladsgroup@cumin1001: dbctl commit (dc=all) | |||
* 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002" | |||
* 23:14 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002" | |||
* 23:12 pt1979@cumin2002: START - Cookbook sre.dns.netbox | |||
* 23:11 jforrester@deploy1002: Finished deploy [integration/docroot@6eefe56]: {{Gerrit|I5c1b92322ae59bfe8a9233ad23c3c89b844f5fb7}} for [[phab:T334492|T334492]] (duration: 00m 05s) | |||
* 23:10 jforrester@deploy1002: Started deploy [integration/docroot@6eefe56]: {{Gerrit|I5c1b92322ae59bfe8a9233ad23c3c89b844f5fb7}} for [[phab:T334492|T334492]] | |||
* 23:09 jforrester@deploy1002: Finished deploy [integration/docroot@ab77611]: {{Gerrit|Idf6c7ad01ed18785b850967252c6867d7871e902}} (duration: 00m 08s) | |||
* 23:09 jforrester@deploy1002: Started deploy [integration/docroot@ab77611]: {{Gerrit|Idf6c7ad01ed18785b850967252c6867d7871e902}} | |||
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48803 and previous config saved to /var/cache/conftool/dbconfig/20230605-230752-ladsgroup.json | |||
* 23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to | |||
== 2023-06-03 ==
* 13:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade | |||
* 13:41 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade | |||
* 13:28 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet | |||
* 13:28 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet | |||
== 2023-06-02 ==
* 20:16 apergos: rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'
* 18:42 bblack: dns*: puppets are all re-enabled, ntp restarts are done, etc
* 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
* 17:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
* 17:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
* 17:27 bblack: dns*: disabling puppet to control rollout of NTP config fixups
* 16:03 bblack: dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed
* 15:35 sukhe: restart ntp.service on dns1002
* 13:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:26 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:25 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:25 ottomata: deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479
* 13:24 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:24 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:24 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:03 moritzm: installing at-spi2-core bugfix updates from Bullseye point release
* 09:35 moritzm: installing texlive-security updates on buster
* 09:18 akosiaris: update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) [[phab:T337836|T337836]]. Reload systemd for unit changes to take effect
* 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet
* 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts:
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 0 hosts:
* 08:42 moritzm: installing traceroute bugfix updates from Bullseye point release
* 07:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org
* 07:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org
* 07:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org
* 07:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
* 07:22 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 07:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
* 01:53 ejegg: fundraising python tools upgraded from {{Gerrit|759d4c89}} to {{Gerrit|2ca83336}}
* 01:22 cstone: civicrm upgraded from {{Gerrit|3819d6d1}} to {{Gerrit|bcc8fccc}}
== 2023-06-01 ==
* 21:06 samtar@deploy1002: Finished scap: Backport for [[gerrit:925858{{!}}Remove deleted config wgVectorStickyHeaderEdit (T337955)]] (duration: 08m 30s) | |||
* 20:59 samtar@deploy1002: esanders and samtar: Backport for [[gerrit:925858{{!}}Remove deleted config wgVectorStickyHeaderEdit (T337955)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet | |||
* 20:57 samtar@deploy1002: Started scap: Backport for [[gerrit:925858{{!}}Remove deleted config wgVectorStickyHeaderEdit (T337955)]] | |||
* 20:54 samtar@deploy1002: Finished scap: Backport for [[gerrit:925792{{!}}Remove config and AB test code for edit buttons in sticky header (T337955)]] (duration: 10m 29s) | |||
* 20:45 samtar@deploy1002: samtar and ksarabia: Backport for [[gerrit:925792{{!}}Remove config and AB test code for | |||
== Archives ==
See [[Server Admin Log/Archives]].
<noinclude>
Latest revision as of 21:50, 9 June 2023
2023-06-09
- 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
- 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
- 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
- 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
- 20:38 btullis@cumin1001: END (ERROR) - Cookbook sre.aqs.roll-restart-reboot (exit_code=97) rolling restart_daemons on A:aqs
- 20:23 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs
- 17:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
- 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1016.eqiad.wmnet with OS buster
- 17:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 17:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49398 and previous config saved to /var/cache/conftool/dbconfig/20230609-173202-ladsgroup.json
- 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49397 and previous config saved to /var/cache/conftool/dbconfig/20230609-171656-ladsgroup.json
- 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49396 and previous config saved to /var/cache/conftool/dbconfig/20230609-170150-ladsgroup.json
- 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster
- 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49395 and previous config saved to /var/cache/conftool/dbconfig/20230609-164644-ladsgroup.json
- 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49394 and previous config saved to /var/cache/conftool/dbconfig/20230609-163007-ladsgroup.json
- 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 16:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49393 and previous config saved to /var/cache/conftool/dbconfig/20230609-162946-ladsgroup.json
- 16:20 urandom: powercycling restbase1028
- 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49392 and previous config saved to /var/cache/conftool/dbconfig/20230609-161440-ladsgroup.json
- 16:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
- 16:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['snapshot1016']
- 16:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['snapshot1016']
- 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49391 and previous config saved to /var/cache/conftool/dbconfig/20230609-155934-ladsgroup.json
- 15:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
- 15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49390 and previous config saved to /var/cache/conftool/dbconfig/20230609-154428-ladsgroup.json
- 15:30 andrewbogott: wikitech-static: deleted everything in /srv/mediawiki/images/wikitech/archive for T338520
- 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49388 and previous config saved to /var/cache/conftool/dbconfig/20230609-152845-ladsgroup.json
- 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49387 and previous config saved to /var/cache/conftool/dbconfig/20230609-152824-ladsgroup.json
- 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
- 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
- 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
- 15:22 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
- 15:17 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49386 and previous config saved to /var/cache/conftool/dbconfig/20230609-151318-ladsgroup.json
- 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49385 and previous config saved to /var/cache/conftool/dbconfig/20230609-145812-ladsgroup.json
- 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49384 and previous config saved to /var/cache/conftool/dbconfig/20230609-144305-ladsgroup.json
- 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49383 and previous config saved to /var/cache/conftool/dbconfig/20230609-142731-ladsgroup.json
- 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 14:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 14:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49382 and previous config saved to /var/cache/conftool/dbconfig/20230609-142655-ladsgroup.json
- 14:14 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
- 14:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49381 and previous config saved to /var/cache/conftool/dbconfig/20230609-141149-ladsgroup.json
- 13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49380 and previous config saved to /var/cache/conftool/dbconfig/20230609-135643-ladsgroup.json
- 13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49379 and previous config saved to /var/cache/conftool/dbconfig/20230609-134137-ladsgroup.json
- 13:29 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
- 13:29 sukhe: start pybal on lvs2013
- 13:28 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
- 13:25 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
- 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49378 and previous config saved to /var/cache/conftool/dbconfig/20230609-132541-ladsgroup.json
- 13:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 13:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49377 and previous config saved to /var/cache/conftool/dbconfig/20230609-132520-ladsgroup.json
- 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49376 and previous config saved to /var/cache/conftool/dbconfig/20230609-131014-ladsgroup.json
- 13:07 sukhe: stop pybal on lvs2013 to test lvs2014
- 13:02 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs2014
- 13:02 sukhe: sudo cumin 'A:lvs and A:codfw' 'enable-puppet "CR 928818"'
- 13:01 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
- 12:59 sukhe: sudo cumin 'A:lvs and A:codfw' 'disable-puppet "CR 928818"'
- 12:57 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
- 12:57 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
- 12:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
- 12:55 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
- 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49373 and previous config saved to /var/cache/conftool/dbconfig/20230609-125508-ladsgroup.json
- 12:50 krinkle@deploy1002: Finished scap: I385d28 (duration: 06m 59s)
- 12:43 krinkle@deploy1002: Started scap: I385d28
- 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49371 and previous config saved to /var/cache/conftool/dbconfig/20230609-124002-ladsgroup.json
- 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
- 12:29 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
- 12:27 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49370 and previous config saved to /var/cache/conftool/dbconfig/20230609-122303-ladsgroup.json
- 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T336886)', diff saved to https://phabricator.wikimedia.org/P49369 and previous config saved to /var/cache/conftool/dbconfig/20230609-122243-ladsgroup.json
- 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
- 12:15 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
- 12:13 aborrero@cumin2002: START - Cookbook sre.dns.netbox
- 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49368 and previous config saved to /var/cache/conftool/dbconfig/20230609-120737-ladsgroup.json
- 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 778 hosts
- 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49367 and previous config saved to /var/cache/conftool/dbconfig/20230609-115230-ladsgroup.json
- 11:52 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 778 hosts
- 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 1262 hosts
- 11:49 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 1262 hosts
- 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T336886)', diff saved to https://phabricator.wikimedia.org/P49366 and previous config saved to /var/cache/conftool/dbconfig/20230609-113724-ladsgroup.json
- 11:27 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T336886)', diff saved to https://phabricator.wikimedia.org/P49365 and previous config saved to /var/cache/conftool/dbconfig/20230609-112250-ladsgroup.json
- 11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T336886)', diff saved to https://phabricator.wikimedia.org/P49364 and previous config saved to /var/cache/conftool/dbconfig/20230609-112229-ladsgroup.json
- 11:20 sukhe: pcc-db1001: sudo systemctl start pcc_facts_processor.service
- 11:14 sukhe: sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080
- 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49363 and previous config saved to /var/cache/conftool/dbconfig/20230609-110723-ladsgroup.json
- 11:02 sukhe: homer "cr*-codfw*" commit "Gerrit: 928113 add new LVS host lvs2014"
- 10:53 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2014.codfw.wmnet with OS bullseye
- 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49362 and previous config saved to /var/cache/conftool/dbconfig/20230609-105217-ladsgroup.json
- 10:40 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
- 10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T336886)', diff saved to https://phabricator.wikimedia.org/P49361 and previous config saved to /var/cache/conftool/dbconfig/20230609-103711-ladsgroup.json
- 10:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
- 10:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T336886)', diff saved to https://phabricator.wikimedia.org/P49360 and previous config saved to /var/cache/conftool/dbconfig/20230609-102217-ladsgroup.json
- 10:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 10:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T336886)', diff saved to https://phabricator.wikimedia.org/P49359 and previous config saved to /var/cache/conftool/dbconfig/20230609-102156-ladsgroup.json
- 10:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
- 10:12 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
- 10:12 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop: sync
- 10:09 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
- 10:08 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
- 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49358 and previous config saved to /var/cache/conftool/dbconfig/20230609-100650-ladsgroup.json
- 09:57 elukey: increase {eqiad,codfw}.change-prop.transcludes.resource-change topic partitions (3->5) on kafka main clusters - T338357
- 09:56 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:54 moritzm: installing jupyter-core security updates on bullseye
- 09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49357 and previous config saved to /var/cache/conftool/dbconfig/20230609-095144-ladsgroup.json
- 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T336886)', diff saved to https://phabricator.wikimedia.org/P49356 and previous config saved to /var/cache/conftool/dbconfig/20230609-093638-ladsgroup.json
- 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T336886)', diff saved to https://phabricator.wikimedia.org/P49355 and previous config saved to /var/cache/conftool/dbconfig/20230609-092141-ladsgroup.json
- 09:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 09:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 09:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 09:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T336886)', diff saved to https://phabricator.wikimedia.org/P49354 and previous config saved to /var/cache/conftool/dbconfig/20230609-090829-ladsgroup.json
- 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49353 and previous config saved to /var/cache/conftool/dbconfig/20230609-085322-ladsgroup.json
- 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49352 and previous config saved to /var/cache/conftool/dbconfig/20230609-083816-ladsgroup.json
- 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T336886)', diff saved to https://phabricator.wikimedia.org/P49351 and previous config saved to /var/cache/conftool/dbconfig/20230609-082310-ladsgroup.json
- 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T336886)', diff saved to https://phabricator.wikimedia.org/P49350 and previous config saved to /var/cache/conftool/dbconfig/20230609-080708-ladsgroup.json
- 08:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
- 08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T336886)', diff saved to https://phabricator.wikimedia.org/P49349 and previous config saved to /var/cache/conftool/dbconfig/20230609-080637-ladsgroup.json
- 07:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49348 and previous config saved to /var/cache/conftool/dbconfig/20230609-075130-ladsgroup.json
- 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49347 and previous config saved to /var/cache/conftool/dbconfig/20230609-073624-ladsgroup.json
- 07:33 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1492.eqiad.wmnet
- 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T336886)', diff saved to https://phabricator.wikimedia.org/P49346 and previous config saved to /var/cache/conftool/dbconfig/20230609-072118-ladsgroup.json
- 07:19 moritzm: powercycling restbase2018 (kernel hung following what looks like I/O errors)
- 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T336886)', diff saved to https://phabricator.wikimedia.org/P49345 and previous config saved to /var/cache/conftool/dbconfig/20230609-070520-ladsgroup.json
- 07:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 07:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
- 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T336886)', diff saved to https://phabricator.wikimedia.org/P49344 and previous config saved to /var/cache/conftool/dbconfig/20230609-070459-ladsgroup.json
- 06:50 moritzm: installing wireshark security updates
- 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49343 and previous config saved to /var/cache/conftool/dbconfig/20230609-064953-ladsgroup.json
- 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster2005.codfw.wmnet
- 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster2005.codfw.wmnet
- 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster1005.eqiad.wmnet
- 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster1005.eqiad.wmnet
- 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: prometheus3001.esams.wmnet
- 06:48 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: prometheus3001.esams.wmnet
- 06:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
- 06:44 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
- 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49342 and previous config saved to /var/cache/conftool/dbconfig/20230609-063447-ladsgroup.json
- 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T336886)', diff saved to https://phabricator.wikimedia.org/P49341 and previous config saved to /var/cache/conftool/dbconfig/20230609-061941-ladsgroup.json
- 06:06 eileen: config 97c57848 -> 6f4a9d19 restart jobs
- 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T336886)', diff saved to https://phabricator.wikimedia.org/P49340 and previous config saved to /var/cache/conftool/dbconfig/20230609-060438-ladsgroup.json
- 06:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 06:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
- 05:53 eileen: civicrm upgraded from 158896cc to 5bbed553
- 05:52 eileen: config revision changed from 8b71fa7a to 97c57848
- 05:50 moritzm: installing cpio security updates
- 05:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
- 05:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
- 05:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 05:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 05:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 05:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T336886)', diff saved to https://phabricator.wikimedia.org/P49339 and previous config saved to /var/cache/conftool/dbconfig/20230609-052315-ladsgroup.json
- 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49338 and previous config saved to /var/cache/conftool/dbconfig/20230609-050809-ladsgroup.json
- 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49337 and previous config saved to /var/cache/conftool/dbconfig/20230609-045302-ladsgroup.json
- 04:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T336886)', diff saved to https://phabricator.wikimedia.org/P49336 and previous config saved to /var/cache/conftool/dbconfig/20230609-043756-ladsgroup.json
- 04:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1219 (T336886)', diff saved to https://phabricator.wikimedia.org/P49335 and previous config saved to /var/cache/conftool/dbconfig/20230609-042306-ladsgroup.json
- 04:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 04:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 04:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T336886)', diff saved to https://phabricator.wikimedia.org/P49334 and previous config saved to /var/cache/conftool/dbconfig/20230609-042246-ladsgroup.json
- 04:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49333 and previous config saved to /var/cache/conftool/dbconfig/20230609-040739-ladsgroup.json
- 03:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49332 and previous config saved to /var/cache/conftool/dbconfig/20230609-035233-ladsgroup.json
- 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T336886)', diff saved to https://phabricator.wikimedia.org/P49331 and previous config saved to /var/cache/conftool/dbconfig/20230609-033727-ladsgroup.json
- 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1218 (T336886)', diff saved to https://phabricator.wikimedia.org/P49330 and previous config saved to /var/cache/conftool/dbconfig/20230609-032127-ladsgroup.json
- 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 03:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T336886)', diff saved to https://phabricator.wikimedia.org/P49329 and previous config saved to /var/cache/conftool/dbconfig/20230609-032106-ladsgroup.json
- 03:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49328 and previous config saved to /var/cache/conftool/dbconfig/20230609-030600-ladsgroup.json
- 02:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49327 and previous config saved to /var/cache/conftool/dbconfig/20230609-025054-ladsgroup.json
- 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T336886)', diff saved to https://phabricator.wikimedia.org/P49326 and previous config saved to /var/cache/conftool/dbconfig/20230609-023548-ladsgroup.json
- 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1207 (T336886)', diff saved to https://phabricator.wikimedia.org/P49325 and previous config saved to /var/cache/conftool/dbconfig/20230609-022054-ladsgroup.json
- 02:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 02:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T336886)', diff saved to https://phabricator.wikimedia.org/P49324 and previous config saved to /var/cache/conftool/dbconfig/20230609-022034-ladsgroup.json
- 02:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49323 and previous config saved to /var/cache/conftool/dbconfig/20230609-020528-ladsgroup.json
- 02:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
- 02:01 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
- 02:00 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49322 and previous config saved to /var/cache/conftool/dbconfig/20230609-015021-ladsgroup.json
- 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
- 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
- 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T336886)', diff saved to https://phabricator.wikimedia.org/P49321 and previous config saved to /var/cache/conftool/dbconfig/20230609-013515-ladsgroup.json
- 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS bullseye
- 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1206 (T336886)', diff saved to https://phabricator.wikimedia.org/P49320 and previous config saved to /var/cache/conftool/dbconfig/20230609-011945-ladsgroup.json
- 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T336886)', diff saved to https://phabricator.wikimedia.org/P49319 and previous config saved to /var/cache/conftool/dbconfig/20230609-011924-ladsgroup.json
- 01:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49318 and previous config saved to /var/cache/conftool/dbconfig/20230609-010418-ladsgroup.json
- 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
- 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
- 00:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
- 00:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
- 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49317 and previous config saved to /var/cache/conftool/dbconfig/20230609-004912-ladsgroup.json
- 00:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
- 00:47 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
- 00:34 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS bullseye
- 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T336886)', diff saved to https://phabricator.wikimedia.org/P49316 and previous config saved to /var/cache/conftool/dbconfig/20230609-003406-ladsgroup.json
- 00:31 eileen: civicrm upgraded from 6f64e77d to 158896cc
- 00:25 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['pki-root1002']
- 00:25 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002']
- 00:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['pki-root1002']
- 00:24 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002']
- 00:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED
- 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 (T336886)', diff saved to https://phabricator.wikimedia.org/P49315 and previous config saved to /var/cache/conftool/dbconfig/20230609-001821-ladsgroup.json
- 00:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 00:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 00:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T336886)', diff saved to https://phabricator.wikimedia.org/P49314 and previous config saved to /var/cache/conftool/dbconfig/20230609-001732-ladsgroup.json
- 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49313 and previous config saved to /var/cache/conftool/dbconfig/20230609-000226-ladsgroup.json
2023-06-08
- 23:55 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
- 23:54 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
- 23:54 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
- 23:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
- 23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49312 and previous config saved to /var/cache/conftool/dbconfig/20230608-234720-ladsgroup.json
- 23:42 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED
- 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for pki-root - pt1979@cumin2002"
- 23:40 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for pki-root - pt1979@cumin2002"
- 23:38 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T336886)', diff saved to https://phabricator.wikimedia.org/P49311 and previous config saved to /var/cache/conftool/dbconfig/20230608-233214-ladsgroup.json
- 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T336886)', diff saved to https://phabricator.wikimedia.org/P49310 and previous config saved to /var/cache/conftool/dbconfig/20230608-231650-ladsgroup.json
- 23:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 23:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T336886)', diff saved to https://phabricator.wikimedia.org/P49309 and previous config saved to /var/cache/conftool/dbconfig/20230608-231629-ladsgroup.json
- 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P49308 and previous config saved to /var/cache/conftool/dbconfig/20230608-230123-ladsgroup.json
- 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P49307 and previous config saved to /var/cache/conftool/dbconfig/20230608-224617-ladsgroup.json
- 22:39 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on gerrit1001.wikimedia.org with reason: decom
- 22:39 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on gerrit1001.wikimedia.org with reason: decom
- 22:37 mutante: gerrit1001 - rmdir /etc/ssh/userkeys/gerrit.d which leads to puppet warnings because it can't remove empty dir
- 22:35 mutante: removing gerrit role from former gerrit prod machine gerrit1001, removes firewall rules, shell access, monitoring, etc.
- 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T336886)', diff saved to https://phabricator.wikimedia.org/P49306 and previous config saved to /var/cache/conftool/dbconfig/20230608-223111-ladsgroup.json
- 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T336886)', diff saved to https://phabricator.wikimedia.org/P49305 and previous config saved to /var/cache/conftool/dbconfig/20230608-221536-ladsgroup.json
- 22:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 22:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T336886)', diff saved to https://phabricator.wikimedia.org/P49304 and previous config saved to /var/cache/conftool/dbconfig/20230608-221515-ladsgroup.json
- 22:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P49303 and previous config saved to /var/cache/conftool/dbconfig/20230608-220009-ladsgroup.json
- 21:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P49302 and previous config saved to /var/cache/conftool/dbconfig/20230608-214503-ladsgroup.json
- 21:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T336886)', diff saved to https://phabricator.wikimedia.org/P49301 and previous config saved to /var/cache/conftool/dbconfig/20230608-212957-ladsgroup.json
- 21:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T336886)', diff saved to https://phabricator.wikimedia.org/P49300 and previous config saved to /var/cache/conftool/dbconfig/20230608-211419-ladsgroup.json
- 21:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 21:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 21:08 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup1011.eqiad.wmnet']
- 21:07 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1011.eqiad.wmnet']
- 21:07 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup1011.eqiad.wmnet']
- 21:07 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup1010.eqiad.wmnet']
- 21:07 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1010.eqiad.wmnet']
- 21:06 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup1010.eqiad.wmnet']
- 21:06 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1011.eqiad.wmnet']
- 21:05 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1010.eqiad.wmnet']
- 21:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 21:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 20:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 20:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T336886)', diff saved to https://phabricator.wikimedia.org/P49298 and previous config saved to /var/cache/conftool/dbconfig/20230608-204722-ladsgroup.json
- 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P49297 and previous config saved to /var/cache/conftool/dbconfig/20230608-203216-ladsgroup.json
- 20:31 ladsgroup@deploy1002: Finished scap: Backport for Externallinks: Make port part of the index (T337149) (duration: 10m 10s)
- 20:22 ladsgroup@deploy1002: ladsgroup: Backport for Externallinks: Make port part of the index (T337149) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 20:21 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1028.eqiad.wmnet with OS bullseye
- 20:20 ladsgroup@deploy1002: Started scap: Backport for Externallinks: Make port part of the index (T337149)
- 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P49296 and previous config saved to /var/cache/conftool/dbconfig/20230608-201710-ladsgroup.json
- 20:12 ladsgroup@deploy1002: Finished scap: Backport for Remove VectorLimitedWidthIndicator (T336197) (duration: 07m 32s)
- 20:06 ladsgroup@deploy1002: ladsgroup and ksarabia: Backport for Remove VectorLimitedWidthIndicator (T336197) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 20:05 ladsgroup@deploy1002: Started scap: Backport for Remove VectorLimitedWidthIndicator (T336197)
- 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T336886)', diff saved to https://phabricator.wikimedia.org/P49295 and previous config saved to /var/cache/conftool/dbconfig/20230608-200204-ladsgroup.json
- 20:01 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
- 19:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1028.eqiad.wmnet with reason: host reimage
- 19:54 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1028.eqiad.wmnet with reason: host reimage
- 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T336886)', diff saved to https://phabricator.wikimedia.org/P49294 and previous config saved to /var/cache/conftool/dbconfig/20230608-194555-ladsgroup.json
- 19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 19:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1134.eqiad.wmnet with reason: Maintenance
- 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T336886)', diff saved to https://phabricator.wikimedia.org/P49293 and previous config saved to /var/cache/conftool/dbconfig/20230608-194534-ladsgroup.json
- 19:40 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS bullseye
- 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P49292 and previous config saved to /var/cache/conftool/dbconfig/20230608-193028-ladsgroup.json
- 19:22 jclark@cumin1001: START - Cookbook sre.hosts.provision for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
- 19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P49291 and previous config saved to /var/cache/conftool/dbconfig/20230608-191522-ladsgroup.json
- 19:08 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup1011.mgmt.eqiad.wmnet with reboot policy FORCED
- 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T336886)', diff saved to https://phabricator.wikimedia.org/P49290 and previous config saved to /var/cache/conftool/dbconfig/20230608-190016-ladsgroup.json
- 18:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
- 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 (T336886)', diff saved to https://phabricator.wikimedia.org/P49289 and previous config saved to /var/cache/conftool/dbconfig/20230608-184312-ladsgroup.json
- 18:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1132.eqiad.wmnet with reason: Maintenance
- 18:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1132.eqiad.wmnet with reason: Maintenance
- 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49288 and previous config saved to /var/cache/conftool/dbconfig/20230608-184251-ladsgroup.json
- 18:36 jclark@cumin1001: START - Cookbook sre.hosts.provision for host backup1011.mgmt.eqiad.wmnet with reboot policy FORCED
- 18:36 jclark@cumin1001: START - Cookbook sre.hosts.provision for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
- 18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P49287 and previous config saved to /var/cache/conftool/dbconfig/20230608-182745-ladsgroup.json
- 18:24 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool sessionstore in eqiad: maintenance
- 18:19 eevans@cumin1001: START - Cookbook sre.discovery.service-route pool sessionstore in eqiad: maintenance
- 18:18 urandom: (Re)pooling sessionstore/eqiad — T337426
- 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P49286 and previous config saved to /var/cache/conftool/dbconfig/20230608-181238-ladsgroup.json
- 18:09 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.12 refs T337526
- 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49285 and previous config saved to /var/cache/conftool/dbconfig/20230608-175732-ladsgroup.json
- 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49284 and previous config saved to /var/cache/conftool/dbconfig/20230608-174135-ladsgroup.json
- 17:41 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1128.eqiad.wmnet with reason: Maintenance
- 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1128.eqiad.wmnet with reason: Maintenance
- 17:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:36 stevemunene@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
- 17:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 17:31 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:31 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:30 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:30 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:28 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 17:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1119.eqiad.wmnet with reason: Maintenance
- 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T336886)', diff saved to https://phabricator.wikimedia.org/P49283 and previous config saved to /var/cache/conftool/dbconfig/20230608-172746-ladsgroup.json
- 17:24 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:14 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P49282 and previous config saved to /var/cache/conftool/dbconfig/20230608-171240-ladsgroup.json
- 17:10 stevemunene@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
- 17:05 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetmaster1006.eqiad.wmnet with OS bullseye
- 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 16:58 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 16:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P49281 and previous config saved to /var/cache/conftool/dbconfig/20230608-165734-ladsgroup.json
- 16:56 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:aqs
- 16:46 urandom: Starting traffic test against sessionstore.svc.eqiad.wmnet — T337426
- 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetmaster1006.eqiad.wmnet with reason: host reimage
- 16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T336886)', diff saved to https://phabricator.wikimedia.org/P49280 and previous config saved to /var/cache/conftool/dbconfig/20230608-164228-ladsgroup.json
- 16:41 urandom: Upgrading Cassandra to 4.1.1, sessionstore1003 — T337426
- 16:39 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetmaster1006.eqiad.wmnet with reason: host reimage
- 16:38 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetmaster1006.eqiad.wmnet with OS bullseye
- 16:36 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host puppetmaster1006.eqiad.wmnet with OS bullseye
- 16:35 urandom: Upgrading Cassandra to 4.1.1, sessionstore1002 — T337426
- 16:34 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs
- 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 (T336886)', diff saved to https://phabricator.wikimedia.org/P49279 and previous config saved to /var/cache/conftool/dbconfig/20230608-162650-ladsgroup.json
- 16:26 urandom: Upgrading Cassandra to 4.1.1, sessionstore1001 — T337426
- 16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1118.eqiad.wmnet with reason: Maintenance
- 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1118.eqiad.wmnet with reason: Maintenance
- 16:22 urandom: creating pre-upgrade Cassandra snapshots, sessionstore/eqiad — T337426
- 16:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance
- 16:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance
- 16:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 16:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
- 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
- 16:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 16:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance
- 16:11 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool sessionstore in eqiad: maintenance
- 16:06 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs2014.codfw.wmnet with OS bullseye
- 16:06 eevans@cumin1001: START - Cookbook sre.discovery.service-route depool sessionstore in eqiad: maintenance
- 16:06 urandom: depooling eqiad sessionstore for Cassandra upgrade — T337426
- 16:00 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 15:58 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
- 15:58 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs2014.codfw.wmnet with OS bullseye
- 15:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 15:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 15:23 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
- 15:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetmaster1006.eqiad.wmnet with OS bullseye
- 15:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['puppetmaster1006']
- 15:13 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['puppetmaster1006']
- 15:09 moritzm: installing c-ares security updates on bullseye
- 14:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
- 14:42 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
- 14:41 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
- 14:41 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
- 14:41 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
- 14:36 moritzm: installing libwebp security updates on buster
- 14:29 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003']
- 14:28 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003']
- 14:28 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 14:28 jclark@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
- 14:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host puppetmaster1006.mgmt.eqiad.wmnet with reboot policy FORCED
- 14:19 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:19 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
- 14:17 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
- 14:15 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 14:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2014.codfw.wmnet with OS bullseye
- 14:14 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 14:13 XioNoX: cloudsw2-c8-eqiad> request system zeroize - T338459
- 14:13 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 14:11 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 14:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 14:10 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 14:10 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 14:09 XioNoX: decom cloudsw2-c8-eqiad - T338459
- 14:08 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 14:07 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
- 14:07 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 14:07 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 14:06 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 14:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:04 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
- 14:02 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
- 14:01 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 14:00 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 13:59 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 13:58 ladsgroup@deploy1002: Finished scap: Backport for Remove svwiktionary, svwiki and dawiki from legacy encoding (T128156 T128152 T128153) (duration: 09m 13s)
- 13:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
- 13:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
- 13:52 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
- 13:52 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 13:51 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 13:51 ladsgroup@deploy1002: ladsgroup: Backport for Remove svwiktionary, svwiki and dawiki from legacy encoding (T128156 T128152 T128153) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 13:51 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 13:50 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 13:49 ladsgroup@deploy1002: Started scap: Backport for Remove svwiktionary, svwiki and dawiki from legacy encoding (T128156 T128152 T128153)
- 13:49 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host puppetmaster1006.mgmt.eqiad.wmnet with reboot policy FORCED
- 13:48 jclark@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
- 13:44 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:44 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
- 13:43 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 13:43 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 13:43 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
- 13:41 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 13:40 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 13:39 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 13:36 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 13:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
- 13:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:29 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 13:06 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 13:06 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 13:05 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 13:05 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 12:57 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 12:57 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 12:36 cmooney@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937 (duration: 17m 22s)
- 12:19 topranks: De-pooling lvs1017 to move link to lsw1-e1-eqiad to ssw1-e1-eqiad T322937
- 12:18 cmooney@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937
- 12:12 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:11 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 12:03 vgutierrez: restore cp4052 HAProxy configuration - T317799
- 11:51 vgutierrez: repooling cp4052 - T317799
- 11:40 vgutierrez: depooling cp4052 for some HAProxy tests - T317799
- 11:28 Amir1: mwscript maintenance/storage/moveToExternal.php --wiki=nlwiki --iconv DB cluster26 (T128154)
- 11:03 Amir1: mwscript maintenance/storage/moveToExternal.php --wiki=dawiki --iconv DB cluster27 (T128153)
- 10:49 Amir1: mwscript maintenance/storage/moveToExternal.php --wiki=svwiki --iconv DB cluster27 (T128153)
- 10:22 jiji@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:21 jiji@cumin1001: START - Cookbook sre.dns.netbox
- 09:58 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@bb7526e]: (no justification provided) (duration: 00m 08s)
- 09:57 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@bb7526e]: (no justification provided)
- 09:40 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetserver2001.codfw.wmnet with OS bookworm
- 09:40 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin2002"
- 09:24 vgutierrez: updated to HAProxy 2.7.9 on cp4052 and cp5032
- 09:22 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp5032.eqsin.wmnet,cp4052.ulsfo.wmnet} and A:cp
- 09:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
- 09:18 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
- 09:17 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp5032.eqsin.wmnet,cp4052.ulsfo.wmnet} and A:cp
- 09:10 vgutierrez: fetch HAProxy 2.7.9 for thirdparty/haproxy27 bullseye (apt.wm.o)
- 08:54 apergos: UTC morning backport and config training window done
- 08:38 ariel@deploy1002: Finished scap: Backport for [ruwiki] Add an editautoreviewprotected level protecion (T337430) (duration: 08m 25s)
- 08:31 ariel@deploy1002: ariel and superpes: Backport for [ruwiki] Add an editautoreviewprotected level protecion (T337430) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 08:30 ariel@deploy1002: Started scap: Backport for [ruwiki] Add an editautoreviewprotected level protecion (T337430)
- 08:25 ariel@deploy1002: Finished scap: Backport for [fiwiki] Limitate the use of the ContentTranslation tool (T337412) (duration: 09m 16s)
- 08:17 ariel@deploy1002: superpes and ariel: Backport for [fiwiki] Limitate the use of the ContentTranslation tool (T337412) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 08:16 ariel@deploy1002: Started scap: Backport for [fiwiki] Limitate the use of the ContentTranslation tool (T337412)
- 08:12 ariel@deploy1002: Finished scap: Backport for [itwiktionary] Add a tagline (T337688) (duration: 08m 07s)
- 08:06 ariel@deploy1002: ariel and superpes: Backport for [itwiktionary] Add a tagline (T337688) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 08:04 ariel@deploy1002: Started scap: Backport for [itwiktionary] Add a tagline (T337688)
- 07:49 ariel@deploy1002: Finished scap: Backport for [kaawiki] Change the logo with an HD version and the tagline (T337641) (duration: 09m 09s)
- 07:41 ariel@deploy1002: ariel and superpes: Backport for [kaawiki] Change the logo with an HD version and the tagline (T337641) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 07:40 ariel@deploy1002: Started scap: Backport for [kaawiki] Change the logo with an HD version and the tagline (T337641)
- 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T336886)', diff saved to https://phabricator.wikimedia.org/P49271 and previous config saved to /var/cache/conftool/dbconfig/20230608-073524-ladsgroup.json
- 07:27 kartik@deploy1002: Finished scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337834) (duration: 09m 19s)
- 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P49270 and previous config saved to /var/cache/conftool/dbconfig/20230608-072018-ladsgroup.json
- 07:19 kartik@deploy1002: kartik: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337834) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 07:17 kartik@deploy1002: Started scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337834)
- 07:14 elukey: delete pod kask-production-7dfdfc7cbc-2vw5q in wikikube codfw, since it was scheduled on a non dedicated node
- 07:14 kartik@deploy1002: Finished scap: Backport for Enable Content and Section Translation for 9 Wikipedia (T337290) (duration: 09m 52s)
- 07:06 kartik@deploy1002: kartik: Backport for Enable Content and Section Translation for 9 Wikipedia (T337290) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P49268 and previous config saved to /var/cache/conftool/dbconfig/20230608-070512-ladsgroup.json
- 07:04 kartik@deploy1002: Started scap: Backport for Enable Content and Section Translation for 9 Wikipedia (T337290)
- 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T336886)', diff saved to https://phabricator.wikimedia.org/P49267 and previous config saved to /var/cache/conftool/dbconfig/20230608-065006-ladsgroup.json
- 06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T336886)', diff saved to https://phabricator.wikimedia.org/P49266 and previous config saved to /var/cache/conftool/dbconfig/20230608-064508-ladsgroup.json
- 06:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 06:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
- 06:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T336886)', diff saved to https://phabricator.wikimedia.org/P49265 and previous config saved to /var/cache/conftool/dbconfig/20230608-064447-ladsgroup.json
- 06:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P49264 and previous config saved to /var/cache/conftool/dbconfig/20230608-062941-ladsgroup.json
- 06:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P49263 and previous config saved to /var/cache/conftool/dbconfig/20230608-061435-ladsgroup.json
- 06:10 elukey: kill remaining processes for `andyrussg` on stat100x nodes to unblock puppet
- 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T336886)', diff saved to https://phabricator.wikimedia.org/P49262 and previous config saved to /var/cache/conftool/dbconfig/20230608-055929-ladsgroup.json
- 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T336886)', diff saved to https://phabricator.wikimedia.org/P49261 and previous config saved to /var/cache/conftool/dbconfig/20230608-055432-ladsgroup.json
- 05:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 05:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T336886)', diff saved to https://phabricator.wikimedia.org/P49260 and previous config saved to /var/cache/conftool/dbconfig/20230608-055411-ladsgroup.json
- 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P49259 and previous config saved to /var/cache/conftool/dbconfig/20230608-053904-ladsgroup.json
- 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P49258 and previous config saved to /var/cache/conftool/dbconfig/20230608-052358-ladsgroup.json
- 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T336886)', diff saved to https://phabricator.wikimedia.org/P49257 and previous config saved to /var/cache/conftool/dbconfig/20230608-050852-ladsgroup.json
- 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T336886)', diff saved to https://phabricator.wikimedia.org/P49256 and previous config saved to /var/cache/conftool/dbconfig/20230608-050353-ladsgroup.json
- 05:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 05:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 05:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 05:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49255 and previous config saved to /var/cache/conftool/dbconfig/20230608-050328-ladsgroup.json
- 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P49254 and previous config saved to /var/cache/conftool/dbconfig/20230608-044821-ladsgroup.json
- 04:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P49253 and previous config saved to /var/cache/conftool/dbconfig/20230608-043315-ladsgroup.json
- 04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49252 and previous config saved to /var/cache/conftool/dbconfig/20230608-041809-ladsgroup.json
- 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49251 and previous config saved to /var/cache/conftool/dbconfig/20230608-041311-ladsgroup.json
- 04:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 04:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 04:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 04:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49250 and previous config saved to /var/cache/conftool/dbconfig/20230608-040935-ladsgroup.json
- 03:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P49249 and previous config saved to /var/cache/conftool/dbconfig/20230608-035428-ladsgroup.json
- 03:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P49248 and previous config saved to /var/cache/conftool/dbconfig/20230608-033922-ladsgroup.json
- 03:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49247 and previous config saved to /var/cache/conftool/dbconfig/20230608-032416-ladsgroup.json
- 03:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49246 and previous config saved to /var/cache/conftool/dbconfig/20230608-031911-ladsgroup.json
- 03:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 03:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 03:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49245 and previous config saved to /var/cache/conftool/dbconfig/20230608-031901-ladsgroup.json
- 03:11 eileen: civicrm upgraded from 066095b8 to 6f64e77d
- 03:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P49244 and previous config saved to /var/cache/conftool/dbconfig/20230608-030355-ladsgroup.json
- 02:54 samtar@deploy1002: Finished scap: Backport for Remove additional v1 suffix when computing internalRestbaseURL (T334842 T338381) (duration: 09m 50s)
- 02:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P49243 and previous config saved to /var/cache/conftool/dbconfig/20230608-024849-ladsgroup.json
- 02:46 samtar@deploy1002: samtar: Backport for Remove additional v1 suffix when computing internalRestbaseURL (T334842 T338381) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 02:44 samtar@deploy1002: Started scap: Backport for Remove additional v1 suffix when computing internalRestbaseURL (T334842 T338381)
- 02:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49242 and previous config saved to /var/cache/conftool/dbconfig/20230608-023343-ladsgroup.json
- 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49241 and previous config saved to /var/cache/conftool/dbconfig/20230608-022842-ladsgroup.json
- 02:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 02:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T336886)', diff saved to https://phabricator.wikimedia.org/P49240 and previous config saved to /var/cache/conftool/dbconfig/20230608-022821-ladsgroup.json
- 02:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P49239 and previous config saved to /var/cache/conftool/dbconfig/20230608-021315-ladsgroup.json
- 01:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P49238 and previous config saved to /var/cache/conftool/dbconfig/20230608-015809-ladsgroup.json
- 01:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T336886)', diff saved to https://phabricator.wikimedia.org/P49237 and previous config saved to /var/cache/conftool/dbconfig/20230608-014303-ladsgroup.json
- 01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T336886)', diff saved to https://phabricator.wikimedia.org/P49236 and previous config saved to /var/cache/conftool/dbconfig/20230608-013808-ladsgroup.json
- 01:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 01:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
- 01:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T336886)', diff saved to https://phabricator.wikimedia.org/P49235 and previous config saved to /var/cache/conftool/dbconfig/20230608-013736-ladsgroup.json
- 01:23 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 01:23 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 01:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P49234 and previous config saved to /var/cache/conftool/dbconfig/20230608-012230-ladsgroup.json
- 01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49233 and previous config saved to /var/cache/conftool/dbconfig/20230608-010853-ladsgroup.json
- 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P49232 and previous config saved to /var/cache/conftool/dbconfig/20230608-010724-ladsgroup.json
- 00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P49231 and previous config saved to /var/cache/conftool/dbconfig/20230608-005347-ladsgroup.json
- 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T336886)', diff saved to https://phabricator.wikimedia.org/P49230 and previous config saved to /var/cache/conftool/dbconfig/20230608-005218-ladsgroup.json
- 00:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T336886)', diff saved to https://phabricator.wikimedia.org/P49229 and previous config saved to /var/cache/conftool/dbconfig/20230608-004713-ladsgroup.json
- 00:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 00:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
- 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T336886)', diff saved to https://phabricator.wikimedia.org/P49228 and previous config saved to /var/cache/conftool/dbconfig/20230608-004653-ladsgroup.json
- 00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P49227 and previous config saved to /var/cache/conftool/dbconfig/20230608-003841-ladsgroup.json
- 00:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P49226 and previous config saved to /var/cache/conftool/dbconfig/20230608-003146-ladsgroup.json
- 00:28 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
- 00:28 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49225 and previous config saved to /var/cache/conftool/dbconfig/20230608-002335-ladsgroup.json
- 00:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P49224 and previous config saved to /var/cache/conftool/dbconfig/20230608-001640-ladsgroup.json
- 00:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49223 and previous config saved to /var/cache/conftool/dbconfig/20230608-001555-ladsgroup.json
- 00:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 00:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 00:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49222 and previous config saved to /var/cache/conftool/dbconfig/20230608-001534-ladsgroup.json
- 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T336886)', diff saved to https://phabricator.wikimedia.org/P49221 and previous config saved to /var/cache/conftool/dbconfig/20230608-000134-ladsgroup.json
- 00:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P49220 and previous config saved to /var/cache/conftool/dbconfig/20230608-000028-ladsgroup.json
2023-06-07
- 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T336886)', diff saved to https://phabricator.wikimedia.org/P49219 and previous config saved to /var/cache/conftool/dbconfig/20230607-235624-ladsgroup.json
- 23:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 23:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
- 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T336886)', diff saved to https://phabricator.wikimedia.org/P49218 and previous config saved to /var/cache/conftool/dbconfig/20230607-235603-ladsgroup.json
- 23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P49217 and previous config saved to /var/cache/conftool/dbconfig/20230607-234522-ladsgroup.json
- 23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49216 and previous config saved to /var/cache/conftool/dbconfig/20230607-234057-ladsgroup.json
- 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49215 and previous config saved to /var/cache/conftool/dbconfig/20230607-233016-ladsgroup.json
- 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49214 and previous config saved to /var/cache/conftool/dbconfig/20230607-232551-ladsgroup.json
- 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49213 and previous config saved to /var/cache/conftool/dbconfig/20230607-232223-ladsgroup.json
- 23:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 23:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49212 and previous config saved to /var/cache/conftool/dbconfig/20230607-232203-ladsgroup.json
- 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T336886)', diff saved to https://phabricator.wikimedia.org/P49211 and previous config saved to /var/cache/conftool/dbconfig/20230607-231045-ladsgroup.json
- 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P49210 and previous config saved to /var/cache/conftool/dbconfig/20230607-230657-ladsgroup.json
- 23:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T336886)', diff saved to https://phabricator.wikimedia.org/P49209 and previous config saved to /var/cache/conftool/dbconfig/20230607-230540-ladsgroup.json
- 23:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
- 23:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
- 23:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
- 23:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
- 22:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 22:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T336886)', diff saved to https://phabricator.wikimedia.org/P49208 and previous config saved to /var/cache/conftool/dbconfig/20230607-225926-ladsgroup.json
- 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P49207 and previous config saved to /var/cache/conftool/dbconfig/20230607-225150-ladsgroup.json
- 22:45 zabe@deploy1002: Finished scap: T338287 (duration: 07m 30s)
- 22:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P49206 and previous config saved to /var/cache/conftool/dbconfig/20230607-224420-ladsgroup.json
- 22:38 zabe@deploy1002: Started scap: T338287
- 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49205 and previous config saved to /var/cache/conftool/dbconfig/20230607-223644-ladsgroup.json
- 22:34 zabe@deploy1002: Sync cancelled.
- 22:34 zabe@deploy1002: zabe: Backport for Use cuc_timestamp as index field when reading old (T338287) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 22:32 zabe@deploy1002: Started scap: Backport for Use cuc_timestamp as index field when reading old (T338287)
- 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P49204 and previous config saved to /var/cache/conftool/dbconfig/20230607-222914-ladsgroup.json
- 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49203 and previous config saved to /var/cache/conftool/dbconfig/20230607-222905-ladsgroup.json
- 22:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 22:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49202 and previous config saved to /var/cache/conftool/dbconfig/20230607-222844-ladsgroup.json
- 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T336886)', diff saved to https://phabricator.wikimedia.org/P49201 and previous config saved to /var/cache/conftool/dbconfig/20230607-221408-ladsgroup.json
- 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P49200 and previous config saved to /var/cache/conftool/dbconfig/20230607-221338-ladsgroup.json
- 22:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1221 (T336886)', diff saved to https://phabricator.wikimedia.org/P49199 and previous config saved to /var/cache/conftool/dbconfig/20230607-220859-ladsgroup.json
- 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T336886)', diff saved to https://phabricator.wikimedia.org/P49198 and previous config saved to /var/cache/conftool/dbconfig/20230607-220821-ladsgroup.json
- 22:05 eileen: civicrm upgraded from bcc8fccc to 066095b8
- 22:05 zabe@deploy1002: Finished scap: Backport for Use cuc_timestamp as index field when reading old (T338287) (duration: 11m 48s)
- 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P49197 and previous config saved to /var/cache/conftool/dbconfig/20230607-215831-ladsgroup.json
- 21:55 zabe@deploy1002: dreamyjazz and zabe: Backport for Use cuc_timestamp as index field when reading old (T338287) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 21:53 zabe@deploy1002: Started scap: Backport for Use cuc_timestamp as index field when reading old (T338287)
- 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P49196 and previous config saved to /var/cache/conftool/dbconfig/20230607-215315-ladsgroup.json
- 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49195 and previous config saved to /var/cache/conftool/dbconfig/20230607-214325-ladsgroup.json
- 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P49194 and previous config saved to /var/cache/conftool/dbconfig/20230607-213809-ladsgroup.json
- 21:36 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
- 21:36 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet
- 21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49193 and previous config saved to /var/cache/conftool/dbconfig/20230607-213530-ladsgroup.json
- 21:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 21:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
- 21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T336886)', diff saved to https://phabricator.wikimedia.org/P49192 and previous config saved to /var/cache/conftool/dbconfig/20230607-213509-ladsgroup.json
- 21:33 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 21:32 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 21:32 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs1016.eqiad.wmnet
- 21:32 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs1016.eqiad.wmnet
- 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T336886)', diff saved to https://phabricator.wikimedia.org/P49191 and previous config saved to /var/cache/conftool/dbconfig/20230607-212303-ladsgroup.json
- 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P49190 and previous config saved to /var/cache/conftool/dbconfig/20230607-212003-ladsgroup.json
- 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T336886)', diff saved to https://phabricator.wikimedia.org/P49189 and previous config saved to /var/cache/conftool/dbconfig/20230607-211807-ladsgroup.json
- 21:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 21:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 21:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T336886)', diff saved to https://phabricator.wikimedia.org/P49188 and previous config saved to /var/cache/conftool/dbconfig/20230607-211746-ladsgroup.json
- 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P49187 and previous config saved to /var/cache/conftool/dbconfig/20230607-210457-ladsgroup.json
- 21:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P49186 and previous config saved to /var/cache/conftool/dbconfig/20230607-210240-ladsgroup.json
- 20:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1023.eqiad.wmnet with OS bullseye
- 20:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T336886)', diff saved to https://phabricator.wikimedia.org/P49185 and previous config saved to /var/cache/conftool/dbconfig/20230607-204951-ladsgroup.json
- 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P49184 and previous config saved to /var/cache/conftool/dbconfig/20230607-204734-ladsgroup.json
- 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T336886)', diff saved to https://phabricator.wikimedia.org/P49183 and previous config saved to /var/cache/conftool/dbconfig/20230607-204728-ladsgroup.json
- 20:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 20:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 20:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 20:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
- 20:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T336886)', diff saved to https://phabricator.wikimedia.org/P49182 and previous config saved to /var/cache/conftool/dbconfig/20230607-204652-ladsgroup.json
- 20:45 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
- 20:35 catrope@deploy1002: Finished scap: Backport for Link to translations of CC BY-SA 4.0 where possible (T319064) (duration: 12m 12s)
- 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T336886)', diff saved to https://phabricator.wikimedia.org/P49181 and previous config saved to /var/cache/conftool/dbconfig/20230607-203228-ladsgroup.json
- 20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P49180 and previous config saved to /var/cache/conftool/dbconfig/20230607-203146-ladsgroup.json
- 20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T336886)', diff saved to https://phabricator.wikimedia.org/P49179 and previous config saved to /var/cache/conftool/dbconfig/20230607-202733-ladsgroup.json
- 20:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 20:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 20:24 catrope@deploy1002: catrope: Backport for Link to translations of CC BY-SA 4.0 where possible (T319064) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 20:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 20:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T336886)', diff saved to https://phabricator.wikimedia.org/P49178 and previous config saved to /var/cache/conftool/dbconfig/20230607-202408-ladsgroup.json
- 20:23 catrope@deploy1002: Started scap: Backport for Link to translations of CC BY-SA 4.0 where possible (T319064)
- 20:18 catrope@deploy1002: Finished scap: Backport for Deploy GDI safety survey to JA and RU wikis. (T337728) (duration: 10m 53s)
- 20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P49177 and previous config saved to /var/cache/conftool/dbconfig/20230607-201640-ladsgroup.json
- 20:15 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
- 20:15 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
- 20:15 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
- 20:14 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
- 20:11 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
- 20:11 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet
- 20:09 catrope@deploy1002: catrope and essexigyan: Backport for Deploy GDI safety survey to JA and RU wikis. (T337728) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 20:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P49176 and previous config saved to /var/cache/conftool/dbconfig/20230607-200902-ladsgroup.json
- 20:07 catrope@deploy1002: Started scap: Backport for Deploy GDI safety survey to JA and RU wikis. (T337728)
- 20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T336886)', diff saved to https://phabricator.wikimedia.org/P49175 and previous config saved to /var/cache/conftool/dbconfig/20230607-200134-ladsgroup.json
- 19:54 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1023.eqiad.wmnet with OS bullseye
- 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P49174 and previous config saved to /var/cache/conftool/dbconfig/20230607-195356-ladsgroup.json
- 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T336886)', diff saved to https://phabricator.wikimedia.org/P49173 and previous config saved to /var/cache/conftool/dbconfig/20230607-195316-ladsgroup.json
- 19:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
- 19:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
- 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T336886)', diff saved to https://phabricator.wikimedia.org/P49172 and previous config saved to /var/cache/conftool/dbconfig/20230607-195255-ladsgroup.json
- 19:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
- 19:41 taavi: manually created 3 global accounts T338197
- 19:40 bblack: cp*: disabling puppet temporarily out of an abundance of caution
- 19:40 eevans@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
- 19:40 eevans@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
- 19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T336886)', diff saved to https://phabricator.wikimedia.org/P49171 and previous config saved to /var/cache/conftool/dbconfig/20230607-193850-ladsgroup.json
- 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P49170 and previous config saved to /var/cache/conftool/dbconfig/20230607-193749-ladsgroup.json
- 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T336886)', diff saved to https://phabricator.wikimedia.org/P49169 and previous config saved to /var/cache/conftool/dbconfig/20230607-193357-ladsgroup.json
- 19:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 19:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1149.eqiad.wmnet with reason: Maintenance
- 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49168 and previous config saved to /var/cache/conftool/dbconfig/20230607-193326-ladsgroup.json
- 19:23 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 19:23 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P49167 and previous config saved to /var/cache/conftool/dbconfig/20230607-192243-ladsgroup.json
- 19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P49166 and previous config saved to /var/cache/conftool/dbconfig/20230607-191820-ladsgroup.json
- 19:16 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool sessionstore in codfw: maintenance
- 19:11 eevans@cumin1001: START - Cookbook sre.discovery.service-route pool sessionstore in codfw: maintenance
- 19:11 urandom: (Re)pooling codfw sessionstore — T337426
- 19:09 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore2001.codfw.wmnet
- 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T336886)', diff saved to https://phabricator.wikimedia.org/P49165 and previous config saved to /var/cache/conftool/dbconfig/20230607-190737-ladsgroup.json
- 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T336886)', diff saved to https://phabricator.wikimedia.org/P49164 and previous config saved to /var/cache/conftool/dbconfig/20230607-190514-ladsgroup.json
- 19:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
- 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P49163 and previous config saved to /var/cache/conftool/dbconfig/20230607-190314-ladsgroup.json
- 19:02 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore2001.codfw.wmnet
- 18:59 pt1979@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dbproxy1022.eqiad.wmnet with OS bullseye
- 18:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 18:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
- 18:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 18:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 18:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49162 and previous config saved to /var/cache/conftool/dbconfig/20230607-184808-ladsgroup.json
- 18:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 18:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 18:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T336886)', diff saved to https://phabricator.wikimedia.org/P49161 and previous config saved to /var/cache/conftool/dbconfig/20230607-184712-ladsgroup.json
- 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49160 and previous config saved to /var/cache/conftool/dbconfig/20230607-184411-ladsgroup.json
- 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 18:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1148.eqiad.wmnet with reason: Maintenance
- 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49159 and previous config saved to /var/cache/conftool/dbconfig/20230607-184351-ladsgroup.json
- 18:41 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
- 18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P49158 and previous config saved to /var/cache/conftool/dbconfig/20230607-183206-ladsgroup.json
- 18:31 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3052.esams.wmnet
- 18:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P49157 and previous config saved to /var/cache/conftool/dbconfig/20230607-182845-ladsgroup.json
- 18:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1135.eqiad.wmnet with reason: T338354
- 18:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1135.eqiad.wmnet with reason: T338354
- 18:22 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp3052.esams.wmnet
- 18:20 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.12 refs T337526 (duration: 06m 05s)
- 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P49156 and previous config saved to /var/cache/conftool/dbconfig/20230607-181700-ladsgroup.json
- 18:14 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.12 refs T337526
- 18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P49155 and previous config saved to /var/cache/conftool/dbconfig/20230607-181339-ladsgroup.json
- 18:08 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@d90d5c8]: (no justification provided) (duration: 00m 33s)
- 18:07 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@d90d5c8]: (no justification provided)
- 18:04 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs2014.codfw.wmnet with OS bullseye
- 18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T336886)', diff saved to https://phabricator.wikimedia.org/P49154 and previous config saved to /var/cache/conftool/dbconfig/20230607-180154-ladsgroup.json
- 17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49153 and previous config saved to /var/cache/conftool/dbconfig/20230607-175833-ladsgroup.json
- 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1222 (T336886)', diff saved to https://phabricator.wikimedia.org/P49152 and previous config saved to /var/cache/conftool/dbconfig/20230607-175347-ladsgroup.json
- 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49151 and previous config saved to /var/cache/conftool/dbconfig/20230607-175337-ladsgroup.json
- 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1147.eqiad.wmnet with reason: Maintenance
- 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T336886)', diff saved to https://phabricator.wikimedia.org/P49150 and previous config saved to /var/cache/conftool/dbconfig/20230607-175327-ladsgroup.json
- 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1147.eqiad.wmnet with reason: Maintenance
- 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49149 and previous config saved to /var/cache/conftool/dbconfig/20230607-175316-ladsgroup.json
- 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3050.esams.wmnet,service=ats-be
- 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3050.esams.wmnet,service=cdn
- 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3051.esams.wmnet,service=ats-be
- 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3051.esams.wmnet,service=cdn
- 17:46 inflatador: bking@wdqs depool wdqs2012 T321605
- 17:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3051.esams.wmnet
- 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P49148 and previous config saved to /var/cache/conftool/dbconfig/20230607-173821-ladsgroup.json
- 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P49147 and previous config saved to /var/cache/conftool/dbconfig/20230607-173810-ladsgroup.json
- 17:34 cwhite@cumin2002: dbctl commit (dc=all): 'depool db1135', diff saved to https://phabricator.wikimedia.org/P49146 and previous config saved to /var/cache/conftool/dbconfig/20230607-173453-cwhite.json
- 17:33 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp3051.esams.wmnet
- 17:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
- 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P49145 and previous config saved to /var/cache/conftool/dbconfig/20230607-172315-ladsgroup.json
- 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P49144 and previous config saved to /var/cache/conftool/dbconfig/20230607-172304-ladsgroup.json
- 17:13 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 17:13 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 17:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 17:12 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 17:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 17:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T336886)', diff saved to https://phabricator.wikimedia.org/P49143 and previous config saved to /var/cache/conftool/dbconfig/20230607-170808-ladsgroup.json
- 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49142 and previous config saved to /var/cache/conftool/dbconfig/20230607-170758-ladsgroup.json
- 17:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
- 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T336886)', diff saved to https://phabricator.wikimedia.org/P49141 and previous config saved to /var/cache/conftool/dbconfig/20230607-170551-ladsgroup.json
- 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T336886)', diff saved to https://phabricator.wikimedia.org/P49140 and previous config saved to /var/cache/conftool/dbconfig/20230607-170530-ladsgroup.json
- 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49139 and previous config saved to /var/cache/conftool/dbconfig/20230607-170252-ladsgroup.json
- 17:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 17:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49138 and previous config saved to /var/cache/conftool/dbconfig/20230607-165934-ladsgroup.json
- 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P49137 and previous config saved to /var/cache/conftool/dbconfig/20230607-165024-ladsgroup.json
- 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P49135 and previous config saved to /var/cache/conftool/dbconfig/20230607-164428-ladsgroup.json
- 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P49134 and previous config saved to /var/cache/conftool/dbconfig/20230607-163518-ladsgroup.json
- 16:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
- 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P49133 and previous config saved to /var/cache/conftool/dbconfig/20230607-162922-ladsgroup.json
- 16:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:29 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:23 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3050.esams.wmnet
- 16:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs2014']
- 16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T336886)', diff saved to https://phabricator.wikimedia.org/P49132 and previous config saved to /var/cache/conftool/dbconfig/20230607-162012-ladsgroup.json
- 16:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T336886)', diff saved to https://phabricator.wikimedia.org/P49131 and previous config saved to /var/cache/conftool/dbconfig/20230607-161800-ladsgroup.json
- 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 16:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T336886)', diff saved to https://phabricator.wikimedia.org/P49130 and previous config saved to /var/cache/conftool/dbconfig/20230607-161740-ladsgroup.json
- 16:15 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp3050.esams.wmnet
- 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49129 and previous config saved to /var/cache/conftool/dbconfig/20230607-161416-ladsgroup.json
- 16:13 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:12 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
- 16:11 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
- 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49128 and previous config saved to /var/cache/conftool/dbconfig/20230607-160912-ladsgroup.json
- 16:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 16:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T336886)', diff saved to https://phabricator.wikimedia.org/P49127 and previous config saved to /var/cache/conftool/dbconfig/20230607-160851-ladsgroup.json
- 16:07 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 16:04 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin2002"
- 16:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2014.mgmt.codfw.wmnet with reboot policy FORCED
- 16:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P49126 and previous config saved to /var/cache/conftool/dbconfig/20230607-160234-ladsgroup.json
- 16:00 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lists1003.wikimedia.org
- 15:57 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 15:56 urandom: Beginning (3 hour) generated traffic testing of sessionstore.svc.codfw.wmnet — T337426
- 15:56 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P49125 and previous config saved to /var/cache/conftool/dbconfig/20230607-155345-ladsgroup.json
- 15:52 urandom: Upgrading Cassandra to 4.1.1, sessionstore2003 — T337426
- 15:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists1003.wikimedia.org
- 15:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
- 15:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P49124 and previous config saved to /var/cache/conftool/dbconfig/20230607-154727-ladsgroup.json
- 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
- 15:44 urandom: Upgrading Cassandra to 4.1.1, sessionstore2002 — T337426
- 15:43 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host lvs2014.mgmt.codfw.wmnet with reboot policy FORCED
- 15:42 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:42 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for lvs2014 - pt1979@cumin2002"
- 15:41 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for lvs2014 - pt1979@cumin2002"
- 15:40 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetserver2001.codfw.wmnet with reason: host reimage
- 15:39 moritzm: installing isc-dhcp bugfixes updates from Bullseye 11.7 point release
- 15:38 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P49123 and previous config saved to /var/cache/conftool/dbconfig/20230607-153839-ladsgroup.json
- 15:37 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetserver2001.codfw.wmnet with reason: host reimage
- 15:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 15:33 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 15:33 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 15:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T336886)', diff saved to https://phabricator.wikimedia.org/P49122 and previous config saved to /var/cache/conftool/dbconfig/20230607-153221-ladsgroup.json
- 15:26 moritzm: rolling restart of FPM on mw canaries to pick up libwebp security updates
- 15:26 pt1979@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
- 15:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T336886)', diff saved to https://phabricator.wikimedia.org/P49121 and previous config saved to /var/cache/conftool/dbconfig/20230607-152456-ladsgroup.json
- 15:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 15:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49120 and previous config saved to /var/cache/conftool/dbconfig/20230607-152425-ladsgroup.json
- 15:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T336886)', diff saved to https://phabricator.wikimedia.org/P49119 and previous config saved to /var/cache/conftool/dbconfig/20230607-152333-ladsgroup.json
- 15:23 elukey: all varnishkafka instances on caching nodes are getting restarted due to https://gerrit.wikimedia.org/r/c/operations/puppet/+/928087 - T337825
- 15:22 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 15:22 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 15:22 elukey: re-enable puppet on caching nodes
- 15:22 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 15:21 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 15:21 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 15:21 claime: Bumping prewarmparsoid concurrency to 45 in changeprop-jobqueue - T320534
- 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T336886)', diff saved to https://phabricator.wikimedia.org/P49118 and previous config saved to /var/cache/conftool/dbconfig/20230607-151835-ladsgroup.json
- 15:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 15:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T336886)', diff saved to https://phabricator.wikimedia.org/P49117 and previous config saved to /var/cache/conftool/dbconfig/20230607-151815-ladsgroup.json
- 15:17 moritzm: installing libwebp security updates on buster
- 15:17 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2001.codfw.wmnet with OS bookworm
- 15:17 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetserver2001.codfw.wmnet with OS bookworm
- 15:14 urandom: Upgrading Cassandra to 4.1.1, sessionstore2001 — T337426
- 15:14 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 15:10 elukey: disable puppet on all caching nodes to roll out a varnishkafka change (ref: https://gerrit.wikimedia.org/r/c/operations/puppet/+/928087)
- 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P49116 and previous config saved to /var/cache/conftool/dbconfig/20230607-150919-ladsgroup.json
- 15:08 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2001.codfw.wmnet with OS bookworm
- 15:07 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool sessionstore in codfw: maintenance
- 15:06 jbond@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetserver2001.mgmt.codfw.wmnet on all recursors
- 15:06 jbond@cumin2002: START - Cookbook sre.dns.wipe-cache puppetserver2001.mgmt.codfw.wmnet on all recursors
- 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P49115 and previous config saved to /var/cache/conftool/dbconfig/20230607-150309-ladsgroup.json
- 15:02 eevans@cumin1001: START - Cookbook sre.discovery.service-route depool sessionstore in codfw: maintenance
- 15:02 urandom: de-pooling sessionstore/codfw — T337426
- 14:56 sukhe: homer "cr*-codfw*" commit "Gerrit: 928068 remove decommissioned host lvs2010"
- 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetserver1001.eqiad.wmnet with OS bookworm
- 14:54 jbond@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin1001"
- 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P49114 and previous config saved to /var/cache/conftool/dbconfig/20230607-145413-ladsgroup.json
- 14:54 moritzm: installing postgresql 13 security updates (clients/libs, server instances all updated already)
- 14:53 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin1001"
- 14:51 jbond@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:50 jbond@cumin2002: START - Cookbook sre.dns.netbox
- 14:49 jbond@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs2010.codfw.wmnet
- 14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:48 sukhe@cumin2002: START - Cookbook sre.dns.netbox
- 14:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P49112 and previous config saved to /var/cache/conftool/dbconfig/20230607-144803-ladsgroup.json
- 14:43 jbond@cumin2002: START - Cookbook sre.dns.netbox
- 14:40 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetserver1001.eqiad.wmnet with reason: host reimage
- 14:40 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_eqiad and A:cp
- 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49111 and previous config saved to /var/cache/conftool/dbconfig/20230607-143907-ladsgroup.json
- 14:39 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs2010.codfw.wmnet
- 14:37 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetserver1001.eqiad.wmnet with reason: host reimage
- 14:36 cgoubert@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 14:33 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 14:33 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_eqiad and A:cp
- 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T336886)', diff saved to https://phabricator.wikimedia.org/P49110 and previous config saved to /var/cache/conftool/dbconfig/20230607-143256-ladsgroup.json
- 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49109 and previous config saved to /var/cache/conftool/dbconfig/20230607-143235-ladsgroup.json
- 14:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 14:32 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T336886)', diff saved to https://phabricator.wikimedia.org/P49108 and previous config saved to /var/cache/conftool/dbconfig/20230607-143215-ladsgroup.json
- 14:32 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 14:31 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T336886)', diff saved to https://phabricator.wikimedia.org/P49107 and previous config saved to /var/cache/conftool/dbconfig/20230607-142756-ladsgroup.json
- 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1142.eqiad.wmnet with reason: Maintenance
- 14:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1142.eqiad.wmnet with reason: Maintenance
- 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T336886)', diff saved to https://phabricator.wikimedia.org/P49106 and previous config saved to /var/cache/conftool/dbconfig/20230607-142736-ladsgroup.json
- 14:26 cgoubert@deploy1002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 14:25 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetserver1001.eqiad.wmnet with OS bookworm
- 14:24 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetserver1001.eqiad.wmnet with OS bookworm
- 14:23 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
- 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P49104 and previous config saved to /var/cache/conftool/dbconfig/20230607-141709-ladsgroup.json
- 14:17 aborrero@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudnet2006-dev
- 14:16 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2006-dev
- 14:14 aborrero@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudnet2005-dev
- 14:14 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2005-dev
- 14:14 aborrero@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudnet2006-dev
- 14:13 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2006-dev
- 14:13 aborrero@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudnet2005-dev
- 14:13 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2005-dev
- 14:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P49103 and previous config saved to /var/cache/conftool/dbconfig/20230607-141230-ladsgroup.json
- 14:10 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Enable 'multi-line' mode in preg_match() for wikitextToHTML regex (T338264) (duration: 09m 16s)
- 14:05 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetserver1001.eqiad.wmnet with OS bookworm
- 14:03 lucaswerkmeister-wmde@deploy1002: d3r1ck01 and lucaswerkmeister-wmde: Backport for Enable 'multi-line' mode in preg_match() for wikitextToHTML regex (T338264) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
- 14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P49102 and previous config saved to /var/cache/conftool/dbconfig/20230607-140203-ladsgroup.json
- 14:01 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Enable 'multi-line' mode in preg_match() for wikitextToHTML regex (T338264)
- 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P49101 and previous config saved to /var/cache/conftool/dbconfig/20230607-135724-ladsgroup.json
- 13:47 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Enable cache warming jobs for parsoid per default. (T329366) (duration: 10m 27s)
- 13:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T336886)', diff saved to https://phabricator.wikimedia.org/P49100 and previous config saved to /var/cache/conftool/dbconfig/20230607-134656-ladsgroup.json
- 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T336886)', diff saved to https://phabricator.wikimedia.org/P49099 and previous config saved to /var/cache/conftool/dbconfig/20230607-134218-ladsgroup.json
- 13:40 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
- 13:39 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
- 13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T336886)', diff saved to https://phabricator.wikimedia.org/P49098 and previous config saved to /var/cache/conftool/dbconfig/20230607-133933-ladsgroup.json
- 13:39 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
- 13:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 13:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49097 and previous config saved to /var/cache/conftool/dbconfig/20230607-133854-ladsgroup.json
- 13:38 lucaswerkmeister-wmde@deploy1002: daniel and lucaswerkmeister-wmde: Backport for Enable cache warming jobs for parsoid per default. (T329366) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 13:38 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
- 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T336886)', diff saved to https://phabricator.wikimedia.org/P49096 and previous config saved to /var/cache/conftool/dbconfig/20230607-133725-ladsgroup.json
- 13:37 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Enable cache warming jobs for parsoid per default. (T329366)
- 13:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1141.eqiad.wmnet with reason: Maintenance
- 13:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1141.eqiad.wmnet with reason: Maintenance
- 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T336886)', diff saved to https://phabricator.wikimedia.org/P49095 and previous config saved to /var/cache/conftool/dbconfig/20230607-133704-ladsgroup.json
- 13:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
- 13:29 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1024.mgmt.eqiad.wmnet with reboot policy FORCED
- 13:29 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
- 13:28 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1024.mgmt.eqiad.wmnet with reboot policy FORCED
- 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P49093 and previous config saved to /var/cache/conftool/dbconfig/20230607-132348-ladsgroup.json
- 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P49092 and previous config saved to /var/cache/conftool/dbconfig/20230607-132158-ladsgroup.json
- 13:20 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
- 13:20 topranks: removing remote vlan configuration from lsw1-f1-eqiad T322937
- 13:19 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
- 13:10 ladsgroup@deploy1002: Finished scap: Backport for Revert "Revert "Remove legacy encoding option from dawiktionary"" (duration: 07m 11s)
- 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P49090 and previous config saved to /var/cache/conftool/dbconfig/20230607-130841-ladsgroup.json
- 13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P49089 and previous config saved to /var/cache/conftool/dbconfig/20230607-130651-ladsgroup.json
- 13:04 ladsgroup@deploy1002: ladsgroup: Backport for Revert "Revert "Remove legacy encoding option from dawiktionary"" synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 13:03 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
- 13:03 ladsgroup@deploy1002: Started scap: Backport for Revert "Revert "Remove legacy encoding option from dawiktionary""
- 13:02 cmooney@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937 (duration: 11m 45s)
- 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49088 and previous config saved to /var/cache/conftool/dbconfig/20230607-125335-ladsgroup.json
- 12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T336886)', diff saved to https://phabricator.wikimedia.org/P49087 and previous config saved to /var/cache/conftool/dbconfig/20230607-125145-ladsgroup.json
- 12:51 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetserver1001.eqiad.wmnet with OS bookworm
- 12:50 topranks: Depooling lvs1019 to move link from lsw1-f1-eqiad to ssw1-f1-eqiad
- 12:50 cmooney@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937
- 12:46 Amir1: mwscript maintenance/storage/moveToExternal.php --iconv DB cluster27 on dawiktionary and svwiktionary (T128155 and T128156)
- 12:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49086 and previous config saved to /var/cache/conftool/dbconfig/20230607-124543-ladsgroup.json
- 12:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 12:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
- 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 12:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
- 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T336886)', diff saved to https://phabricator.wikimedia.org/P49085 and previous config saved to /var/cache/conftool/dbconfig/20230607-123926-ladsgroup.json
- 12:37 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:37 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet - aborrero@cumin2002"
- 12:36 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet - aborrero@cumin2002"
- 12:33 aborrero@cumin2002: START - Cookbook sre.dns.netbox
- 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T336886)', diff saved to https://phabricator.wikimedia.org/P49084 and previous config saved to /var/cache/conftool/dbconfig/20230607-123002-ladsgroup.json
- 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P49083 and previous config saved to /var/cache/conftool/dbconfig/20230607-122420-ladsgroup.json
- 12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P49082 and previous config saved to /var/cache/conftool/dbconfig/20230607-121456-ladsgroup.json
- 12:13 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetserver1001.eqiad.wmnet with OS bookworm
- 12:12 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetserver1001.eqiad.wmnet on all recursors
- 12:12 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache puppetserver1001.eqiad.wmnet on all recursors
- 12:11 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetserver.eqiad.wmnet on all recursors
- 12:11 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache puppetserver.eqiad.wmnet on all recursors
- 12:11 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:10 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
- 12:09 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
- 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P49081 and previous config saved to /var/cache/conftool/dbconfig/20230607-120914-ladsgroup.json
- 12:07 jbond@cumin1001: START - Cookbook sre.dns.netbox
- 12:07 jbond@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host puppetserver1001
- 12:06 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver1001
- 12:06 jbond@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host puppetserver2001
- 12:04 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver2001
- 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P49080 and previous config saved to /var/cache/conftool/dbconfig/20230607-115950-ladsgroup.json
- 11:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T336886)', diff saved to https://phabricator.wikimedia.org/P49079 and previous config saved to /var/cache/conftool/dbconfig/20230607-115408-ladsgroup.json
- 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T336886)', diff saved to https://phabricator.wikimedia.org/P49078 and previous config saved to /var/cache/conftool/dbconfig/20230607-115156-ladsgroup.json
- 11:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 11:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1129.eqiad.wmnet with reason: Maintenance
- 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 (T336886)', diff saved to https://phabricator.wikimedia.org/P49077 and previous config saved to /var/cache/conftool/dbconfig/20230607-115124-ladsgroup.json
- 11:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1138.eqiad.wmnet with reason: Maintenance
- 11:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1138.eqiad.wmnet with reason: Maintenance
- 11:48 jbond@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host puppetserver2001
- 11:46 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver2001
- 11:46 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:46 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
- 11:45 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
- 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T336886)', diff saved to https://phabricator.wikimedia.org/P49076 and previous config saved to /var/cache/conftool/dbconfig/20230607-114444-ladsgroup.json
- 11:44 jbond@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host puppetserver1001
- 11:43 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver1001
- 11:43 jbond@cumin1001: START - Cookbook sre.dns.netbox
- 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T336886)', diff saved to https://phabricator.wikimedia.org/P49075 and previous config saved to /var/cache/conftool/dbconfig/20230607-114120-ladsgroup.json
- 11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 11:41 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
- 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49074 and previous config saved to /var/cache/conftool/dbconfig/20230607-114059-ladsgroup.json
- 11:40 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
- 11:35 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 11:35 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 11:30 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster2005
- 11:30 jbond@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:30 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster1005
- 11:30 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:30 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1005 decommissioned, removing all IPs except the asset tag one - jbond@cumin1001"
- 11:29 jbond@cumin2002: START - Cookbook sre.dns.netbox
- 11:27 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1005 decommissioned, removing all IPs except the asset tag one - jbond@cumin1001"
- 11:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P49073 and previous config saved to /var/cache/conftool/dbconfig/20230607-112553-ladsgroup.json
- 11:24 jbond@cumin1001: START - Cookbook sre.dns.netbox
- 11:24 jbond@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster2005
- 11:23 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts puppetmaster1005
- 11:22 jbond@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster1005
- 11:17 jbond@cumin1001: START - Cookbook sre.hosts.decommission for hosts puppetmaster1005
- 11:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P49072 and previous config saved to /var/cache/conftool/dbconfig/20230607-111047-ladsgroup.json
- 10:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49071 and previous config saved to /var/cache/conftool/dbconfig/20230607-105541-ladsgroup.json
- 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49070 and previous config saved to /var/cache/conftool/dbconfig/20230607-105215-ladsgroup.json
- 10:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49069 and previous config saved to /var/cache/conftool/dbconfig/20230607-105154-ladsgroup.json
- 10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P49068 and previous config saved to /var/cache/conftool/dbconfig/20230607-103648-ladsgroup.json
- 10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P49066 and previous config saved to /var/cache/conftool/dbconfig/20230607-102141-ladsgroup.json
- 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49065 and previous config saved to /var/cache/conftool/dbconfig/20230607-100635-ladsgroup.json
- 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49064 and previous config saved to /var/cache/conftool/dbconfig/20230607-100307-ladsgroup.json
- 10:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 10:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T336886)', diff saved to https://phabricator.wikimedia.org/P49063 and previous config saved to /var/cache/conftool/dbconfig/20230607-100247-ladsgroup.json
- 09:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P49062 and previous config saved to /var/cache/conftool/dbconfig/20230607-094740-ladsgroup.json
- 09:33 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir
- 09:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P49061 and previous config saved to /var/cache/conftool/dbconfig/20230607-093234-ladsgroup.json
- 09:21 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T336886)', diff saved to https://phabricator.wikimedia.org/P49060 and previous config saved to /var/cache/conftool/dbconfig/20230607-091728-ladsgroup.json
- 09:17 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T336886)', diff saved to https://phabricator.wikimedia.org/P49059 and previous config saved to /var/cache/conftool/dbconfig/20230607-091402-ladsgroup.json
- 09:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 09:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T336886)', diff saved to https://phabricator.wikimedia.org/P49058 and previous config saved to /var/cache/conftool/dbconfig/20230607-091341-ladsgroup.json
- 09:07 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 09:06 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 09:00 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 08:59 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_eqiad and A:cp
- 08:59 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_eqiad and A:cp
- 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P49057 and previous config saved to /var/cache/conftool/dbconfig/20230607-085835-ladsgroup.json
- 08:49 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P49056 and previous config saved to /var/cache/conftool/dbconfig/20230607-084329-ladsgroup.json
- 08:34 fabfur: disable puppet on A:cp-eqiad for varnish <-> haproxy port 80 swap
- 08:29 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir
- 08:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T336886)', diff saved to https://phabricator.wikimedia.org/P49055 and previous config saved to /var/cache/conftool/dbconfig/20230607-082823-ladsgroup.json
- 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T336886)', diff saved to https://phabricator.wikimedia.org/P49054 and previous config saved to /var/cache/conftool/dbconfig/20230607-082500-ladsgroup.json
- 08:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 08:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 08:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 08:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 08:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T336886)', diff saved to https://phabricator.wikimedia.org/P49053 and previous config saved to /var/cache/conftool/dbconfig/20230607-082434-ladsgroup.json
- 08:22 moritzm: uploaded ruby 2.5.5-3+deb10u5+wmf1 to apt.wikimedia.org, unbreaking Puppet runs after latest Ruby update for Buster T338294
- 08:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P49052 and previous config saved to /var/cache/conftool/dbconfig/20230607-080928-ladsgroup.json
- 07:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P49051 and previous config saved to /var/cache/conftool/dbconfig/20230607-075422-ladsgroup.json
- 07:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T336886)', diff saved to https://phabricator.wikimedia.org/P49050 and previous config saved to /var/cache/conftool/dbconfig/20230607-073916-ladsgroup.json
- 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T336886)', diff saved to https://phabricator.wikimedia.org/P49049 and previous config saved to /var/cache/conftool/dbconfig/20230607-073554-ladsgroup.json
- 07:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 07:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T336886)', diff saved to https://phabricator.wikimedia.org/P49048 and previous config saved to /var/cache/conftool/dbconfig/20230607-073533-ladsgroup.json
- 07:22 kartik@deploy1002: Finished scap: Backport for Use direct Parsoid in Small and Medium Wikis for Content Translation (T337922) (duration: 18m 06s)
- 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P49047 and previous config saved to /var/cache/conftool/dbconfig/20230607-072027-ladsgroup.json
- 07:06 kartik@deploy1002: kartik: Backport for Use direct Parsoid in Small and Medium Wikis for Content Translation (T337922) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P49046 and previous config saved to /var/cache/conftool/dbconfig/20230607-070521-ladsgroup.json
- 07:04 kartik@deploy1002: Started scap: Backport for Use direct Parsoid in Small and Medium Wikis for Content Translation (T337922)
- 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T336886)', diff saved to https://phabricator.wikimedia.org/P49045 and previous config saved to /var/cache/conftool/dbconfig/20230607-065015-ladsgroup.json
- 06:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T336886)', diff saved to https://phabricator.wikimedia.org/P49044 and previous config saved to /var/cache/conftool/dbconfig/20230607-064652-ladsgroup.json
- 06:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 06:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
- 06:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T336886)', diff saved to https://phabricator.wikimedia.org/P49043 and previous config saved to /var/cache/conftool/dbconfig/20230607-064631-ladsgroup.json
- 06:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T336886)', diff saved to https://phabricator.wikimedia.org/P49042 and previous config saved to /var/cache/conftool/dbconfig/20230607-064215-ladsgroup.json
- 06:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P49041 and previous config saved to /var/cache/conftool/dbconfig/20230607-063125-ladsgroup.json
- 06:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P49040 and previous config saved to /var/cache/conftool/dbconfig/20230607-062709-ladsgroup.json
- 06:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P49039 and previous config saved to /var/cache/conftool/dbconfig/20230607-061618-ladsgroup.json
- 06:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P49038 and previous config saved to /var/cache/conftool/dbconfig/20230607-061203-ladsgroup.json
- 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T336886)', diff saved to https://phabricator.wikimedia.org/P49037 and previous config saved to /var/cache/conftool/dbconfig/20230607-060112-ladsgroup.json
- 05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2161 (T336886)', diff saved to https://phabricator.wikimedia.org/P49036 and previous config saved to /var/cache/conftool/dbconfig/20230607-055746-ladsgroup.json
- 05:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 05:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T336886)', diff saved to https://phabricator.wikimedia.org/P49035 and previous config saved to /var/cache/conftool/dbconfig/20230607-055726-ladsgroup.json
- 05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T336886)', diff saved to https://phabricator.wikimedia.org/P49034 and previous config saved to /var/cache/conftool/dbconfig/20230607-055655-ladsgroup.json
- 05:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T336886)', diff saved to https://phabricator.wikimedia.org/P49033 and previous config saved to /var/cache/conftool/dbconfig/20230607-055320-ladsgroup.json
- 05:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 05:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 05:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49032 and previous config saved to /var/cache/conftool/dbconfig/20230607-055259-ladsgroup.json
- 05:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P49031 and previous config saved to /var/cache/conftool/dbconfig/20230607-054220-ladsgroup.json
- 05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P49030 and previous config saved to /var/cache/conftool/dbconfig/20230607-053753-ladsgroup.json
- 05:28 kart_: Updated cxserver to 2023-06-07-044025-production (T337290, T337669, T337834)
- 05:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P49029 and previous config saved to /var/cache/conftool/dbconfig/20230607-052713-ladsgroup.json
- 05:25 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 05:25 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 05:22 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P49028 and previous config saved to /var/cache/conftool/dbconfig/20230607-052247-ladsgroup.json
- 05:22 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 05:17 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 05:17 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T336886)', diff saved to https://phabricator.wikimedia.org/P49027 and previous config saved to /var/cache/conftool/dbconfig/20230607-051207-ladsgroup.json
- 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T336886)', diff saved to https://phabricator.wikimedia.org/P49026 and previous config saved to /var/cache/conftool/dbconfig/20230607-050844-ladsgroup.json
- 05:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 05:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T336886)', diff saved to https://phabricator.wikimedia.org/P49025 and previous config saved to /var/cache/conftool/dbconfig/20230607-050823-ladsgroup.json
- 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49024 and previous config saved to /var/cache/conftool/dbconfig/20230607-050740-ladsgroup.json
- 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49023 and previous config saved to /var/cache/conftool/dbconfig/20230607-050258-ladsgroup.json
- 05:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 05:02 kart_: Updated MinT to 2023-06-06-120533-production (T337910, T337686, T337708)
- 05:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 05:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T336886)', diff saved to https://phabricator.wikimedia.org/P49022 and previous config saved to /var/cache/conftool/dbconfig/20230607-050237-ladsgroup.json
- 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P49021 and previous config saved to /var/cache/conftool/dbconfig/20230607-045317-ladsgroup.json
- 04:51 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 04:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P49020 and previous config saved to /var/cache/conftool/dbconfig/20230607-044731-ladsgroup.json
- 04:45 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 04:39 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 04:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P49019 and previous config saved to /var/cache/conftool/dbconfig/20230607-043810-ladsgroup.json
- 04:36 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 04:32 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 04:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P49018 and previous config saved to /var/cache/conftool/dbconfig/20230607-043225-ladsgroup.json
- 04:31 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 04:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T336886)', diff saved to https://phabricator.wikimedia.org/P49017 and previous config saved to /var/cache/conftool/dbconfig/20230607-042304-ladsgroup.json
- 04:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T336886)', diff saved to https://phabricator.wikimedia.org/P49016 and previous config saved to /var/cache/conftool/dbconfig/20230607-042040-ladsgroup.json
- 04:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 04:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 04:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 04:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 04:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T336886)', diff saved to https://phabricator.wikimedia.org/P49015 and previous config saved to /var/cache/conftool/dbconfig/20230607-041719-ladsgroup.json
- 04:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 04:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 04:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 04:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
- 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T336886)', diff saved to https://phabricator.wikimedia.org/P49014 and previous config saved to /var/cache/conftool/dbconfig/20230607-041357-ladsgroup.json
- 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T336886)', diff saved to https://phabricator.wikimedia.org/P49013 and previous config saved to /var/cache/conftool/dbconfig/20230607-041347-ladsgroup.json
- 04:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 04:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49012 and previous config saved to /var/cache/conftool/dbconfig/20230607-041326-ladsgroup.json
- 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P49011 and previous config saved to /var/cache/conftool/dbconfig/20230607-035851-ladsgroup.json
- 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P49010 and previous config saved to /var/cache/conftool/dbconfig/20230607-035820-ladsgroup.json
- 03:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P49009 and previous config saved to /var/cache/conftool/dbconfig/20230607-034345-ladsgroup.json
- 03:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P49008 and previous config saved to /var/cache/conftool/dbconfig/20230607-034314-ladsgroup.json
- 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T336886)', diff saved to https://phabricator.wikimedia.org/P49007 and previous config saved to /var/cache/conftool/dbconfig/20230607-032839-ladsgroup.json
- 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49006 and previous config saved to /var/cache/conftool/dbconfig/20230607-032808-ladsgroup.json
- 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1214 (T336886)', diff saved to https://phabricator.wikimedia.org/P49005 and previous config saved to /var/cache/conftool/dbconfig/20230607-032522-ladsgroup.json
- 03:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 03:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T336886)', diff saved to https://phabricator.wikimedia.org/P49004 and previous config saved to /var/cache/conftool/dbconfig/20230607-032501-ladsgroup.json
- 03:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49003 and previous config saved to /var/cache/conftool/dbconfig/20230607-032428-ladsgroup.json
- 03:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 03:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
- 03:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49002 and previous config saved to /var/cache/conftool/dbconfig/20230607-032407-ladsgroup.json
- 03:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P49001 and previous config saved to /var/cache/conftool/dbconfig/20230607-030955-ladsgroup.json
- 03:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P49000 and previous config saved to /var/cache/conftool/dbconfig/20230607-030901-ladsgroup.json
- 02:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P48999 and previous config saved to /var/cache/conftool/dbconfig/20230607-025449-ladsgroup.json
- 02:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P48998 and previous config saved to /var/cache/conftool/dbconfig/20230607-025355-ladsgroup.json
- 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T336886)', diff saved to https://phabricator.wikimedia.org/P48997 and previous config saved to /var/cache/conftool/dbconfig/20230607-023943-ladsgroup.json
- 02:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T336886)', diff saved to https://phabricator.wikimedia.org/P48996 and previous config saved to /var/cache/conftool/dbconfig/20230607-023848-ladsgroup.json
- 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1211 (T336886)', diff saved to https://phabricator.wikimedia.org/P48995 and previous config saved to /var/cache/conftool/dbconfig/20230607-023624-ladsgroup.json
- 02:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T336886)', diff saved to https://phabricator.wikimedia.org/P48994 and previous config saved to /var/cache/conftool/dbconfig/20230607-023613-ladsgroup.json
- 02:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 02:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
- 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T336886)', diff saved to https://phabricator.wikimedia.org/P48993 and previous config saved to /var/cache/conftool/dbconfig/20230607-023603-ladsgroup.json
- 02:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 02:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
- 02:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
- 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T336886)', diff saved to https://phabricator.wikimedia.org/P48992 and previous config saved to /var/cache/conftool/dbconfig/20230607-023537-ladsgroup.json
- 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P48991 and previous config saved to /var/cache/conftool/dbconfig/20230607-022057-ladsgroup.json
- 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P48990 and previous config saved to /var/cache/conftool/dbconfig/20230607-022031-ladsgroup.json
- 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P48989 and previous config saved to /var/cache/conftool/dbconfig/20230607-020550-ladsgroup.json
- 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P48988 and previous config saved to /var/cache/conftool/dbconfig/20230607-020518-ladsgroup.json
- 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T336886)', diff saved to https://phabricator.wikimedia.org/P48987 and previous config saved to /var/cache/conftool/dbconfig/20230607-015043-ladsgroup.json
- 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T336886)', diff saved to https://phabricator.wikimedia.org/P48986 and previous config saved to /var/cache/conftool/dbconfig/20230607-015012-ladsgroup.json
- 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T336886)', diff saved to https://phabricator.wikimedia.org/P48985 and previous config saved to /var/cache/conftool/dbconfig/20230607-014635-ladsgroup.json
- 01:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1209 (T336886)', diff saved to https://phabricator.wikimedia.org/P48984 and previous config saved to /var/cache/conftool/dbconfig/20230607-014626-ladsgroup.json
- 01:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 01:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2123.codfw.wmnet with reason: Maintenance
- 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T336886)', diff saved to https://phabricator.wikimedia.org/P48983 and previous config saved to /var/cache/conftool/dbconfig/20230607-014614-ladsgroup.json
- 01:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
- 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T336886)', diff saved to https://phabricator.wikimedia.org/P48982 and previous config saved to /var/cache/conftool/dbconfig/20230607-014605-ladsgroup.json
- 01:39 sukhe: run authdns-update: T338280
- 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P48981 and previous config saved to /var/cache/conftool/dbconfig/20230607-013108-ladsgroup.json
- 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P48980 and previous config saved to /var/cache/conftool/dbconfig/20230607-013059-ladsgroup.json
- 01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P48979 and previous config saved to /var/cache/conftool/dbconfig/20230607-011602-ladsgroup.json
- 01:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P48978 and previous config saved to /var/cache/conftool/dbconfig/20230607-011553-ladsgroup.json
- 01:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T336886)', diff saved to https://phabricator.wikimedia.org/P48977 and previous config saved to /var/cache/conftool/dbconfig/20230607-010055-ladsgroup.json
- 01:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T336886)', diff saved to https://phabricator.wikimedia.org/P48976 and previous config saved to /var/cache/conftool/dbconfig/20230607-010047-ladsgroup.json
- 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1203 (T336886)', diff saved to https://phabricator.wikimedia.org/P48975 and previous config saved to /var/cache/conftool/dbconfig/20230607-005722-ladsgroup.json
- 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T336886)', diff saved to https://phabricator.wikimedia.org/P48974 and previous config saved to /var/cache/conftool/dbconfig/20230607-005713-ladsgroup.json
- 00:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 00:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance
- 00:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T336886)', diff saved to https://phabricator.wikimedia.org/P48973 and previous config saved to /var/cache/conftool/dbconfig/20230607-005654-ladsgroup.json
- 00:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance
- 00:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
- 00:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
- 00:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 00:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 00:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48972 and previous config saved to /var/cache/conftool/dbconfig/20230607-005155-ladsgroup.json
- 00:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P48971 and previous config saved to /var/cache/conftool/dbconfig/20230607-004148-ladsgroup.json
- 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P48970 and previous config saved to /var/cache/conftool/dbconfig/20230607-003649-ladsgroup.json
- 00:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P48969 and previous config saved to /var/cache/conftool/dbconfig/20230607-002642-ladsgroup.json
- 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P48968 and previous config saved to /var/cache/conftool/dbconfig/20230607-002143-ladsgroup.json
- 00:14 urbanecm: Deployed security patch for T338276
- 00:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T336886)', diff saved to https://phabricator.wikimedia.org/P48967 and previous config saved to /var/cache/conftool/dbconfig/20230607-001136-ladsgroup.json
- 00:08 urbanecm: Deployed security patch for T338276
- 00:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1193 (T336886)', diff saved to https://phabricator.wikimedia.org/P48966 and previous config saved to /var/cache/conftool/dbconfig/20230607-000814-ladsgroup.json
- 00:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 00:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T336886)', diff saved to https://phabricator.wikimedia.org/P48965 and previous config saved to /var/cache/conftool/dbconfig/20230607-000754-ladsgroup.json
- 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48964 and previous config saved to /var/cache/conftool/dbconfig/20230607-000637-ladsgroup.json
- 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1213:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48963 and previous config saved to /var/cache/conftool/dbconfig/20230607-000337-ladsgroup.json
- 00:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 00:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T336886)', diff saved to https://phabricator.wikimedia.org/P48962 and previous config saved to /var/cache/conftool/dbconfig/20230607-000316-ladsgroup.json
- 00:01 urbanecm: Deploying security patch for T338276
2023-06-06
- 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48961 and previous config saved to /var/cache/conftool/dbconfig/20230606-235248-ladsgroup.json
- 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48960 and previous config saved to /var/cache/conftool/dbconfig/20230606-234810-ladsgroup.json
- 23:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-a1-codfw.mgmt.codfw.wmnet
- 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48959 and previous config saved to /var/cache/conftool/dbconfig/20230606-233742-ladsgroup.json
- 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48958 and previous config saved to /var/cache/conftool/dbconfig/20230606-233304-ladsgroup.json
- 23:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
- 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T336886)', diff saved to https://phabricator.wikimedia.org/P48955 and previous config saved to /var/cache/conftool/dbconfig/20230606-232235-ladsgroup.json
- 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-a1-codfw - pt1979@cumin2002"
- 23:19 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-a1-codfw - pt1979@cumin2002"
- 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1192 (T336886)', diff saved to https://phabricator.wikimedia.org/P48954 and previous config saved to /var/cache/conftool/dbconfig/20230606-231913-ladsgroup.json
- 23:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 23:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T336886)', diff saved to https://phabricator.wikimedia.org/P48953 and previous config saved to /var/cache/conftool/dbconfig/20230606-231853-ladsgroup.json
- 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T336886)', diff saved to https://phabricator.wikimedia.org/P48952 and previous config saved to /var/cache/conftool/dbconfig/20230606-231758-ladsgroup.json
- 23:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:16 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-a1-codfw.mgmt.codfw.wmnet
- 23:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device ssw1-a1-codfw.mgmt.codfw.wmnet
- 23:16 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:16 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
- 23:15 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
- 23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1210 (T336886)', diff saved to https://phabricator.wikimedia.org/P48951 and previous config saved to /var/cache/conftool/dbconfig/20230606-231408-ladsgroup.json
- 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 23:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 23:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T336886)', diff saved to https://phabricator.wikimedia.org/P48950 and previous config saved to /var/cache/conftool/dbconfig/20230606-231347-ladsgroup.json
- 23:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P48949 and previous config saved to /var/cache/conftool/dbconfig/20230606-230347-ladsgroup.json
- 22:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P48948 and previous config saved to /var/cache/conftool/dbconfig/20230606-225841-ladsgroup.json
- 22:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 22:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
- 22:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
- 22:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 22:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P48947 and previous config saved to /var/cache/conftool/dbconfig/20230606-224841-ladsgroup.json
- 22:48 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
- 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P48946 and previous config saved to /var/cache/conftool/dbconfig/20230606-224334-ladsgroup.json
- 22:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T336886)', diff saved to https://phabricator.wikimedia.org/P48945 and previous config saved to /var/cache/conftool/dbconfig/20230606-223335-ladsgroup.json
- 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T336886)', diff saved to https://phabricator.wikimedia.org/P48944 and previous config saved to /var/cache/conftool/dbconfig/20230606-223011-ladsgroup.json
- 22:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 22:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48943 and previous config saved to /var/cache/conftool/dbconfig/20230606-222950-ladsgroup.json
- 22:29 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
- 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T336886)', diff saved to https://phabricator.wikimedia.org/P48942 and previous config saved to /var/cache/conftool/dbconfig/20230606-222828-ladsgroup.json
- 22:27 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp everywhere (T299954) (duration: 07m 33s)
- 22:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1200 (T336886)', diff saved to https://phabricator.wikimedia.org/P48941 and previous config saved to /var/cache/conftool/dbconfig/20230606-222534-ladsgroup.json
- 22:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 22:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 22:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T336886)', diff saved to https://phabricator.wikimedia.org/P48940 and previous config saved to /var/cache/conftool/dbconfig/20230606-222513-ladsgroup.json
- 22:21 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp everywhere (T299954) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 22:19 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp everywhere (T299954)
- 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P48939 and previous config saved to /var/cache/conftool/dbconfig/20230606-221444-ladsgroup.json
- 22:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P48938 and previous config saved to /var/cache/conftool/dbconfig/20230606-221007-ladsgroup.json
- 21:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P48937 and previous config saved to /var/cache/conftool/dbconfig/20230606-215938-ladsgroup.json
- 21:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P48936 and previous config saved to /var/cache/conftool/dbconfig/20230606-215501-ladsgroup.json
- 21:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48935 and previous config saved to /var/cache/conftool/dbconfig/20230606-214432-ladsgroup.json
- 21:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48934 and previous config saved to /var/cache/conftool/dbconfig/20230606-214109-ladsgroup.json
- 21:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T336886)', diff saved to https://phabricator.wikimedia.org/P48933 and previous config saved to /var/cache/conftool/dbconfig/20230606-214048-ladsgroup.json
- 21:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T336886)', diff saved to https://phabricator.wikimedia.org/P48932 and previous config saved to /var/cache/conftool/dbconfig/20230606-213954-ladsgroup.json
- 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1185 (T336886)', diff saved to https://phabricator.wikimedia.org/P48931 and previous config saved to /var/cache/conftool/dbconfig/20230606-213702-ladsgroup.json
- 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T336886)', diff saved to https://phabricator.wikimedia.org/P48930 and previous config saved to /var/cache/conftool/dbconfig/20230606-213641-ladsgroup.json
- 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P48929 and previous config saved to /var/cache/conftool/dbconfig/20230606-212542-ladsgroup.json
- 21:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P48928 and previous config saved to /var/cache/conftool/dbconfig/20230606-212135-ladsgroup.json
- 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P48927 and previous config saved to /var/cache/conftool/dbconfig/20230606-211036-ladsgroup.json
- 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P48926 and previous config saved to /var/cache/conftool/dbconfig/20230606-210629-ladsgroup.json
- 21:03 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
- 21:03 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1026.eqiad.wmnet with OS bullseye
- 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T336886)', diff saved to https://phabricator.wikimedia.org/P48925 and previous config saved to /var/cache/conftool/dbconfig/20230606-205530-ladsgroup.json
- 20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T336886)', diff saved to https://phabricator.wikimedia.org/P48924 and previous config saved to /var/cache/conftool/dbconfig/20230606-205206-ladsgroup.json
- 20:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 20:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 20:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T336886)', diff saved to https://phabricator.wikimedia.org/P48923 and previous config saved to /var/cache/conftool/dbconfig/20230606-205123-ladsgroup.json
- 20:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 20:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P48922 and previous config saved to /var/cache/conftool/dbconfig/20230606-205002-ladsgroup.json
- 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1183 (T336886)', diff saved to https://phabricator.wikimedia.org/P48921 and previous config saved to /var/cache/conftool/dbconfig/20230606-204527-ladsgroup.json
- 20:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
- 20:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
- 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T336886)', diff saved to https://phabricator.wikimedia.org/P48920 and previous config saved to /var/cache/conftool/dbconfig/20230606-204506-ladsgroup.json
- 20:41 urbanecm@deploy1002: Finished scap: Backport for PersonalizedPraiseLogger: Only include mentee_id if not null (T338078), PersonalizedPraiseLogger: Only include mentee_id if not null (T338078) (duration: 07m 23s)
- 20:35 urbanecm@deploy1002: urbanecm: Backport for PersonalizedPraiseLogger: Only include mentee_id if not null (T338078), PersonalizedPraiseLogger: Only include mentee_id if not null (T338078) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 20:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48919 and previous config saved to /var/cache/conftool/dbconfig/20230606-203456-ladsgroup.json
- 20:34 urbanecm@deploy1002: Started scap: Backport for PersonalizedPraiseLogger: Only include mentee_id if not null (T338078), PersonalizedPraiseLogger: Only include mentee_id if not null (T338078)
- 20:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P48917 and previous config saved to /var/cache/conftool/dbconfig/20230606-203000-ladsgroup.json
- 20:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48916 and previous config saved to /var/cache/conftool/dbconfig/20230606-201950-ladsgroup.json
- 20:16 mutante: miscweb1003, miscweb2003 - rm -rf /srv/org/wikimedia/sitemaps after removing httpd virtual host T338064
- 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P48915 and previous config saved to /var/cache/conftool/dbconfig/20230606-201454-ladsgroup.json
- 20:09 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
- 20:09 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1026.eqiad.wmnet with OS bullseye
- 20:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P48914 and previous config saved to /var/cache/conftool/dbconfig/20230606-200444-ladsgroup.json
- 19:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T336886)', diff saved to https://phabricator.wikimedia.org/P48913 and previous config saved to /var/cache/conftool/dbconfig/20230606-195948-ladsgroup.json
- 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T336886)', diff saved to https://phabricator.wikimedia.org/P48912 and previous config saved to /var/cache/conftool/dbconfig/20230606-195557-ladsgroup.json
- 19:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 19:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 19:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 19:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 19:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 19:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
- 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48911 and previous config saved to /var/cache/conftool/dbconfig/20230606-195320-ladsgroup.json
- 19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P48910 and previous config saved to /var/cache/conftool/dbconfig/20230606-193814-ladsgroup.json
- 19:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P48909 and previous config saved to /var/cache/conftool/dbconfig/20230606-192308-ladsgroup.json
- 19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48908 and previous config saved to /var/cache/conftool/dbconfig/20230606-190802-ladsgroup.json
- 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P48907 and previous config saved to /var/cache/conftool/dbconfig/20230606-190420-ladsgroup.json
- 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48906 and previous config saved to /var/cache/conftool/dbconfig/20230606-190402-ladsgroup.json
- 19:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 19:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
- 18:10 mutante: disabling https://sitemaps.wikimedia.org - T338064 T332101
- 18:10 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.12 refs T337526
- 18:01 sukhe: cumin 'A:cp-text' 'enable-puppet "CR 926611" && run-puppet-agent -q'
- 18:01 sukhe: re-enable puppet on A:cp-text and force puppet run: T338064
- 17:54 sukhe: enable puppet on cp4037 to test CR 926611
- 17:50 sukhe: disable puppet on A:cp-text to roll out CR 926611
- 17:39 sukhe: sudo cumin 'P:ntp' 'enable-puppet "testing CR 926598" && run-puppet-agent'
- 17:27 sukhe: sudo cumin 'P:ntp' 'disable-puppet "testing CR 926598"'
- 17:05 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 17:04 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 17:04 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 17:01 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 16:51 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 16:41 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 16:40 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 16:40 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 16:39 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 16:37 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 16:37 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 16:36 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 16:30 sukhe: low-traffic/codfw: set routing-options static route 10.2.1.0/24 next-hop 10.192.32.14
- 16:27 sukhe: restart pybal on lvs2013 to remove bgp-med override
- 16:23 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 16:12 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 16:12 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 16:06 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 16:03 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T336886)', diff saved to https://phabricator.wikimedia.org/P48904 and previous config saved to /var/cache/conftool/dbconfig/20230606-160151-ladsgroup.json
- 15:54 jbond@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
- 15:53 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 15:52 jbond@cumin1001: START - Cookbook sre.postgresql.postgres-init
- 15:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P48902 and previous config saved to /var/cache/conftool/dbconfig/20230606-154645-ladsgroup.json
- 15:46 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 15:46 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:40 cdanis@deploy1002: Finished scap: Backport for Revert "EventStreamConfig - development.network.probe- disable canary events and hadoop ingestion" (duration: 08m 13s)
- 15:38 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 15:37 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 15:35 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 15:35 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:34 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 15:34 cdanis@deploy1002: cdanis and otto: Backport for Revert "EventStreamConfig - development.network.probe- disable canary events and hadoop ingestion" synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 15:32 zabe: purge wikimaniawiki logos # T337044
- 15:32 cdanis@deploy1002: Started scap: Backport for Revert "EventStreamConfig - development.network.probe- disable canary events and hadoop ingestion"
- 15:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P48901 and previous config saved to /var/cache/conftool/dbconfig/20230606-153139-ladsgroup.json
- 15:30 zabe@deploy1002: Finished scap: Backport for Change project logo for Wikimania to Wikimania 2023 version (T337044) (duration: 08m 02s)
- 15:26 sukhe: homer "cr*-codfw*" commit "Gerrit: 927725 add new LVS host lvs2013" : T326767
- 15:24 zabe@deploy1002: robertsky and zabe: Backport for Change project logo for Wikimania to Wikimania 2023 version (T337044) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 15:22 zabe@deploy1002: Started scap: Backport for Change project logo for Wikimania to Wikimania 2023 version (T337044)
- 15:21 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs2013
- 15:21 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2013
- 15:20 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 15:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 15:19 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 15:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T336886)', diff saved to https://phabricator.wikimedia.org/P48900 and previous config saved to /var/cache/conftool/dbconfig/20230606-151633-ladsgroup.json
- 15:12 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_esams and A:cp
- 15:08 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1007.eqiad.wmnet
- 15:07 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 15:06 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 15:06 mforns@deploy1002: Finished deploy [airflow-dags/analytics@72d9b87]: (no justification provided) (duration: 00m 10s)
- 15:06 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 15:06 mforns@deploy1002: Started deploy [airflow-dags/analytics@72d9b87]: (no justification provided)
- 15:03 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 15:02 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 15:02 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T336886)', diff saved to https://phabricator.wikimedia.org/P48899 and previous config saved to /var/cache/conftool/dbconfig/20230606-150141-ladsgroup.json
- 15:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 15:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48898 and previous config saved to /var/cache/conftool/dbconfig/20230606-150120-ladsgroup.json
- 15:00 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host dumpsdata1007.eqiad.wmnet
- 14:57 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1026.eqiad.wmnet with OS bullseye
- 14:57 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
- 14:56 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 14:53 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 14:53 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 14:53 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 14:53 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 14:53 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 14:53 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 14:53 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:53 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change entries for moved links eqiad row e f switches - cmooney@cumin1001"
- 14:51 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change entries for moved links eqiad row e f switches - cmooney@cumin1001"
- 14:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2013.codfw.wmnet with OS bullseye
- 14:49 cmooney@cumin1001: START - Cookbook sre.dns.netbox
- 14:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P48897 and previous config saved to /var/cache/conftool/dbconfig/20230606-144614-ladsgroup.json
- 14:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
- 14:31 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
- 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P48896 and previous config saved to /var/cache/conftool/dbconfig/20230606-143107-ladsgroup.json
- 14:25 oblivian@deploy1002: Finished scap: Backport for Load and enable parsoid everywhere (T334980) (duration: 15m 00s)
- 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48895 and previous config saved to /var/cache/conftool/dbconfig/20230606-141601-ladsgroup.json
- 14:16 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2013.codfw.wmnet with OS bullseye
- 14:15 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 14:12 oblivian@deploy1002: oblivian: Backport for Load and enable parsoid everywhere (T334980) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 14:10 oblivian@deploy1002: Started scap: Backport for Load and enable parsoid everywhere (T334980)
- 14:08 eoghan@cumin1001: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 14:06 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1,3]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
- 14:06 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1,3]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
- 14:03 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1026.eqiad.wmnet with OS bullseye
- 14:03 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
- 14:01 oblivian@deploy1002: Finished scap: Backport for Enable parser cache warming jobs for parsoid on enwiki (T329366) (duration: 07m 57s)
- 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48894 and previous config saved to /var/cache/conftool/dbconfig/20230606-140051-ladsgroup.json
- 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48893 and previous config saved to /var/cache/conftool/dbconfig/20230606-140030-ladsgroup.json
- 13:59 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging AndyRussG out of all services on: 780 hosts
- 13:58 jmm@cumin2002: START - Cookbook sre.idm.logout Logging AndyRussG out of all services on: 780 hosts
- 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging AndyRussG out of all services on: 1259 hosts
- 13:57 jmm@cumin2002: START - Cookbook sre.idm.logout Logging AndyRussG out of all services on: 1259 hosts
- 13:55 oblivian@deploy1002: oblivian and daniel: Backport for Enable parser cache warming jobs for parsoid on enwiki (T329366) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 13:53 oblivian@deploy1002: Started scap: Backport for Enable parser cache warming jobs for parsoid on enwiki (T329366)
- 13:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
- 13:50 oblivian@deploy1002: Finished scap: Backport for Drop wmgMemoryLimitParsoid from IS.php (duration: 07m 21s)
- 13:49 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1023.eqiad.wmnet with OS bullseye
- 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P48891 and previous config saved to /var/cache/conftool/dbconfig/20230606-134524-ladsgroup.json
- 13:45 oblivian@deploy1002: oblivian: Backport for Drop wmgMemoryLimitParsoid from IS.php synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 13:43 oblivian@deploy1002: Started scap: Backport for Drop wmgMemoryLimitParsoid from IS.php
- 13:41 oblivian@deploy1002: Finished scap: Backport for Raise memory limit to match parsoid (T334980) (duration: 07m 53s)
- 13:41 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
- 13:41 elukey@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: sync
- 13:35 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1-2]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
- 13:35 oblivian@deploy1002: oblivian: Backport for Raise memory limit to match parsoid (T334980) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 13:34 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1-2]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
- 13:33 oblivian@deploy1002: Started scap: Backport for Raise memory limit to match parsoid (T334980)
- 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P48890 and previous config saved to /var/cache/conftool/dbconfig/20230606-133018-ladsgroup.json
- 13:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48889 and previous config saved to /var/cache/conftool/dbconfig/20230606-131512-ladsgroup.json
- 13:11 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 13:06 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: EventStreamConfig - Disable canary events and hadoop ingestion for development.network.probe - T332024 (duration: 07m 17s)
- 13:00 eoghan@cumin1001: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48888 and previous config saved to /var/cache/conftool/dbconfig/20230606-125944-ladsgroup.json
- 12:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 12:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T336886)', diff saved to https://phabricator.wikimedia.org/P48887 and previous config saved to /var/cache/conftool/dbconfig/20230606-125923-ladsgroup.json
- 12:56 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_esams and A:cp
- 12:55 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
- 12:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1023.eqiad.wmnet with OS bullseye
- 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P48886 and previous config saved to /var/cache/conftool/dbconfig/20230606-124417-ladsgroup.json
- 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P48885 and previous config saved to /var/cache/conftool/dbconfig/20230606-122911-ladsgroup.json
- 12:21 cgoubert@deploy1002: Finished scap: (no justification provided) (duration: 02m 10s)
- 12:19 cgoubert@deploy1002: Started scap: (no justification provided)
- 12:19 claime: redeploying 927218 to mw-on-k8s - T338121
- 12:15 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T336886)', diff saved to https://phabricator.wikimedia.org/P48884 and previous config saved to /var/cache/conftool/dbconfig/20230606-121405-ladsgroup.json
- 12:09 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 12:00 kamila@deploy1002: Finished scap: Backport for OAuthRateLimiter: Add rate limiting class for WME using LiftWing (T338121) (duration: 08m 54s)
- 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T336886)', diff saved to https://phabricator.wikimedia.org/P48881 and previous config saved to /var/cache/conftool/dbconfig/20230606-115911-ladsgroup.json
- 11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 11:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 11:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 11:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T336886)', diff saved to https://phabricator.wikimedia.org/P48880 and previous config saved to /var/cache/conftool/dbconfig/20230606-115833-ladsgroup.json
- 11:53 kamila@deploy1002: kamila and klausman: Backport for OAuthRateLimiter: Add rate limiting class for WME using LiftWing (T338121) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 11:51 kamila@deploy1002: Started scap: Backport for OAuthRateLimiter: Add rate limiting class for WME using LiftWing (T338121)
- 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P48879 and previous config saved to /var/cache/conftool/dbconfig/20230606-114327-ladsgroup.json
- 11:38 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 11:37 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 11:31 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 11:31 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P48878 and previous config saved to /var/cache/conftool/dbconfig/20230606-112819-ladsgroup.json
- 11:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T336886)', diff saved to https://phabricator.wikimedia.org/P48877 and previous config saved to /var/cache/conftool/dbconfig/20230606-111313-ladsgroup.json
- 11:03 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
- 10:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T336886)', diff saved to https://phabricator.wikimedia.org/P48876 and previous config saved to /var/cache/conftool/dbconfig/20230606-105756-ladsgroup.json
- 10:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 10:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 10:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T336886)', diff saved to https://phabricator.wikimedia.org/P48875 and previous config saved to /var/cache/conftool/dbconfig/20230606-105724-ladsgroup.json
- 10:53 urbanecm@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
- 10:53 urbanecm@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
- 10:52 urbanecm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 10:51 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp in group1 wikis (T299954) (duration: 07m 03s)
- 10:51 urbanecm@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 10:50 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 10:50 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 10:50 urbanecm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 10:50 urbanecm@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 10:46 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp in group1 wikis (T299954) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 10:44 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp in group1 wikis (T299954)
- 10:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P48874 and previous config saved to /var/cache/conftool/dbconfig/20230606-104218-ladsgroup.json
- 10:30 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
- 10:30 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
- 10:28 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 10:28 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P48873 and previous config saved to /var/cache/conftool/dbconfig/20230606-102712-ladsgroup.json
- 10:20 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
- 10:20 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
- 10:20 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 10:20 mwpresync@deploy1002: Pruned MediaWiki: 1.41.0-wmf.10 (duration: 02m 18s)
- 10:20 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 10:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 10:18 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 10:18 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 10:18 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 10:17 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.41.0-wmf.12 refs T337526 (duration: 56m 25s)
- 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T336886)', diff saved to https://phabricator.wikimedia.org/P48872 and previous config saved to /var/cache/conftool/dbconfig/20230606-101205-ladsgroup.json
- 10:07 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 10:07 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 10:02 urbanecm@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
- 10:01 urbanecm@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
- 10:00 urbanecm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 09:59 urbanecm@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 09:58 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 09:58 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 09:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T336886)', diff saved to https://phabricator.wikimedia.org/P48871 and previous config saved to /var/cache/conftool/dbconfig/20230606-095512-ladsgroup.json
- 09:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
- 09:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
- 09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T336886)', diff saved to https://phabricator.wikimedia.org/P48870 and previous config saved to /var/cache/conftool/dbconfig/20230606-095451-ladsgroup.json
- 09:41 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 09:41 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 09:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P48869 and previous config saved to /var/cache/conftool/dbconfig/20230606-093945-ladsgroup.json
- 09:34 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_esams and A:cp
- 09:31 fabfur@cumin1001: END (FAIL) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=1) rolling custom on A:cp-text_esams and A:cp
- 09:27 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_esams and A:cp
- 09:27 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 09:26 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 09:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P48867 and previous config saved to /var/cache/conftool/dbconfig/20230606-092439-ladsgroup.json
- 09:21 mwpresync@deploy1002: Started scap: testwikis wikis to 1.41.0-wmf.12 refs T337526
- 09:18 jynus: running systemctl start train-presync
- 09:16 vgutierrez: restarting acme-chief and nginx on acme-chief instances
- 09:11 claime: Building production images - T338014
- 09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T336886)', diff saved to https://phabricator.wikimedia.org/P48866 and previous config saved to /var/cache/conftool/dbconfig/20230606-090933-ladsgroup.json
- 08:59 urbanecm: deploy1002: run /usr/local/sbin/fix-staging-perms (T338205)
- 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2002.codfw.wmnet
- 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2002.codfw.wmnet
- 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T336886)', diff saved to https://phabricator.wikimedia.org/P48865 and previous config saved to /var/cache/conftool/dbconfig/20230606-085337-ladsgroup.json
- 08:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 08:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
- 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T336886)', diff saved to https://phabricator.wikimedia.org/P48864 and previous config saved to /var/cache/conftool/dbconfig/20230606-085317-ladsgroup.json
- 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1002.eqiad.wmnet
- 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1002.eqiad.wmnet
- 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P48863 and previous config saved to /var/cache/conftool/dbconfig/20230606-083810-ladsgroup.json
- 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P48861 and previous config saved to /var/cache/conftool/dbconfig/20230606-082304-ladsgroup.json
- 08:15 moritzm: installing openssl security updates on bullseye
- 08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T336886)', diff saved to https://phabricator.wikimedia.org/P48860 and previous config saved to /var/cache/conftool/dbconfig/20230606-080758-ladsgroup.json
- 07:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T336886)', diff saved to https://phabricator.wikimedia.org/P48859 and previous config saved to /var/cache/conftool/dbconfig/20230606-075210-ladsgroup.json
- 07:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
- 07:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
- 07:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T336886)', diff saved to https://phabricator.wikimedia.org/P48858 and previous config saved to /var/cache/conftool/dbconfig/20230606-075149-ladsgroup.json
- 07:47 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_esams and A:cp
- 07:42 dcausse@deploy1002: Finished scap: Backport for ttm: use new config option to separate readable and writable services (T322284) (duration: 15m 20s)
- 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P48857 and previous config saved to /var/cache/conftool/dbconfig/20230606-073643-ladsgroup.json
- 07:28 dcausse@deploy1002: dcausse: Backport for ttm: use new config option to separate readable and writable services (T322284) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 07:27 dcausse@deploy1002: Started scap: Backport for ttm: use new config option to separate readable and writable services (T322284)
- 07:22 kharlan@deploy1002: Finished scap: Backport for checkuser: Disable client hints feature by default (T337944) (duration: 08m 14s)
- 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P48856 and previous config saved to /var/cache/conftool/dbconfig/20230606-072137-ladsgroup.json
- 07:16 kharlan@deploy1002: kharlan: Backport for checkuser: Disable client hints feature by default (T337944) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 07:14 kharlan@deploy1002: Started scap: Backport for checkuser: Disable client hints feature by default (T337944)
- 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T336886)', diff saved to https://phabricator.wikimedia.org/P48855 and previous config saved to /var/cache/conftool/dbconfig/20230606-070631-ladsgroup.json
- 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T336886)', diff saved to https://phabricator.wikimedia.org/P48854 and previous config saved to /var/cache/conftool/dbconfig/20230606-065057-ladsgroup.json
- 06:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
- 06:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
- 06:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 06:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
- 06:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 06:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
- 06:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 06:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
- 06:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T336886)', diff saved to https://phabricator.wikimedia.org/P48853 and previous config saved to /var/cache/conftool/dbconfig/20230606-060807-ladsgroup.json
- 05:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P48852 and previous config saved to /var/cache/conftool/dbconfig/20230606-055301-ladsgroup.json
- 05:50 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'configure' for AS: 2518
- 05:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 2518
- 05:49 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 2518
- 05:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 2518
- 05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P48851 and previous config saved to /var/cache/conftool/dbconfig/20230606-053755-ladsgroup.json
- 05:34 Amir1: ladsgroup@clouddb1021:/srv/sqldata.s1$ sudo rm db1196* (T337961)
- 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T336886)', diff saved to https://phabricator.wikimedia.org/P48850 and previous config saved to /var/cache/conftool/dbconfig/20230606-052249-ladsgroup.json
- 05:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T336886)', diff saved to https://phabricator.wikimedia.org/P48849 and previous config saved to /var/cache/conftool/dbconfig/20230606-051938-ladsgroup.json
- 05:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 05:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 05:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T336886)', diff saved to https://phabricator.wikimedia.org/P48848 and previous config saved to /var/cache/conftool/dbconfig/20230606-051918-ladsgroup.json
- 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P48847 and previous config saved to /var/cache/conftool/dbconfig/20230606-050410-ladsgroup.json
- 04:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P48846 and previous config saved to /var/cache/conftool/dbconfig/20230606-044904-ladsgroup.json
- 04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T336886)', diff saved to https://phabricator.wikimedia.org/P48845 and previous config saved to /var/cache/conftool/dbconfig/20230606-043358-ladsgroup.json
- 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T336886)', diff saved to https://phabricator.wikimedia.org/P48844 and previous config saved to /var/cache/conftool/dbconfig/20230606-043047-ladsgroup.json
- 04:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 04:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T336886)', diff saved to https://phabricator.wikimedia.org/P48843 and previous config saved to /var/cache/conftool/dbconfig/20230606-043026-ladsgroup.json
- 04:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P48842 and previous config saved to /var/cache/conftool/dbconfig/20230606-041520-ladsgroup.json
- 04:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P48841 and previous config saved to /var/cache/conftool/dbconfig/20230606-040013-ladsgroup.json
- 03:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T336886)', diff saved to https://phabricator.wikimedia.org/P48840 and previous config saved to /var/cache/conftool/dbconfig/20230606-034506-ladsgroup.json
- 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T336886)', diff saved to https://phabricator.wikimedia.org/P48839 and previous config saved to /var/cache/conftool/dbconfig/20230606-034256-ladsgroup.json
- 03:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 03:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T336886)', diff saved to https://phabricator.wikimedia.org/P48838 and previous config saved to /var/cache/conftool/dbconfig/20230606-034235-ladsgroup.json
- 03:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device ssw1-a1-codfw.mgmt.codfw.wmnet
- 03:32 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 03:32 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
- 03:31 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
- 03:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P48837 and previous config saved to /var/cache/conftool/dbconfig/20230606-032729-ladsgroup.json
- 03:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P48836 and previous config saved to /var/cache/conftool/dbconfig/20230606-031223-ladsgroup.json
- 02:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T336886)', diff saved to https://phabricator.wikimedia.org/P48835 and previous config saved to /var/cache/conftool/dbconfig/20230606-025717-ladsgroup.json
- 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T336886)', diff saved to https://phabricator.wikimedia.org/P48834 and previous config saved to /var/cache/conftool/dbconfig/20230606-025507-ladsgroup.json
- 02:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 02:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 02:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48833 and previous config saved to /var/cache/conftool/dbconfig/20230606-021622-ladsgroup.json
- 02:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 02:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48832 and previous config saved to /var/cache/conftool/dbconfig/20230606-020616-ladsgroup.json
- 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P48831 and previous config saved to /var/cache/conftool/dbconfig/20230606-020116-ladsgroup.json
- 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P48830 and previous config saved to /var/cache/conftool/dbconfig/20230606-015110-ladsgroup.json
- 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P48829 and previous config saved to /var/cache/conftool/dbconfig/20230606-014610-ladsgroup.json
- 01:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P48828 and previous config saved to /var/cache/conftool/dbconfig/20230606-013604-ladsgroup.json
- 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48827 and previous config saved to /var/cache/conftool/dbconfig/20230606-013104-ladsgroup.json
- 01:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48826 and previous config saved to /var/cache/conftool/dbconfig/20230606-012058-ladsgroup.json
- 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48825 and previous config saved to /var/cache/conftool/dbconfig/20230606-010704-ladsgroup.json
- 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48824 and previous config saved to /var/cache/conftool/dbconfig/20230606-010643-ladsgroup.json
- 00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48823 and previous config saved to /var/cache/conftool/dbconfig/20230606-005357-ladsgroup.json
- 00:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 00:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T336886)', diff saved to https://phabricator.wikimedia.org/P48822 and previous config saved to /var/cache/conftool/dbconfig/20230606-005336-ladsgroup.json
- 00:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P48821 and previous config saved to /var/cache/conftool/dbconfig/20230606-005137-ladsgroup.json
- 00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P48820 and previous config saved to /var/cache/conftool/dbconfig/20230606-003830-ladsgroup.json
- 00:37 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P48819 and previous config saved to /var/cache/conftool/dbconfig/20230606-003631-ladsgroup.json
- 00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P48818 and previous config saved to /var/cache/conftool/dbconfig/20230606-002324-ladsgroup.json
- 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48817 and previous config saved to /var/cache/conftool/dbconfig/20230606-002125-ladsgroup.json
- 00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48816 and previous config saved to /var/cache/conftool/dbconfig/20230606-001914-ladsgroup.json
- 00:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 00:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 00:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 00:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T336886)', diff saved to https://phabricator.wikimedia.org/P48815 and previous config saved to /var/cache/conftool/dbconfig/20230606-001836-ladsgroup.json
- 00:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T336886)', diff saved to https://phabricator.wikimedia.org/P48814 and previous config saved to /var/cache/conftool/dbconfig/20230606-000818-ladsgroup.json
- 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P48813 and previous config saved to /var/cache/conftool/dbconfig/20230606-000330-ladsgroup.json
2023-06-05
- 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T336886)', diff saved to https://phabricator.wikimedia.org/P48812 and previous config saved to /var/cache/conftool/dbconfig/20230605-235346-ladsgroup.json
- 23:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 23:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
- 23:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 23:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T336886)', diff saved to https://phabricator.wikimedia.org/P48811 and previous config saved to /var/cache/conftool/dbconfig/20230605-235310-ladsgroup.json
- 23:49 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp in group0 wikis (T299954) (duration: 07m 02s)
- 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P48810 and previous config saved to /var/cache/conftool/dbconfig/20230605-234824-ladsgroup.json
- 23:43 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp in group0 wikis (T299954) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 23:42 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp in group0 wikis (T299954)
- 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P48809 and previous config saved to /var/cache/conftool/dbconfig/20230605-233804-ladsgroup.json
- 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T336886)', diff saved to https://phabricator.wikimedia.org/P48808 and previous config saved to /var/cache/conftool/dbconfig/20230605-233318-ladsgroup.json
- 23:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T336886)', diff saved to https://phabricator.wikimedia.org/P48807 and previous config saved to /var/cache/conftool/dbconfig/20230605-233107-ladsgroup.json
- 23:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1136.eqiad.wmnet with reason: Maintenance
- 23:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1136.eqiad.wmnet with reason: Maintenance
- 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48806 and previous config saved to /var/cache/conftool/dbconfig/20230605-233046-ladsgroup.json
- 23:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:25 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
- 23:24 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
- 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P48805 and previous config saved to /var/cache/conftool/dbconfig/20230605-232258-ladsgroup.json
- 23:22 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:22 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
- 23:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=93) for device ssw1-a1-codfw.mgmt.codfw.wmnet
- 23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P48804 and previous config saved to /var/cache/conftool/dbconfig/20230605-231540-ladsgroup.json
- 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002"
- 23:14 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002"
- 23:12 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 23:11 jforrester@deploy1002: Finished deploy [integration/docroot@6eefe56]: I5c1b92 for T334492 (duration: 00m 05s)
- 23:10 jforrester@deploy1002: Started deploy [integration/docroot@6eefe56]: I5c1b92 for T334492
- 23:09 jforrester@deploy1002: Finished deploy [integration/docroot@ab77611]: Idf6c7a (duration: 00m 08s)
- 23:09 jforrester@deploy1002: Started deploy [integration/docroot@ab77611]: Idf6c7a
- 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T336886)', diff saved to https://phabricator.wikimedia.org/P48803 and previous config saved to /var/cache/conftool/dbconfig/20230605-230752-ladsgroup.json
- 23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P48802 and previous config saved to /var/cache/conftool/dbconfig/20230605-230034-ladsgroup.json
- 22:57 mutante: contint2001 - sudo systemctl restart apache2
- 22:57 mutante: contint2001 - sudo apt-get remove --purge libapache2-mod-php7.3 php7.3-cli php7.3-common php7.3-json php7.3-opcache php7.3-readline
- 22:55 jforrester@deploy1002: Finished deploy [integration/docroot@8255d99]: I6c7575 for T337425 (duration: 00m 13s)
- 22:55 jforrester@deploy1002: Started deploy [integration/docroot@8255d99]: I6c7575 for T337425
- 22:53 mutante: contint2001 (prod main CI server) - upgrading PHP 7.3 to 7.4
- 22:49 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp in testwiki (T299954) (duration: 09m 13s)
- 22:46 mutante: contint2002, contint1002 - upgrading PHP from 7.3 to 7.4
- 22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48801 and previous config saved to /var/cache/conftool/dbconfig/20230605-224528-ladsgroup.json
- 22:41 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp in testwiki (T299954) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 22:40 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp in testwiki (T299954)
- 22:37 ladsgroup@deploy1002: Finished scap: Backport for moveToExternal: Actually convert encoding of cur_text (T337700) (duration: 09m 04s)
- 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T336886)', diff saved to https://phabricator.wikimedia.org/P48800 and previous config saved to /var/cache/conftool/dbconfig/20230605-223035-ladsgroup.json
- 22:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 22:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 22:29 ladsgroup@deploy1002: ladsgroup: Backport for moveToExternal: Actually convert encoding of cur_text (T337700) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 22:28 ladsgroup@deploy1002: Started scap: Backport for moveToExternal: Actually convert encoding of cur_text (T337700)
- 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48799 and previous config saved to /var/cache/conftool/dbconfig/20230605-222745-ladsgroup.json
- 22:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 22:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
- 22:24 ladsgroup@deploy1002: Finished scap: Backport for Revert "Remove legacy encoding option from dawiktionary" (duration: 07m 40s)
- 22:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T335845)', diff saved to https://phabricator.wikimedia.org/P48798 and previous config saved to /var/cache/conftool/dbconfig/20230605-222339-ladsgroup.json
- 22:18 ladsgroup@deploy1002: ladsgroup: Backport for Revert "Remove legacy encoding option from dawiktionary" synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 22:17 ladsgroup@deploy1002: Started scap: Backport for Revert "Remove legacy encoding option from dawiktionary"
- 22:13 ladsgroup@deploy1002: Finished scap: Backport for Help measure the impact of saneitizer jobs (T336698) (duration: 09m 48s)
- 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48797 and previous config saved to /var/cache/conftool/dbconfig/20230605-220833-ladsgroup.json
- 22:05 ladsgroup@deploy1002: ladsgroup: Backport for Help measure the impact of saneitizer jobs (T336698) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
- 22:03 ladsgroup@deploy1002: Started scap: Backport for Help measure the impact of saneitizer jobs (T336698)
- 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs1016.eqiad.wmnet
- 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
- 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
- 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48796 and previous config saved to /var/cache/conftool/dbconfig/20230605-215345-ladsgroup.json
- 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48795 and previous config saved to /var/cache/conftool/dbconfig/20230605-215326-ladsgroup.json
- 21:51 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
- 21:50 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs1016.eqiad.wmnet
- 21:42 urbanecm@deploy1002: Finished scap: Backport for NewImpact: Fix renderMode parsing for Special:Impact (T338085) (duration: 25m 38s)
- 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P48794 and previous config saved to /var/cache/conftool/dbconfig/20230605-213839-ladsgroup.json
- 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T335845)', diff saved to https://phabricator.wikimedia.org/P48793 and previous config saved to /var/cache/conftool/dbconfig/20230605-213819-ladsgroup.json
- 21:35 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs1015.eqiad.wmnet
- 21:35 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
- 21:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T335845)', diff saved to https://phabricator.wikimedia.org/P48792 and previous config saved to /var/cache/conftool/dbconfig/20230605-213202-ladsgroup.json
- 21:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 21:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 21:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 21:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 21:30 urbanecm@deploy1002: urbanecm: Backport for NewImpact: Fix renderMode parsing for Special:Impact (T338085) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 21:29 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 21:29 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 21:25 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
- 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P48791 and previous config saved to /var/cache/conftool/dbconfig/20230605-212333-ladsgroup.json
- 21:23 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs1015.eqiad.wmnet
- 21:18 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 21:17 urbanecm@deploy1002: Started scap: Backport for NewImpact: Fix renderMode parsing for Special:Impact (T338085)
- 21:16 urbanecm@deploy1002: Finished scap: Backport for Update interwiki cache (T338093) (duration: 24m 34s)
- 21:15 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48790 and previous config saved to /var/cache/conftool/dbconfig/20230605-210827-ladsgroup.json
- 21:05 urbanecm@deploy1002: urbanecm: Backport for Update interwiki cache (T338093) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 20:51 urbanecm@deploy1002: Started scap: Backport for Update interwiki cache (T338093)
- 20:48 cjming: end of UTC late backport window
- 20:47 urbanecm: [urbanecm@deploy1002 ~]$ sudo /usr/local/sbin/fix-staging-perms # verify T338180 fix
- away: payments-wiki upgraded from 2b4203df to f3b229c6
- 20:46 cjming@deploy1002: Finished scap: Backport for Revert "Revert "VisualEditorFeatureUse sampling rate to 1 everywhere"" (duration: 09m 57s)
- 20:38 cjming@deploy1002: cjming: Backport for Revert "Revert "VisualEditorFeatureUse sampling rate to 1 everywhere"" synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
- 20:36 cjming@deploy1002: Started scap: Backport for Revert "Revert "VisualEditorFeatureUse sampling rate to 1 everywhere""
- 20:35 cjming@deploy1002: Finished scap: Backport for Add initial stream configs for Android article events using Metrics Platform Java client library (T330355) (duration: 24m 57s)
- 20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48789 and previous config saved to /var/cache/conftool/dbconfig/20230605-202916-ladsgroup.json
- 20:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
- 20:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
- 20:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T336886)', diff saved to https://phabricator.wikimedia.org/P48788 and previous config saved to /var/cache/conftool/dbconfig/20230605-202855-ladsgroup.json
- 20:23 cjming@deploy1002: cjming: Backport for Add initial stream configs for Android article events using Metrics Platform Java client library (T330355) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P48787 and previous config saved to /var/cache/conftool/dbconfig/20230605-201349-ladsgroup.json
- 20:10 cjming@deploy1002: Started scap: Backport for Add initial stream configs for Android article events using Metrics Platform Java client library (T330355)
- 20:09 urbanecm: [urbanecm@deploy1002 ~]$ sudo /usr/local/sbin/fix-staging-perms # attempt to fix permission errors when doing a backport
- 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P48786 and previous config saved to /var/cache/conftool/dbconfig/20230605-195842-ladsgroup.json
- 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T336886)', diff saved to https://phabricator.wikimedia.org/P48785 and previous config saved to /var/cache/conftool/dbconfig/20230605-194336-ladsgroup.json
- 19:32 brett: Maglev LVS scheduler rollout in eqiad finished (puppet re-enabled) - T263797
- 19:12 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs2011.codfw.wmnet
- 19:12 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
- 19:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 19:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
- 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T336886)', diff saved to https://phabricator.wikimedia.org/P48784 and previous config saved to /var/cache/conftool/dbconfig/20230605-190702-ladsgroup.json
- 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T336886)', diff saved to https://phabricator.wikimedia.org/P48783 and previous config saved to /var/cache/conftool/dbconfig/20230605-190528-ladsgroup.json
- 19:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
- 19:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
- 19:03 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
- 18:58 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs2011.codfw.wmnet
- 18:56 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
- 18:52 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: revert - remove unneeded wgEventBusStreamNamesMap override setting (take 2) - T336817 (duration: 11m 54s)
- 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P48782 and previous config saved to /var/cache/conftool/dbconfig/20230605-185156-ladsgroup.json
- 18:48 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
- 18:48 inflatador: bking@cumin1001 depooling wdqs2011 for fw update T331297
- 18:48 inflatador: bking@cumin1001 repooling wdqs2010 T331297
- 18:45 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
- 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P48781 and previous config saved to /var/cache/conftool/dbconfig/20230605-183650-ladsgroup.json
- 18:35 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
- 18:32 inflatador: bking@cumin1001 depooling wdqs2010 for fw update T331297
- 18:30 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: revert - Remove unused page_change rc streams - T336817 (duration: 11m 23s)
- 18:29 sukhe: homer "cr*-eqiad*" commit "Gerrit: 927246 remove old gerrit service IP"
- 18:28 brett: Maglev LVS scheduler rollout in eqiad (puppet disabled) - T263797
- 18:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T336886)', diff saved to https://phabricator.wikimedia.org/P48780 and previous config saved to /var/cache/conftool/dbconfig/20230605-182144-ladsgroup.json
- 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1224 (T336886)', diff saved to https://phabricator.wikimedia.org/P48779 and previous config saved to /var/cache/conftool/dbconfig/20230605-181935-ladsgroup.json
- 18:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 18:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1224.eqiad.wmnet with reason: Maintenance
- 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T336886)', diff saved to https://phabricator.wikimedia.org/P48778 and previous config saved to /var/cache/conftool/dbconfig/20230605-181915-ladsgroup.json
- 18:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 18:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T336886)', diff saved to https://phabricator.wikimedia.org/P48777 and previous config saved to /var/cache/conftool/dbconfig/20230605-181219-ladsgroup.json
- 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P48776 and previous config saved to /var/cache/conftool/dbconfig/20230605-180408-ladsgroup.json
- 17:58 btullis@puppetmaster1001: conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
- 17:58 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
- 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P48775 and previous config saved to /var/cache/conftool/dbconfig/20230605-175712-ladsgroup.json
- 17:50 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: no-op: Remove unused page_change rc streams - T336817 (duration: 20m 11s)
- 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P48774 and previous config saved to /var/cache/conftool/dbconfig/20230605-174902-ladsgroup.json
- 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P48773 and previous config saved to /var/cache/conftool/dbconfig/20230605-174206-ladsgroup.json
- 17:38 cdanis@deploy1002: Finished scap: Backport for Enable user network probe events (T332024) (duration: 10m 02s)
- 17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T336886)', diff saved to https://phabricator.wikimedia.org/P48772 and previous config saved to /var/cache/conftool/dbconfig/20230605-173356-ladsgroup.json
- 17:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1213:3316 (T336886)', diff saved to https://phabricator.wikimedia.org/P48771 and previous config saved to /var/cache/conftool/dbconfig/20230605-173002-ladsgroup.json
- 17:30 cdanis@deploy1002: cdanis: Backport for Enable user network probe events (T332024) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
- 17:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 17:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
- 17:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T336886)', diff saved to https://phabricator.wikimedia.org/P48770 and previous config saved to /var/cache/conftool/dbconfig/20230605-172942-ladsgroup.json
- 17:28 cdanis@deploy1002: Started scap: Backport for Enable user network probe events (T332024)
- 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T336886)', diff saved to https://phabricator.wikimedia.org/P48769 and previous config saved to /var/cache/conftool/dbconfig/20230605-172700-ladsgroup.json
- 17:26 cdanis@deploy1002: Backport cancelled.
- 17:26 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: Remove unneeded wgEventBusStreamNamesMap override setting (take 2) - T336817 (duration: 09m 25s)
- 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1223 (T336886)', diff saved to https://phabricator.wikimedia.org/P48768 and previous config saved to /var/cache/conftool/dbconfig/20230605-172124-ladsgroup.json
- 17:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 17:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T336886)', diff saved to https://phabricator.wikimedia.org/P48767 and previous config saved to /var/cache/conftool/dbconfig/20230605-172103-ladsgroup.json
- 17:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P48766 and previous config saved to /var/cache/conftool/dbconfig/20230605-171436-ladsgroup.json
- 17:12 cdanis@deploy1002: Backport cancelled.
- 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P48765 and previous config saved to /var/cache/conftool/dbconfig/20230605-170557-ladsgroup.json
- 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P48764 and previous config saved to /var/cache/conftool/dbconfig/20230605-165929-ladsgroup.json
- 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P48763 and previous config saved to /var/cache/conftool/dbconfig/20230605-165051-ladsgroup.json
- 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T336886)', diff saved to https://phabricator.wikimedia.org/P48762 and previous config saved to /var/cache/conftool/dbconfig/20230605-164423-ladsgroup.json
- 16:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2013.codfw.wmnet with OS bullseye
- 16:37 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1201 (T336886)', diff saved to https://phabricator.wikimedia.org/P48761 and previous config saved to /var/cache/conftool/dbconfig/20230605-163714-ladsgroup.json
- 16:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 16:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T336886)', diff saved to https://phabricator.wikimedia.org/P48760 and previous config saved to /var/cache/conftool/dbconfig/20230605-163653-ladsgroup.json
- 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T336886)', diff saved to https://phabricator.wikimedia.org/P48759 and previous config saved to /var/cache/conftool/dbconfig/20230605-163545-ladsgroup.json
- 16:35 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
- 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1212 (T336886)', diff saved to https://phabricator.wikimedia.org/P48758 and previous config saved to /var/cache/conftool/dbconfig/20230605-162707-ladsgroup.json
- 16:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T336886)', diff saved to https://phabricator.wikimedia.org/P48757 and previous config saved to /var/cache/conftool/dbconfig/20230605-162629-ladsgroup.json
- 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P48756 and previous config saved to /var/cache/conftool/dbconfig/20230605-162147-ladsgroup.json
- 16:21 btullis@puppetmaster1001: conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
- 16:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
- 16:19 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
- 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
- 16:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P48755 and previous config saved to /var/cache/conftool/dbconfig/20230605-161123-ladsgroup.json
- 16:08 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P48754 and previous config saved to /var/cache/conftool/dbconfig/20230605-160640-ladsgroup.json
- 16:06 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 16:06 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 16:06 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 16:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 16:05 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 16:05 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:59 bblack: mw1419: manually executing a php restart to test new safe-service-restart
- 15:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P48753 and previous config saved to /var/cache/conftool/dbconfig/20230605-155617-ladsgroup.json
- 15:55 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2013.codfw.wmnet with OS bullseye
- 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T336886)', diff saved to https://phabricator.wikimedia.org/P48752 and previous config saved to /var/cache/conftool/dbconfig/20230605-155134-ladsgroup.json
- 15:51 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs2013']
- 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1187 (T336886)', diff saved to https://phabricator.wikimedia.org/P48751 and previous config saved to /var/cache/conftool/dbconfig/20230605-154926-ladsgroup.json
- 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
- 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T336886)', diff saved to https://phabricator.wikimedia.org/P48750 and previous config saved to /var/cache/conftool/dbconfig/20230605-154905-ladsgroup.json
- 15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T336886)', diff saved to https://phabricator.wikimedia.org/P48749 and previous config saved to /var/cache/conftool/dbconfig/20230605-154110-ladsgroup.json
- 15:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2013']
- 15:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2013']
- 15:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2013']
- 15:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T336886)', diff saved to https://phabricator.wikimedia.org/P48748 and previous config saved to /var/cache/conftool/dbconfig/20230605-153542-ladsgroup.json
- 15:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 15:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 15:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T336886)', diff saved to https://phabricator.wikimedia.org/P48747 and previous config saved to /var/cache/conftool/dbconfig/20230605-153521-ladsgroup.json
- 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P48746 and previous config saved to /var/cache/conftool/dbconfig/20230605-153359-ladsgroup.json
- 15:33 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
- 15:33 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
- 15:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2013.mgmt.codfw.wmnet with reboot policy FORCED
- 15:27 Amir1: on s3 master: update `text` set old_text = 'O:18:"historyblobcurstub":1:{s:6:"mCurId";i:5532;}', old_flags = 'object' where old_id= 14484; (T337700)
- 15:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P48745 and previous config saved to /var/cache/conftool/dbconfig/20230605-152015-ladsgroup.json
- 15:19 moritzm: installing debian-archive-keyring updates on bullseye hosts
- 15:19 mforns@deploy1002: Finished deploy [airflow-dags/analytics@674ec0a]: (no justification provided) (duration: 00m 17s)
- 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P48744 and previous config saved to /var/cache/conftool/dbconfig/20230605-151853-ladsgroup.json
- 15:18 mforns@deploy1002: Started deploy [airflow-dags/analytics@674ec0a]: (no justification provided)
- 15:18 sukhe@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T326767 (duration: 102m 46s)
- 15:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host lvs2013.mgmt.codfw.wmnet with reboot policy FORCED
- 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Setup DNS for lvs2013 - pt1979@cumin2002"
- 15:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Setup DNS for lvs2013 - pt1979@cumin2002"
- 15:05 moritzm: installing avahi security updates
- 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P48742 and previous config saved to /var/cache/conftool/dbconfig/20230605-150509-ladsgroup.json
- 15:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T336886)', diff saved to https://phabricator.wikimedia.org/P48741 and previous config saved to /var/cache/conftool/dbconfig/20230605-150347-ladsgroup.json
- 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T336886)', diff saved to https://phabricator.wikimedia.org/P48740 and previous config saved to /var/cache/conftool/dbconfig/20230605-150138-ladsgroup.json
- 15:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 15:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
- 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T336886)', diff saved to https://phabricator.wikimedia.org/P48739 and previous config saved to /var/cache/conftool/dbconfig/20230605-150117-ladsgroup.json
- 14:55 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:55 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:52 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:52 otto@deploy1002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:50 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:50 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T336886)', diff saved to https://phabricator.wikimedia.org/P48738 and previous config saved to /var/cache/conftool/dbconfig/20230605-145003-ladsgroup.json
- 14:48 sukhe: homer "cr*-codfw*" commit "Gerrit: 927208 remove decommissioned host lvs2009": T335777
- 14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs2009.codfw.wmnet
- 14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
- 14:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P48737 and previous config saved to /var/cache/conftool/dbconfig/20230605-144611-ladsgroup.json
- 14:45 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
- 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T336886)', diff saved to https://phabricator.wikimedia.org/P48736 and previous config saved to /var/cache/conftool/dbconfig/20230605-144438-ladsgroup.json
- 14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T336886)', diff saved to https://phabricator.wikimedia.org/P48735 and previous config saved to /var/cache/conftool/dbconfig/20230605-144417-ladsgroup.json
- 14:42 sukhe@cumin2002: START - Cookbook sre.dns.netbox
- 14:32 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs2009.codfw.wmnet
- 14:31 ejegg: payments-wiki upgraded from c2f9f8b5 to 2b4203df
- 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P48734 and previous config saved to /var/cache/conftool/dbconfig/20230605-143105-ladsgroup.json
- 14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P48733 and previous config saved to /var/cache/conftool/dbconfig/20230605-142911-ladsgroup.json
- 14:28 sukhe: codfw low-traffic LVS: set routing-options static route 10.2.1.0/24 next-hop 10.192.49.7
- 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T336886)', diff saved to https://phabricator.wikimedia.org/P48732 and previous config saved to /var/cache/conftool/dbconfig/20230605-141559-ladsgroup.json
- 14:15 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:15 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T336886)', diff saved to https://phabricator.wikimedia.org/P48731 and previous config saved to /var/cache/conftool/dbconfig/20230605-141451-ladsgroup.json
- 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
- 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T336886)', diff saved to https://phabricator.wikimedia.org/P48730 and previous config saved to /var/cache/conftool/dbconfig/20230605-141430-ladsgroup.json
- 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P48729 and previous config saved to /var/cache/conftool/dbconfig/20230605-141405-ladsgroup.json
- 14:08 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 14:08 otto@deploy1002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P48728 and previous config saved to /var/cache/conftool/dbconfig/20230605-135924-ladsgroup.json
- 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T336886)', diff saved to https://phabricator.wikimedia.org/P48727 and previous config saved to /var/cache/conftool/dbconfig/20230605-135859-ladsgroup.json
- 13:57 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 13:56 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T336886)', diff saved to https://phabricator.wikimedia.org/P48726 and previous config saved to /var/cache/conftool/dbconfig/20230605-135332-ladsgroup.json
- 13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T336886)', diff saved to https://phabricator.wikimedia.org/P48725 and previous config saved to /var/cache/conftool/dbconfig/20230605-135311-ladsgroup.json
- 13:46 moritzm: installing python-ipaddress security updates
- 13:45 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
- 13:44 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
- 13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P48724 and previous config saved to /var/cache/conftool/dbconfig/20230605-134418-ladsgroup.json
- 13:44 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1002.eqiad.wmnet with reason: Host under maintenance
- 13:43 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1002.eqiad.wmnet with reason: Host under maintenance
- 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T335845)', diff saved to https://phabricator.wikimedia.org/P48723 and previous config saved to /var/cache/conftool/dbconfig/20230605-134313-ladsgroup.json
- 13:41 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 13:41 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P48722 and previous config saved to /var/cache/conftool/dbconfig/20230605-133805-ladsgroup.json
- 13:36 sukhe@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T326767
- 13:35 sukhe@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T322937 (duration: 01m 06s)
- 13:35 sukhe@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T322937
- 13:35 bblack@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: temporary lock for LVS restarts in core DCs (duration: 05m 54s)
- 13:32 bblack: lvs1* (eqiad) - restart pybal for T334703 IPs
- 13:29 bblack: lvs2* (codfw) - restart pybal for T334703 IPs
- 13:29 bblack@deploy1002: Locking from deployment [ALL REPOSITORIES]: temporary lock for LVS restarts in core DCs
- 13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T336886)', diff saved to https://phabricator.wikimedia.org/P48721 and previous config saved to /var/cache/conftool/dbconfig/20230605-132911-ladsgroup.json
- 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P48720 and previous config saved to /var/cache/conftool/dbconfig/20230605-132807-ladsgroup.json
- 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T336886)', diff saved to https://phabricator.wikimedia.org/P48719 and previous config saved to /var/cache/conftool/dbconfig/20230605-132703-ladsgroup.json
- 13:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 13:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
- 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T336886)', diff saved to https://phabricator.wikimedia.org/P48718 and previous config saved to /var/cache/conftool/dbconfig/20230605-132642-ladsgroup.json
- 13:25 hashar: Restarted Zuul due to stalled SSH connection # T309376
- 13:25 bblack: lvs3* (esams) - restart pybal for T334703 IPs
- 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P48717 and previous config saved to /var/cache/conftool/dbconfig/20230605-132259-ladsgroup.json
- 13:19 bblack: lvs5* (eqsin) - restart pybal for T334703 IPs
- 13:17 Lucas_WMDE: UTC afternoon backport+config window done
- 13:15 bblack: lvs6* (drmrs) - restart pybal for T334703 IPs
- 13:14 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Make outreachwiki a multilingual Wikidata client (T171140) (duration: 10m 06s)
- 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P48716 and previous config saved to /var/cache/conftool/dbconfig/20230605-131301-ladsgroup.json
- 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P48715 and previous config saved to /var/cache/conftool/dbconfig/20230605-131136-ladsgroup.json
- 13:09 bblack: lvs4* (ulsfo) - restart pybal for T334703 IPs
- 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T336886)', diff saved to https://phabricator.wikimedia.org/P48714 and previous config saved to /var/cache/conftool/dbconfig/20230605-130753-ladsgroup.json
- 13:05 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde: Backport for Make outreachwiki a multilingual Wikidata client (T171140) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 13:04 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Make outreachwiki a multilingual Wikidata client (T171140)
- 13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T336886)', diff saved to https://phabricator.wikimedia.org/P48713 and previous config saved to /var/cache/conftool/dbconfig/20230605-130228-ladsgroup.json
- 13:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 13:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T335845)', diff saved to https://phabricator.wikimedia.org/P48712 and previous config saved to /var/cache/conftool/dbconfig/20230605-125754-ladsgroup.json
- 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
- 12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P48711 and previous config saved to /var/cache/conftool/dbconfig/20230605-125630-ladsgroup.json
- 12:52 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
- 12:51 Amir1: killed prioritizeFilesWithTemplate.php, stopping depool maint.
- 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
- 12:44 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
- 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T335845)', diff saved to https://phabricator.wikimedia.org/P48710 and previous config saved to /var/cache/conftool/dbconfig/20230605-124444-ladsgroup.json
- 12:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 12:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
- 12:43 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe
- 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T336886)', diff saved to https://phabricator.wikimedia.org/P48709 and previous config saved to /var/cache/conftool/dbconfig/20230605-124124-ladsgroup.json
- 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T336886)', diff saved to https://phabricator.wikimedia.org/P48708 and previous config saved to /var/cache/conftool/dbconfig/20230605-123915-ladsgroup.json
- 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:39 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe
- 12:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 12:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 12:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 12:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 12:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 12:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 12:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
- 12:17 jynus: creating a copy of db1157 binlogs on dbprov1004 T338128
- 12:15 bblack: lvs*: disabling puppet to roll out new LVS IPs in https://gerrit.wikimedia.org/r/c/operations/puppet/+/924593 - T334703
- 12:15 jbond@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=puppetboard-next
- 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.elasticsearch.restart-nginx (exit_code=0) rolling restart_daemons on A:relforge
- 11:45 jmm@cumin2002: START - Cookbook sre.elasticsearch.restart-nginx rolling restart_daemons on A:relforge
- 11:39 jbond@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=puppetboard-next
- 11:21 moritzm: restarting Exim on MXes to pick up OpenSSL updates
- 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling restart_daemons on A:ncredir
- 11:13 moritzm: bounced ferm on ml-serve2006 (race caused by firewall profile change)
- 11:08 jmm@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling restart_daemons on A:ncredir
- 10:31 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling restart_daemons on A:ldap-replicas
- 10:29 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling restart_daemons on A:ldap-replicas
- 10:14 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:14 aborrero@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirts - aborrero@cumin1001"
- 10:13 aborrero@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirts - aborrero@cumin1001"
- 10:11 moritzm: installing openssl security updates on Bullseye
- 10:08 aborrero@cumin1001: START - Cookbook sre.dns.netbox
- 10:06 godog: truncate xff.log and JobExecutor.log on mwlog1002 to reclaim space - T338127
- 09:41 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
- 09:39 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
- 09:39 claime: roll-restart thumbor in eqiad - T337649
- 09:39 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
- 09:38 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=thumbor.*
- 09:38 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
- 09:37 claime: roll-restart thumbor in codfw - T337649
- 08:40 claime: power-cycling restbase1027 - T338122
- 07:54 moritzm: installing containerd security updates
- 07:38 kartik@deploy1002: Finished scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337669) (duration: 09m 58s)
- 07:30 kartik@deploy1002: kartik: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337669) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 07:28 kartik@deploy1002: Started scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337669)
- 07:25 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 07:23 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 07:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 07:23 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 07:21 taavi@deploy1002: Finished scap: Backport for [SearchVue] Enable on Norwegian, Hungarian, Catalan, Dutch, and Ukrainian (T336870) (duration: 18m 27s)
- 07:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1001.eqiad.wmnet
- 07:12 taavi@deploy1002: mlitn and taavi: Backport for [SearchVue] Enable on Norwegian, Hungarian, Catalan, Dutch, and Ukrainian (T336870) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 07:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1001.eqiad.wmnet
- 07:02 taavi@deploy1002: Started scap: Backport for [SearchVue] Enable on Norwegian, Hungarian, Catalan, Dutch, and Ukrainian (T336870)
- 06:20 _joe_: killing a pod with consistently high haproxy queue for thumbor in codfw
- 06:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 60427
- 06:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 60427
2023-06-03
- 13:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade
- 13:41 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade
- 13:28 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
- 13:28 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet
2023-06-02
- 20:16 apergos: rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'
- 18:42 bblack: dns*: puppets are all re-enabled, ntp restarts are done, etc
- 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
- 17:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
- 17:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
- 17:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
- 17:27 bblack: dns*: disabling puppet to control rollout of NTP config fixups
- 16:03 bblack: dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed
- 15:35 sukhe: restart ntp.service on dns1002
- 13:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 13:26 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 13:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 13:25 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 13:25 ottomata: deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479
- 13:24 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 13:24 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 13:24 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 13:24 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 12:03 moritzm: installing at-spi2-core bugfix updates from Bullseye point release
- 09:35 moritzm: installing texlive-security updates on buster
- 09:18 akosiaris: update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) T337836. Reload systemd for unit changes to take effect
- 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet
- 08:52 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet
- 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet
- 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet
- 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet
- 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet
- 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet
- 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet
- 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts:
- 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 0 hosts:
- 08:42 moritzm: installing traceroute bugfix updates from Bullseye point release
- 07:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org
- 07:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org
- 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org
- 07:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org
- 07:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
- 07:22 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 07:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
- 01:53 ejegg: fundraising python tools upgraded from 759d4c89 to 2ca83336
- 01:22 cstone: civicrm upgraded from 3819d6d1 to bcc8fccc
2023-06-01
- 21:06 samtar@deploy1002: Finished scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) (duration: 08m 30s)
- 20:59 samtar@deploy1002: esanders and samtar: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 20:57 samtar@deploy1002: Started scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955)
- 20:54 samtar@deploy1002: Finished scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) (duration: 10m 29s)
- 20:45 samtar@deploy1002: samtar and ksarabia: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 20:44 samtar@deploy1002: Started scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955)
- 20:21 samtar@deploy1002: Finished scap: Backport for Deploy Research Incentive survey on enwiki (T336092) (duration: 07m 56s)
- 20:15 samtar@deploy1002: dani and samtar: Backport for Deploy Research Incentive survey on enwiki (T336092) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
- 20:13 samtar@deploy1002: Started scap: Backport for Deploy Research Incentive survey on enwiki (T336092)
- 20:12 samtar@deploy1002: Finished scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) (duration: 08m 20s)
- 20:05 samtar@deploy1002: samtar and dreamyjazz: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 20:04 samtar@deploy1002: Started scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726)
- 19:51 ejegg: fundraising python tools upgraded from 72570bdd to 759d4c89
- 19:12 mforns@deploy1002: Finished deploy [airflow-dags/analytics@21e7354]: (no justification provided) (duration: 02m 42s)
- 19:11 mforns@deploy1002: Started deploy [airflow-dags/analytics@21e7354]: (no justification provided)
- 19:11 bblack@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work (duration: 03m 27s)
- 19:09 bblack: lvs1* (eqiad): upgrade pybal to 1.15.13 - T334703
- 19:08 bblack@deploy1002: Locking from deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work
- 18:45 bblack: lvs6* (drmrs): upgrade pybal to 1.15.13 - T334703
- 18:33 bblack: lvs3* (esams): upgrade pybal to 1.15.13 - T334703
- 18:32 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.11 refs T337525
- 17:50 mforns@deploy1002: Finished deploy [airflow-dags/analytics@03ca1c1]: (no justification provided) (duration: 00m 10s)
- 17:50 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_drmrs and A:cp
- 17:50 mforns@deploy1002: Started deploy [airflow-dags/analytics@03ca1c1]: (no justification provided)
- 17:49 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
- 17:48 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
- 17:48 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_drmrs and A:cp
- 17:47 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
- 17:47 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
- 17:45 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 17:45 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 17:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 17:05 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:55 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: revert: Remove unneeded wgEventBusStreamNamesMap override setting. Recent EventBus changes are not deployed yet? - T336817 (duration: 07m 24s)
- 16:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:53 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 16:53 aborrero@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
- 16:52 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
- 16:44 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: Remove unneeded wgEventBusStreamNamesMap override setting - T336817 (duration: 08m 18s)
- 16:42 bblack: lvs2* (codfw): upgrade pybal to 1.15.13 - T334703
- 16:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:40 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
- 16:35 bblack: lvs5* (eqsin): upgrade pybal to 1.15.13 - T334703
- 16:32 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703 (round 2!)
- 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1001.eqiad.wmnet with OS bullseye
- 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 16:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 16:10 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
- 16:07 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
- 16:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
- 16:06 mutante: gerrit - set repo wikimedia/annualreport to readonly (from active) - T337041
- 16:04 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
- 16:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
- 16:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
- 15:59 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
- 15:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
- 15:45 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:44 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:33 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:33 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 15:21 fabfur: running run-puppet-agent on cp6010.drmrs.wmnet to fix icinga check from cookbook
- 15:15 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703
- 15:11 sukhe: reprepro -C component/pybal bullseye-wikimedia pybal_1.15.13_source.changes
- 15:00 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1002.eqiad.wmnet with OS bullseye
- 14:59 moritzm: installing python-sqlparse security updates
- 14:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 14:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 14:55 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
- 14:53 moritzm: installing jackson-databind security updates
- 14:49 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 14:45 fabfur: running run-puppet-agent on cp6009.drmrs.wmnet to fix icinga check from cookbook
- 14:44 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
- 14:41 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
- 14:40 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_drmrs and A:cp
- 14:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 14:39 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 14:36 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_drmrs and A:cp
- 14:34 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 14:29 moritzm: installing imagemagick security updates on buster
- 14:16 herron@cumin1001: START - Cookbook sre.hosts.reimage for host mwlog1002.eqiad.wmnet with OS bullseye
- 14:14 fabfur: Disabled puppet on A:cp-drmrs for T323557
- 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3c9cc85]: (no justification provided) (duration: 00m 11s)
- 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@3c9cc85]: (no justification provided)
- 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48700 and previous config saved to /var/cache/conftool/dbconfig/20230601-141317-ladsgroup.json
- 14:11 claime: Removing obsolete mediawiki-services-function-evaluator from registry - T337505
- 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48699 and previous config saved to /var/cache/conftool/dbconfig/20230601-135811-ladsgroup.json
- 13:52 moritzm: installing sysstat security updates
- 13:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 13:51 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 13:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 13:50 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 13:49 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 13:49 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48698 and previous config saved to /var/cache/conftool/dbconfig/20230601-134304-ladsgroup.json
- 13:29 moritzm: installing openssl security updates on bullseye
- 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48697 and previous config saved to /var/cache/conftool/dbconfig/20230601-132758-ladsgroup.json
- 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48695 and previous config saved to /var/cache/conftool/dbconfig/20230601-132319-ladsgroup.json
- 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
- 13:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 13:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20230601-132238-ladsgroup.json
- 13:21 claime: Removing obsolete mediawiki-services-function-orchestrator from registry - T337505
- 13:13 urbanecm@deploy1002: Finished scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) (duration: 11m 08s)
- 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48694 and previous config saved to /var/cache/conftool/dbconfig/20230601-130732-ladsgroup.json
- 13:04 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 13:04 urbanecm@deploy1002: urbanecm and daimona: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 13:03 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
- 13:02 urbanecm@deploy1002: Started scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364)
- 12:58 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 12:57 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 12:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 12:52 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48693 and previous config saved to /var/cache/conftool/dbconfig/20230601-125226-ladsgroup.json
- 12:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 12:49 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48692 and previous config saved to /var/cache/conftool/dbconfig/20230601-123720-ladsgroup.json
- 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48691 and previous config saved to /var/cache/conftool/dbconfig/20230601-123236-ladsgroup.json
- 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 12:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 12:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48690 and previous config saved to /var/cache/conftool/dbconfig/20230601-122900-ladsgroup.json
- 12:17 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 12:17 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 12:16 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 12:16 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48689 and previous config saved to /var/cache/conftool/dbconfig/20230601-121354-ladsgroup.json
- 12:03 Daimona: Creating ce_tracking_tools table for the CampaignEvents extension on testwiki, test2wiki, officewiki, and metawiki # T336365
- 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48688 and previous config saved to /var/cache/conftool/dbconfig/20230601-115848-ladsgroup.json
- 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48687 and previous config saved to /var/cache/conftool/dbconfig/20230601-114342-ladsgroup.json
- 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48686 and previous config saved to /var/cache/conftool/dbconfig/20230601-113843-ladsgroup.json
- 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48685 and previous config saved to /var/cache/conftool/dbconfig/20230601-113822-ladsgroup.json
- 11:28 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 11:28 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 11:26 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 11:25 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48684 and previous config saved to /var/cache/conftool/dbconfig/20230601-112316-ladsgroup.json
- 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48683 and previous config saved to /var/cache/conftool/dbconfig/20230601-110810-ladsgroup.json
- 11:04 jayme: disabling puppet on all kubernetes control planes for https://gerrit.wikimedia.org/r/c/operations/puppet/+/925707
- 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48682 and previous config saved to /var/cache/conftool/dbconfig/20230601-105303-ladsgroup.json
- 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48681 and previous config saved to /var/cache/conftool/dbconfig/20230601-104803-ladsgroup.json
- 10:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 10:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
- 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48680 and previous config saved to /var/cache/conftool/dbconfig/20230601-104742-ladsgroup.json
- 10:45 cmooney@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 10:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48679 and previous config saved to /var/cache/conftool/dbconfig/20230601-103236-ladsgroup.json
- 10:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48678 and previous config saved to /var/cache/conftool/dbconfig/20230601-101730-ladsgroup.json
- 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
- 10:16 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
- 10:14 aborrero@cumin2002: START - Cookbook sre.dns.netbox
- 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48677 and previous config saved to /var/cache/conftool/dbconfig/20230601-100224-ladsgroup.json
- 10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48676 and previous config saved to /var/cache/conftool/dbconfig/20230601-100011-ladsgroup.json
- 10:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
- 09:56 moritzm: installing systemd security updates on bullseye
- 09:53 Amir1: ladsgroup@mwmaint1002:~$ foreachwikiindblist group2 extensions/AbuseFilter/maintenance/MigrateActorsAF.php (T336224)
- 09:52 gehel: cleaning apt archives on an-test-worker1002: `sudo apt-get clean`, recovering 14G
- 09:49 cmooney@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 09:43 cmooney@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:36 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:36 cmooney@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:35 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
- 09:32 volans: installed spicerack v7.2.0 on cumin2002
- 09:30 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 09:21 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1010.eqiad.wmnet
- 09:18 godog: remove lv prometheus-global - T288196
- 09:17 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1010.eqiad.wmnet
- 09:17 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1009.eqiad.wmnet
- 09:16 volans@cumin1001: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet
- 09:16 volans@cumin1001: START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet
- 09:13 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1009.eqiad.wmnet
- 09:12 volans: installed spicerack v7.2.0 on cumin1001
- 09:11 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1008.eqiad.wmnet
- 09:07 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1008.eqiad.wmnet
- 09:06 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1007.eqiad.wmnet
- 09:02 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1007.eqiad.wmnet
- 09:01 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1006.eqiad.wmnet
- 08:57 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1006.eqiad.wmnet
- 08:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
- 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
- 08:53 aborrero@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
- 08:49 aborrero@cumin1001: START - Cookbook sre.dns.netbox
- 08:48 apergos: UTC morning backport and config training window done
- 08:30 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 08:29 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 08:28 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 08:28 daniel@deploy1002: Finished scap: Backport for ORES: add model versions configuration and thresholds (T319170) (duration: 10m 12s)
- 08:28 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 08:19 daniel@deploy1002: daniel and isaranto: Backport for ORES: add model versions configuration and thresholds (T319170) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
- 08:18 daniel@deploy1002: Started scap: Backport for ORES: add model versions configuration and thresholds (T319170)
- 07:55 daniel@deploy1002: Finished scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) (duration: 09m 09s)
- 07:48 daniel@deploy1002: daniel: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
- 07:46 daniel@deploy1002: Started scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366)
- 07:42 mlitn@deploy1002: Finished scap: Backport for Add $wgInterwikiLogoOverride (T315269) (duration: 33m 02s)
- 07:35 moritzm: installing libssh security updates
- 07:29 mlitn@deploy1002: mlitn: Backport for Add $wgInterwikiLogoOverride (T315269) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
- 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
- 07:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
- 07:09 mlitn@deploy1002: Started scap: Backport for Add $wgInterwikiLogoOverride (T315269)
- 06:16 kart_: Updated MinT to 2023-06-01-041041-production (T336525)
- 06:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 05:56 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 05:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 05:46 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 05:44 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 05:42 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 05:39 kart_: Updated cxserver to 2023-06-01-041016-production (T337669)
- 05:34 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 05:34 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 05:32 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 05:32 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 05:27 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 05:27 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 00:11 eileen: civicrm upgraded from 885208ca to 3819d6d1