You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(XioNoX: configure OSPF between cr2-drmrs and cr2-eqdfw)
imported>Stashbot
(jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye)
 
(426 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== 2022-02-27 ==
== 2023-06-09 ==
* 20:42 XioNoX: configure OSPF between cr2-drmrs and cr2-eqdfw
* 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
* 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
* 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
* 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
* 20:38 btullis@cumin1001: END (ERROR) - Cookbook sre.aqs.roll-restart-reboot (exit_code=97) rolling restart_daemons on A:aqs
* 20:23 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs
* 17:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
* 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1016.eqiad.wmnet with OS buster
* 17:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
* 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49398 and previous config saved to /var/cache/conftool/dbconfig/20230609-173202-ladsgroup.json
* 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49397 and previous config saved to /var/cache/conftool/dbconfig/20230609-171656-ladsgroup.json
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49396 and previous config saved to /var/cache/conftool/dbconfig/20230609-170150-ladsgroup.json
* 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49395 and previous config saved to /var/cache/conftool/dbconfig/20230609-164644-ladsgroup.json
* 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49394 and previous config saved to /var/cache/conftool/dbconfig/20230609-163007-ladsgroup.json
* 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 16:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49393 and previous config saved to /var/cache/conftool/dbconfig/20230609-162946-ladsgroup.json
* 16:20 urandom: powercycling restbase1028
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49392 and previous config saved to /var/cache/conftool/dbconfig/20230609-161440-ladsgroup.json
* 16:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
* 16:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['snapshot1016']
* 16:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['snapshot1016']
* 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49391 and previous config saved to /var/cache/conftool/dbconfig/20230609-155934-ladsgroup.json
* 15:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49390 and previous config saved to /var/cache/conftool/dbconfig/20230609-154428-ladsgroup.json
* 15:30 andrewbogott: wikitech-static: deleted everything in /srv/mediawiki/images/wikitech/archive for [[phab:T338520|T338520]]
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49388 and previous config saved to /var/cache/conftool/dbconfig/20230609-152845-ladsgroup.json
* 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49387 and previous config saved to /var/cache/conftool/dbconfig/20230609-152824-ladsgroup.json
* 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
* 15:22 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
* 15:17 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49386 and previous config saved to /var/cache/conftool/dbconfig/20230609-151318-ladsgroup.json
* 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49385 and previous config saved to /var/cache/conftool/dbconfig/20230609-145812-ladsgroup.json
* 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49384 and previous config saved to /var/cache/conftool/dbconfig/20230609-144305-ladsgroup.json
* 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49383 and previous config saved to /var/cache/conftool/dbconfig/20230609-142731-ladsgroup.json
* 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 14:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 14:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49382 and previous config saved to /var/cache/conftool/dbconfig/20230609-142655-ladsgroup.json
* 14:14 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
* 14:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49381 and previous config saved to /var/cache/conftool/dbconfig/20230609-141149-ladsgroup.json
* 13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49380 and previous config saved to /var/cache/conftool/dbconfig/20230609-135643-ladsgroup.json
* 13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49379 and previous config saved to /var/cache/conftool/dbconfig/20230609-134137-ladsgroup.json
* 13:29 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
* 13:29 sukhe: start pybal on lvs2013
* 13:28 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
* 13:25 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49378 and previous config saved to /var/cache/conftool/dbconfig/20230609-132541-ladsgroup.json
* 13:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 13:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49377 and previous config saved to /var/cache/conftool/dbconfig/20230609-132520-ladsgroup.json
* 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49376 and previous config saved to /var/cache/conftool/dbconfig/20230609-131014-ladsgroup.json
* 13:07 sukhe: stop pybal on lvs2013 to test lvs2014
* 13:02 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs2014
* 13:02 sukhe: sudo cumin 'A:lvs and A:codfw' 'enable-puppet "CR 928818"'
* 13:01 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
* 12:59 sukhe: sudo cumin 'A:lvs and A:codfw' 'disable-puppet "CR 928818"'
* 12:57 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
* 12:57 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
* 12:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
* 12:55 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
* 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49373 and previous config saved to /var/cache/conftool/dbconfig/20230609-125508-ladsgroup.json
* 12:50 krinkle@deploy1002: Finished scap: {{Gerrit|I385d28d2edacb37}} (duration: 06m 59s)
* 12:43 krinkle@deploy1002: Started scap: {{Gerrit|I385d28d2edacb37}}
* 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49371 and previous config saved to /var/cache/conftool/dbconfig/20230609-124002-ladsgroup.json
* 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
* 12:29 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
* 12:27 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49370 and previous config saved to /var/cache/conftool/dbconfig/20230609-122303-ladsgroup.json
* 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49369 and previous config saved to /var/cache/conftool/dbconfig/20230609-122243-ladsgroup.json
* 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
* 12:15 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
* 12:13 aborrero@cumin2002: START - Cookbook sre.dns.netbox
* 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49368 and previous config saved to /var/cache/conftool/dbconfig/20230609-120737-ladsgroup.json
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 778 hosts
* 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49367 and previous config saved to /var/cache/conftool/dbconfig/20230609-115230-ladsgroup.json
* 11:52 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 778 hosts
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 1262 hosts
* 11:49 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 1262 hosts
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49366 and previous config saved to /var/cache/conftool/dbconfig/20230609-113724-ladsgroup.json
* 11:27 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49365 and previous config saved to /var/cache/conftool/dbconfig/20230609-112250-ladsgroup.json
* 11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49364 and previous config saved to /var/cache/conftool/dbconfig/20230609-112229-ladsgroup.json
* 11:20 sukhe: pcc-db1001: sudo systemctl start pcc_facts_processor.service
* 11:14 sukhe: sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49363 and previous config saved to /var/cache/conftool/dbconfig/20230609-110723-ladsgroup.json
* 11:02 sukhe: homer "cr*-codfw*" commit "Gerrit: 928113 add new LVS host lvs2014
* 10:53 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2014.codfw.wmnet with OS bullseye
* 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49362 and previous config saved to /var/cache/conftool/dbconfig/20230609-105217-ladsgroup.json
* 10:40 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
* 10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49361 and previous config saved to /var/cache/conftool/dbconfig/20230609-103711-ladsgroup.json
* 10:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
* 10:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49360 and previous config saved to /var/cache/conftool/dbconfig/20230609-102217-ladsgroup.json
* 10:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 10:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49359 and previous config saved to /var/cache/conftool/dbconfig/20230609-102156-ladsgroup.json
* 10:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
* 10:12 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 10:12 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 10:09 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 10:08 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49358 and previous config saved to /var/cache/conftool/dbconfig/20230609-100650-ladsgroup.json
* 09:57 elukey: increase <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change topic partitions (3->5) on kafka main clusters - [[phab:T338357|T338357]]
* 09:56 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:54 moritzm: installing jupyter-core security updates on bullseye
* 09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49357 and previous config saved to /var/cache/conftool/dbconfig/20230609-095144-ladsgroup.json
* 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49356 and previous config saved to /var/cache/conftool/dbconfig/20230609-093638-ladsgroup.json
* 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49355 and previous config saved to /var/cache/conftool/dbconfig/20230609-092141-ladsgroup.json
* 09:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 09:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49354 and previous config saved to /var/cache/conftool/dbconfig/20230609-090829-ladsgroup.json
* 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49353 and previous config saved to /var/cache/conftool/dbconfig/20230609-085322-ladsgroup.json
* 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49352 and previous config saved to /var/cache/conftool/dbconfig/20230609-083816-ladsgroup.json
* 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49351 and previous config saved to /var/cache/conftool/dbconfig/20230609-082310-ladsgroup.json
* 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49350 and previous config saved to /var/cache/conftool/dbconfig/20230609-080708-ladsgroup.json
* 08:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49349 and previous config saved to /var/cache/conftool/dbconfig/20230609-080637-ladsgroup.json
* 07:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49348 and previous config saved to /var/cache/conftool/dbconfig/20230609-075130-ladsgroup.json
* 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49347 and previous config saved to /var/cache/conftool/dbconfig/20230609-073624-ladsgroup.json
* 07:33 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1492.eqiad.wmnet
* 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49346 and previous config saved to /var/cache/conftool/dbconfig/20230609-072118-ladsgroup.json
* 07:19 moritzm: powercycling restbase2018 (kernel hung following what looks like I/O errors)
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49345 and previous config saved to /var/cache/conftool/dbconfig/20230609-070520-ladsgroup.json
* 07:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49344 and previous config saved to /var/cache/conftool/dbconfig/20230609-070459-ladsgroup.json
* 06:50 moritzm: installing wireshark security updates
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49343 and previous config saved to /var/cache/conftool/dbconfig/20230609-064953-ladsgroup.json
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster2005.codfw.wmnet
* 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster2005.codfw.wmnet
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster1005.eqiad.wmnet
* 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster1005.eqiad.wmnet
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: prometheus3001.esams.wmnet
* 06:48 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: prometheus3001.esams.wmnet
* 06:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
* 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49342 and previous config saved to /var/cache/conftool/dbconfig/20230609-063447-ladsgroup.json
* 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49341 and previous config saved to /var/cache/conftool/dbconfig/20230609-061941-ladsgroup.json
* 06:06 eileen: config {{Gerrit|97c57848}} -> {{Gerrit|6f4a9d19}}  restart jobs
* 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49340 and previous config saved to /var/cache/conftool/dbconfig/20230609-060438-ladsgroup.json
* 06:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 06:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 05:53 eileen: civicrm upgraded from {{Gerrit|158896cc}} to {{Gerrit|5bbed553}}
* 05:52 eileen: config revision changed from {{Gerrit|8b71fa7a}} to {{Gerrit|97c57848}}
* 05:50 moritzm: installing cpio security updates
* 05:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 05:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 05:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 05:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 05:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 05:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49339 and previous config saved to /var/cache/conftool/dbconfig/20230609-052315-ladsgroup.json
* 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49338 and previous config saved to /var/cache/conftool/dbconfig/20230609-050809-ladsgroup.json
* 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49337 and previous config saved to /var/cache/conftool/dbconfig/20230609-045302-ladsgroup.json
* 04:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49336 and previous config saved to /var/cache/conftool/dbconfig/20230609-043756-ladsgroup.json
* 04:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49335 and previous config saved to /var/cache/conftool/dbconfig/20230609-042306-ladsgroup.json
* 04:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 04:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 04:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49334 and previous config saved to /var/cache/conftool/dbconfig/20230609-042246-ladsgroup.json
* 04:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49333 and previous config saved to /var/cache/conftool/dbconfig/20230609-040739-ladsgroup.json
* 03:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49332 and previous config saved to /var/cache/conftool/dbconfig/20230609-035233-ladsgroup.json
* 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49331 and previous config saved to /var/cache/conftool/dbconfig/20230609-033727-ladsgroup.json
* 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49330 and previous config saved to /var/cache/conftool/dbconfig/20230609-032127-ladsgroup.json
* 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 03:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49329 and previous config saved to /var/cache/conftool/dbconfig/20230609-032106-ladsgroup.json
* 03:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49328 and previous config saved to /var/cache/conftool/dbconfig/20230609-030600-ladsgroup.json
* 02:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49327 and previous config saved to /var/cache/conftool/dbconfig/20230609-025054-ladsgroup.json
* 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49326 and previous config saved to /var/cache/conftool/dbconfig/20230609-023548-ladsgroup.json
* 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49325 and previous config saved to /var/cache/conftool/dbconfig/20230609-022054-ladsgroup.json
* 02:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 02:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49324 and previous config saved to /var/cache/conftool/dbconfig/20230609-022034-ladsgroup.json
* 02:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1002.eqiad.wmnet with OS bullseye
* 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49323 and previous config saved to /var/cache/conftool/dbconfig/20230609-020528-ladsgroup.json
* 02:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
* 02:01 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
* 02:00 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
* 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49322 and previous config saved to /var/cache/conftool/dbconfig/20230609-015021-ladsgroup.json
* 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
* 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
* 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49321 and previous config saved to /var/cache/conftool/dbconfig/20230609-013515-ladsgroup.json
* 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS bullseye
* 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49320 and previous config saved to /var/cache/conftool/dbconfig/20230609-011945-ladsgroup.json
* 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49319 and previous config saved to /var/cache/conftool/dbconfig/20230609-011924-ladsgroup.json
* 01:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49318 and previous config saved to /var/cache/conftool/dbconfig/20230609-010418-ladsgroup.json
* 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
* 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
* 00:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 00:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49317 and previous config saved to /var/cache/conftool/dbconfig/20230609-004912-ladsgroup.json
* 00:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 00:47 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
* 00:34 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS bullseye
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49316 and previous config saved to /var/cache/conftool/dbconfig/20230609-003406-ladsgroup.json
* 00:31 eileen: civicrm upgraded from {{Gerrit|6f64e77d}} to {{Gerrit|158896cc}}
* 00:25 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['pki-root1002']
* 00:25 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002']
* 00:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['pki-root1002']
* 00:24 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002']
* 00:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49315 and previous config saved to /var/cache/conftool/dbconfig/20230609-001821-ladsgroup.json
* 00:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 00:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49314 and previous config saved to /var/cache/conftool/dbconfig/20230609-001732-ladsgroup.json
* 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49313 and previous config saved to /var/cache/conftool/dbconfig/20230609-000226-ladsgroup.json


== 2022-02-25 ==
== 2023-06-08 ==
* 23:32 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 23:55 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
* 23:30 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 23:54 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
* 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T300992|T300992]])', diff saved to https://phabricator.wikimedia.org/P21540 and previous config saved to /var/cache/conftool/dbconfig/20220225-213704-ladsgroup
* 23:54 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
* 23:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
* 23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49312 and previous config saved to /var/cache/conftool/dbconfig/20230608-234720-ladsgroup.json
* 23:42 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for pki-root - pt1979@cumin2002"
* 23:40 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add


== 2022-02-24 ==
== 2023-06-07 ==
* 23:35 ryankemper: [[phab:T302526|T302526]] Deployed https://gerrit.wikimedia.org/r/765652 and ran puppet across wcqs*
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49219 and previous config saved to /var/cache/conftool/dbconfig/20230607-235624-ladsgroup.json
* 22:06 mutante: static-bugzilla.wikimedia.org - kubernetes - deployed gerrit:765572  - first prod service behind a k8s ingress ([[phab:T290966|T290966]])
* 23:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 22:05 mutante: phabricator - disabled git repo - labs-tools-harvesting-data-refinery/repository/master/
* 23:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 21:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2086.codfw.wmnet with OS bullseye
* 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49218 and previous config saved to /var/cache/conftool/dbconfig/20230607-235603-ladsgroup.json
* 21:45 brennen: end of UTC late backport & config window
* 23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P49217 and previous config saved to /var/cache/conftool/dbconfig/20230607-234522-ladsgroup.json
* 21:43 dancy@deploy1002: Started scap: testing scap container image building
* 23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49216 and previous config saved to /var/cache/conftool/dbconfig/20230607-234057-ladsgroup.json
* 21:43 tzatziki: removing 1 file for legal compliance
* 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49215 and previous config saved to /var/cache/conftool/dbconfig/20230607-233016-ladsgroup.json
* 21:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2085.codfw.wmnet with OS bullseye
* 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49214 and previous config saved to /var/cache/conftool/dbconfig/20230607-232551-ladsgroup.json
* 21:41 mutante: phabricator - disabled git repo "frig" - outdated fundraising stuff, checked with fr-tech, not needed [[phab:T296022|T296022]]
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P49213 and previous config saved to /var/cache/conftool/dbconfig/20230607-232223-ladsgroup.json
* 21:40 brennen@deploy1002: Synchronized php-1.38.0-wmf.23/includes: Backport: [[gerrit:765626{{!}}Revert "Revert "Revert "Show message fallback keys when using &uselang=qqx"""]] (duration: 00m 57s)
* 23:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 21:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2
* 23:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org


== 2022-02-23 ==
== 2023-06-06 ==
* 23:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS stretch
* 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48961 and previous config saved to /var/cache/conftool/dbconfig/20230606-235248-ladsgroup.json
* 23:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48960 and previous config saved to /var/cache/conftool/dbconfig/20230606-234810-ladsgroup.json
* 23:09 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
* 23:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-a1-codfw.mgmt.codfw.wmnet
* 22:58 mutante: phabricator - disabled empty but active repo: wikidata-query-LDFServer (WQLD) created in 2018 by qchris ([[phab:T296022|T296022]])
* 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48959 and previous config saved to /var/cache/conftool/dbconfig/20230606-233742-ladsgroup.json
* 22:51 mutante: phabricator - disabled empty but active repos: dibyaduttabook and xtools-H ([[phab:T296022|T296022]])
* 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48958 and previous config saved to /var/cache/conftool/dbconfig/20230606-233304-ladsgroup.json
* 22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS stretch
* 23:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
* 22:37 mutante: phabricator - disabling repository dibyaduttabook
* 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48955 and previous config saved to /var/cache/conftool/dbconfig/20230606-232235-ladsgroup.json
* 22:09 reedy@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/SecurePoll/cli/wm-scripts/ucoc/: (no justification provided) (duration: 00m 50s)
* 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox
* 22:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:17 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code


== 2022-02-22 ==
== 2023-06-05 ==
* 23:20 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 23:53 ladsgroup@cumin1001: dbctl commit (dc=all)
* 23:18 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
* 23:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:53 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: Backport: [[gerrit:764833{{!}}VisualEditor: Avoid undefined index for mobileformat ([T302344])]] (duration: 00m 49s)
* 22:52 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/DiscussionTools/includes/ApiDiscussionToolsEdit.php: Backport: [[gerrit:764834{{!}}DiscussionTools: Avoid undefined index for mobileformat ([T302344])]] (duration: 00m 51s)
* 22:45 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 22:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 22:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2067.codfw.wmnet with OS stretch
* 22:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
* 22:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 22:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 22:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:02 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
* 21:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:47 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
* 21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch
* 21:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
* 21:43 urbanecm@deploy1002: Synchronized wmf-config/filebackend.php: {{Gerrit|91b81ac9dc42893c872f09620566379ab6158f12}}: filebackend: migrate $wmfSwift* to $wmgSwift* ([[phab:T45956|T45956]]) (duration: 00m 52s)
* 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:38 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|99f244c9539a5ae2af0bd9dddb8aae45dbc44704}}: [Cleanup] Remove non-existent config wgVectorUseWvuiSearch (duration: 00m 50s)
* 21:34 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|717232793e002ba501a3cbd2be96255760e14ba2}}: [Vector] Enable table of contents on beta  cluster (duration: 00m 50s)
* 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6d1d9a9ee2d633cf67e81fd2277deb4a61b87891}}: InitialiseSettings: General cleanup, wgRemoveGroups (A-D) ([[phab:T301647|T301647]]) (duration: 00m 50s)
* 21:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
* 21:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
* 21:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ee7608c7b56b579e2aaa50b504b6c2e28b63058e}}: Deploy the fawiki test safety survey to production ([[phab:T297629|T297629]]) (duration: 00m 51s)
* 21:19 cwhite: end opensearch upgrade (codfw) [[phab:T299168|T299168]]
* 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:12 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
* 21:06 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
* 21:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20
* 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:14 tzatziki: Removing one file for legal compliance
* 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002"
* 23:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 23:14 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002"
* 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20850 and previous config saved to /var/cache/conftool/dbconfig/20220215-230454-marostegui.json
* 23:12 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:55 tzatziki: Removing 5 files for legal compliance
* 23:11 jforrester@deploy1002: Finished deploy [integration/docroot@6eefe56]: {{Gerrit|I5c1b92322ae59bfe8a9233ad23c3c89b844f5fb7}} for [[phab:T334492|T334492]] (duration: 00m 05s)
* 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20849 and previous config saved to /var/cache/conftool/dbconfig/20220215-224950-marostegui.json
* 23:10 jforrester@deploy1002: Started deploy [integration/docroot@6eefe56]: {{Gerrit|I5c1b92322ae59bfe8a9233ad23c3c89b844f5fb7}} for [[phab:T334492|T334492]]
* 22:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20848 and previous config saved to /var/cache/conftool/dbconfig/20220215-223445-marostegui.json
* 23:09 jforrester@deploy1002: Finished deploy [integration/docroot@ab77611]: {{Gerrit|Idf6c7ad01ed18785b850967252c6867d7871e902}} (duration: 00m 08s)
* 22:28 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
* 23:09 jforrester@deploy1002: Started deploy [integration/docroot@ab77611]: {{Gerrit|Idf6c7ad01ed18785b850967252c6867d7871e902}}
* 22:27 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48803 and previous config saved to /var/cache/conftool/dbconfig/20230605-230752-ladsgroup.json
* 22:27 jhuneidi@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
* 23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to
* 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
* 22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
* 22:25 jhuneidi@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
* 22:24 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
* 22:23 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
* 22:23 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
* 22:21 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
* 22:21 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
* 22:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20847 and previous config saved to /var/cache/conftool/dbconfig/20220215-221940-marostegui.json
* 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20846 and previous config saved to /var/cache/conftool/dbconfig/20220215-220041-marostegui.json
* 22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20845 and previous config saved to /var/cache/conftool/dbconfig/20220215-220034-marostegui.json
* 22:00 hoo: Updated the Wikidata property suggester with data from the 2022-02-07 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 21:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20844 and previous config saved to /var/cache/conftool/dbconfig/20220215-214529-marostegui.json
* 21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:41 urbanecm: UTC late B&C window completed
* 21:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2e0b51f6c314bfd685f79544c6cb2260feb380a0}}: amiwiki: Deploy Growth features to newcomers (duration: 00m 49s)
* 21:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:36 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|b3e8161445d4f778cab8cbabe709f9583ac62df2}}: Apply max width setting to all Wikisource page namespaces ([[phab:T300563|T300563]]; 2/2) (duration: 00m 49s)
* 21:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b3e8161445d4f778cab8cbabe709f9583ac62df2}}: Apply max width setting to all Wikisource page namespaces ([[phab:T300563|T300563]]; 1/2) (duration: 00m 50s)
* 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20843 and previous config saved to /var/cache/conftool/dbconfig/20220215-213024-marostegui.json
* 21:22 eileen: civicrm revision {{Gerrit|815e3091}} -> {{Gerrit|84953e1d}}
* 21:20 eileen: localsettings  checkout revision ({{Gerrit|02f4888c}} -> {{Gerrit|2a6d2e45}})
* 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20842 and previous config saved to /var/cache/conftool/dbconfig/20220215-211519-marostegui.json
* 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d97b43ea0428621c6fd9352af9840e0db4545c08}}: Remove MFUseDesktopContributionsPage config ([[phab:T300583|T300583]]) (duration: 00m 52s)
* 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20841 and previous config saved to /var/cache/conftool/dbconfig/20220215-205547-marostegui.json
* 20:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 20:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20840 and previous config saved to /var/cache/conftool/dbconfig/20220215-205539-marostegui.json
* 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20838 and previous config saved to /var/cache/conftool/dbconfig/20220215-204035-marostegui.json
* 20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20837 and previous config saved to /var/cache/conftool/dbconfig/20220215-202530-marostegui.json
* 20:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20836 and previous config saved to /var/cache/conftool/dbconfig/20220215-201025-marostegui.json
* 19:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1015.eqiad.wmnet with OS buster
* 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20835 and previous config saved to /var/cache/conftool/dbconfig/20220215-195051-marostegui.json
* 19:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 19:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T300381|T300381]])', diff saved to https://phabricator.wikimedia.org/P20834 and


== 2022-02-03 ==
== 2023-06-03 ==
* 23:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20159 and previous config saved to /var/cache/conftool/dbconfig/20220203-233447-marostegui.json
* 13:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade
* 23:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20158 and previous config saved to /var/cache/conftool/dbconfig/20220203-231942-marostegui.json
* 13:41 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade
* 23:15 ryankemper: [[phab:T294805|T294805]] Added a silence on alerts.wikimedia.org for `CirrusSearchJVMGCOldPoolFlatlined`
* 13:28 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
* 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20157 and previous config saved to /var/cache/conftool/dbconfig/20220203-230437-marostegui.json
* 13:28 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet
* 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20156 and previous config saved to /var/cache/conftool/dbconfig/20220203-224933-marostegui.json
* 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20155 and previous config saved to /var/cache/conftool/dbconfig/20220203-223923-marostegui.json
* 22:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 22:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20154 and previous config saved to /var/cache/conftool/dbconfig/20220203-223916-marostegui.json
* 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20153 and previous config saved to /var/cache/conftool/dbconfig/20220203-222411-marostegui.json
* 22:18 ryankemper: [[phab:T294805|T294805]] Monitoring https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?orgId=1&var-cirrus_group=eqiad&var-cluster=elasticsearch&var-exported_cluster=production-search&var-smoothing=1&refresh=1m&from=now-3h&to=now as new hosts join the fleet
* 22:18 ryankemper: [[phab:T294805|T294805]] Bringing in new eqiad hosts in batches of 4, with 15-20 mins between batches: `ryankemper@cumin1001:~$ sudo -E cumin -b 4 'elastic1*' 'sudo run-puppet-agent --force; sudo run-puppet-agent; sleep 900'` tmux session `es_eqiad`
* 22:13 ryankemper: [[phab:T294805|T294805]] https://gerrit.wikimedia.org/r/c/operations/puppet/+/759617/ fixed the dependency issues, going to start bringing new hosts into service
* 22:09 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20152 and previous config saved to /var/cache/conftool/dbconfig/20220203-220906-marostegui.json
* 22:05 eileen: civicrm revision {{Gerrit|7dcdc017}} -> {{Gerrit|04cbf35b}}
* 22:04 volans@cumin2002: START - Cookbook sre.dns.netbox
* 21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20150 and previous config saved to /var/cache/conftool/dbconfig/20220203-215402-marostegui.json
* 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20149 and previous config saved to /var/cache/conftool/dbconfig/20220203-215154-marostegui.json
* 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20148 and previous config saved to /var/cache/conftool/dbconfig/20220203-215121-marostegui.json
* 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20147 and previous config saved to /var/cache/conftool/dbconfig/20220203-213616-marostegui.json
* 21:28 rzl: root@apt1001:/home/rzl# reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy  # [[phab:T300324|T300324]]
* 21:27 rzl: root@apt1001:/home/rzl# reprepro copy stretch-wikimedia buster-wikimedia envoyproxy  # [[phab:T300324|T300324]]
* 21:21 ryankemper: [[phab:T294805|T294805]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759588; hoping this resolves dependency issues. Running puppet agent on `elastic1068`
* 21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20145 and previous config saved to /var/cache/conftool/dbconfig/20220203-212111-marostegui.json
* 21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20144 and previous config saved to /var/cache/conftool/dbconfig/20220203-210607-marostegui.json
* 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20143 and previous config saved to /var/cache/conftool/dbconfig/20220203-210358-marostegui.json
* 21:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 21:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20142 and previous config saved to /var/cache/conftool/dbconfig/20220203-210350-marostegui.json
* 20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20140 and previous config saved to /var/cache/conftool/dbconfig/20220203-204846-marostegui.json
* 20:43 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_recount_categories.service  # [[phab:T299823|T299823]]
* 20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20139 and previous config saved to /var/cache/conftool/dbconfig/20220203-203341-marostegui.json
* 20:26 ryankemper: [[phab:T294805|T294805]] Running puppet on `elastic1068` failed, looks like `/usr/share/elasticsearch/lib` wasn't there: https://phabricator.wikimedia.org/P20138
* 20:26 ryankemper: [[phab:T294805|T294805]] Running puppet on `elastic1068` failed, looks like `/usr/share/elasticsearch/lib' wasn't there: https://phabricator.wikimedia.org/P20138
* 20:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
* 20:25 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
* 20:22 ryankemper: [[phab:T294805|T294805]] Running puppet on single elastic host: `ryankemper@elastic1068:~$ sudo run-puppet-agent --force`
* 20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20137 and previous config saved to /var/cache/conftool/dbconfig/20220203-201836-marostegui.json
* 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20136 and previous config saved to /var/cache/conftool/dbconfig/20220203-201729-marostegui.json
* 20:17 ryankemper: [[phab:T294805|T294805]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759317 to activate roles for elastic eqiad replacement hosts
* 20:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 20:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
* 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20135 and previous config saved to /var/cache/conftool/dbconfig/20220203-201721-marostegui.json
* 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:16 ryankemper: [[phab:T294805|T294805]] Disabled puppet on `elastic1*` in preparation for bringing new hosts into service: `ryankemper@cumin1001:~$ sudo cumin 'elastic1*' 'sudo disable-puppet "Add new eqiad replacement hosts elastic10[68-83] - [[phab:T294805|T294805]]"'`
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1003.eqiad.wmnet with OS buster
* 20:11 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.20  refs [[phab:T293961|T293961]]
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:08 mutante: planet1002/planet2002 -  sudo systemctl start planet-update-en  to manually start update after adding diff.wikimedia.org [[phab:T230444|T230444]]
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:07 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: [[gerrit:759313{{!}}Drop skin override (T300814)]] (2/2) (duration: 00m 49s)
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:06 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/skin.json: Backport: [[gerrit:759313{{!}}Drop skin override (T300814)]] (1/2) (duration: 00m 49s)
* 20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS buster
* 20:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20134 and previous config saved to /var/cache/conftool/dbconfig/20220203-200217-marostegui.json
* 19:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20133 and previous config saved to /var/cache/conftool/dbconfig/20220203-194712-marostegui.json
* 19:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
* 19:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:41 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1003.eqiad.wmnet with OS buster
* 19:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
* 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:39 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1004.eqiad.wmnet with OS buster
* 19:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
* 19:34 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: [[gerrit:759310{{!}}Pass skin name to Hooks::isSkinLegacy (T299971)]] (duration: 00m 49s)
* 19:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:33 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/ContentTranslation/modules/entrypoints/ext.cx.entrypoints.contributionsmenu.js: Backport: [[gerrit:759314{{!}}Update skin checks with new vector skin key. (T298916 T300814)]] (duration: 00m 50s)
* 19:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
* 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20132 and previous config saved to /var/cache/conftool/dbconfig/20220203-193208-marostegui.json
* 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:29 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/modules/ext.wikiEditor.js: Backport: [[gerrit:759316{{!}}New bucket for abtest data (T291308)]] (2/2) (duration: 00m 50s)
* 19:28 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/includes/Hooks.php: Backport: [[gerrit:759316{{!}}New bucket for abtest data (T291308)]] (1/2) (duration: 00m 49s)
* 19:27 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: Backport: [[gerrit:759315{{!}}New bucket for abtest data (T291308)]] (duration: 00m 50s)
* 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:26 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:759557{{!}}commonswiki: Add three domains to the wgCopyUploadsDomains allowlist (T299835 T300848)]] (duration: 00m 54s)
* 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20131 and previous config saved to /var/cache/conftool/dbconfig/20220203-183648-marostegui.json
* 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20130 and previous config saved to /var/cache/conftool/dbconfig/20220203-183634-marostegui.json
* 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20129 and previous config saved to /var/cache/conftool/dbconfig/20220203-182129-marostegui.json
* 18:17 dancy: restarted php7.2-fpm processes on mediawiki12
* 18:10 dancy: killed 8 spinning php7.2-fpm processes on mediawiki12
* 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20128 and previous config saved to /var/cache/conftool/dbconfig/20220203-180624-marostegui.json
* 17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20127 and previous config saved to /var/cache/conftool/dbconfig/20220203-175120-marostegui.json
* 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20126 and previous config saved to /var/cache/conftool/dbconfig/20220203-174913-marostegui.json
* 17:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20125 and previous config saved to /var/cache/conftool/dbconfig/20220203-174905-marostegui.json
* 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20122 and previous config saved to /var/cache/conftool/dbconfig/20220203-173400-marostegui.json
* 17:22 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
* 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20120 and previous config saved to /var/cache/conftool/dbconfig/20220203-171856-marostegui.json
* 17:13 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
* 17:12 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
* 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20118 and previous config saved to /var/cache/conftool/dbconfig/20220203-170351-marostegui.json
* 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20117 and previous config saved to /var/cache/conftool/dbconfig/20220203-170144-marostegui.json
* 17:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 17:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
* 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20116 and previous config saved to /var/cache/conftool/dbconfig/20220203-170136-marostegui.json
* 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20115 and previous config saved to /var/cache/conftool/dbconfig/20220203-164632-marostegui.json
* 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20114 and previous config saved to /var/cache/conftool/dbconfig/20220203-163127-marostegui.json
* 16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20113 and previous config saved to /var/cache/conftool/dbconfig/20220203-162316-marostegui.json
* 16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20111 and previous config saved to /var/cache/conftool/dbconfig/20220203-161622-marostegui.json
* 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20110 and previous config saved to /var/cache/conftool/dbconfig/20220203-161515-marostegui.json
* 16:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20109 and previous config saved to /var/cache/conftool/dbconfig/20220203-161508-marostegui.json
* 16:10 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
* 16:10 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
* 16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20108 and previous config saved to /var/cache/conftool/dbconfig/20220203-160811-marostegui.json
* 16:00 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
* 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20107 and previous config saved to /var/cache/conftool/dbconfig/20220203-160003-marostegui.json
* 15:55 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts restbase2011.codfw.wmnet
* 15:55 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
* 15:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20106 and previous config saved to /var/cache/conftool/dbconfig/20220203-155306-marostegui.json
* 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20105 and previous config saved to /var/cache/conftool/dbconfig/20220203-154458-marostegui.json
* 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20104 and previous config saved to /var/cache/conftool/dbconfig/20220203-153801-marostegui.json
* 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20103 and previous config saved to /var/cache/conftool/dbconfig/20220203-153653-marostegui.json
* 15:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 15:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20102 and previous config saved to /var/cache/conftool/dbconfig/20220203-153646-marostegui.json
* 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20101 and previous config saved to /var/cache/conftool/dbconfig/20220203-152953-marostegui.json
* 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20100 and previous config saved to /var/cache/conftool/dbconfig/20220203-152746-marostegui.json
* 15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20099 and previous config saved to /var/cache/conftool/dbconfig/20220203-152739-marostegui.json
* 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20098 and previous config saved to /var/cache/conftool/dbconfig/20220203-152141-marostegui.json
* 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20097 and previous config saved to /var/cache/conftool/dbconfig/20220203-151234-marostegui.json
* 15:12 moritzm: installing apache security updates on gerrit1001
* 15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20096 and previous config saved to /var/cache/conftool/dbconfig/20220203-150636-marostegui.json
* 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20095 and previous config saved to /var/cache/conftool/dbconfig/20220203-145729-marostegui.json
* 14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20094 and previous config saved to /var/cache/conftool/dbconfig/20220203-145132-marostegui.json
* 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20093 and previous config saved to /var/cache/conftool/dbconfig/20220203-145024-marostegui.json
* 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20092 and previous config saved to /var/cache/conftool/dbconfig/20220203-145017-marostegui.json
* 14:44 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20091 and previous config saved to /var/cache/conftool/dbconfig/20220203-144224-marostegui.json
* 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20090 and previous config saved to /var/cache/conftool/dbconfig/20220203-144017-marostegui.json
* 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20089 and previous config saved to /var/cache/conftool/dbconfig/20220203-143544-marostegui.json
* 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20088 and previous config saved to /var/cache/conftool/dbconfig/20220203-143512-marostegui.json
* 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20087 and previous config saved to /var/cache/conftool/dbconfig/20220203-142039-marostegui.json
* 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20086 and previous config saved to /var/cache/conftool/dbconfig/20220203-142007-marostegui.json
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20085 and previous config saved to /var/cache/conftool/dbconfig/20220203-140534-marostegui.json
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20084 and previous config saved to /var/cache/conftool/dbconfig/20220203-140503-marostegui.json
* 13:53 XioNoX: eqiad: push Capirca generated border-in filters
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20083 and previous config saved to /var/cache/conftool/dbconfig/20220203-135029-marostegui.json
* 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20082 and previous config saved to /var/cache/conftool/dbconfig/20220203-134952-marostegui.json
* 13:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 13:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20081 and previous config saved to /var/cache/conftool/dbconfig/20220203-134944-marostegui.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20080 and previous config saved to /var/cache/conftool/dbconfig/20220203-134746-marostegui.json
* 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20079 and previous config saved to /var/cache/conftool/dbconfig/20220203-134739-marostegui.json
* 13:44 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:40 jayme@cumin1001: START - Cookbook sre.dns.netbox
* 13:35 jbond: disable puppet fleet wide for puppetdb restart
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20078 and previous config saved to /var/cache/conftool/dbconfig/20220203-133439-marostegui.json
* 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20077 and previous config saved to /var/cache/conftool/dbconfig/20220203-133234-marostegui.json
* 13:28 marostegui: Test [[phab:T300858|T300858]]
* 13:28 moritzm: installing apache security updates
* 13:27 jayme: moved kubernetes staging master,nodes,etcd from wikimedia_cluster "kubernetes" to "kubernetes-staging" - [[phab:T273866|T273866]]
* 13:27 XioNoX: esams: push Capirca generated border-in filters
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20076 and previous config saved to /var/cache/conftool/dbconfig/20220203-131935-marostegui.json
* 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20075 and previous config saved to /var/cache/conftool/dbconfig/20220203-131729-marostegui.json
* 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20074 and previous config saved to /var/cache/conftool/dbconfig/20220203-130430-marostegui.json
* 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20073 and previous config saved to /var/cache/conftool/dbconfig/20220203-130224-marostegui.json
* 12:58 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
* 12:57 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
* 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20072 and previous config saved to /var/cache/conftool/dbconfig/20220203-125737-marostegui.json
* 12:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20071 and previous config saved to /var/cache/conftool/dbconfig/20220203-125730-marostegui.json
* 12:53 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
* 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
* 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
* 12:52 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
* 12:51 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
* 12:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
* 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
* 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
* 12:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:48 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
* 12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
* 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
* 12:44 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
* 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
* 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
* 12:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
* 12:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
* 12:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20069 and previous config saved to /var/cache/conftool/dbconfig/20220203-124225-marostegui.json
* 12:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:38 taavi: UTC morning backport window done
* 12:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:33 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:759479{{!}}mniwiktionary: Add localized mobile wordmark (T294709)]] (2/2) (duration: 00m 49s)
* 12:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:32 taavi@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-mni.svg: Config: [[gerrit:759479{{!}}mniwiktionary: Add localized mobile wordmark (T294709)]] (1/2) (duration: 00m 50s)
* 12:29 XioNoX: eqsin: push Capirca generated border-in filters
* 12:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20068 and previous config saved to /var/cache/conftool/dbconfig/20220203-122720-marostegui.json
* 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20067 and previous config saved to /var/cache/conftool/dbconfig/20220203-122612-marostegui.json
* 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20066 and previous config saved to /var/cache/conftool/dbconfig/20220203-122529-marostegui.json
* 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:19 XioNoX: codfw: push Capirca generated border-in filters
* 12:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:16 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:759469{{!}}commonswiki: Add www.gbols.smns-bw.org to the wgCopyUploadsDomains allowlist (T300842)]] (duration: 00m 50s)
* 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20065 and previous config saved to /var/cache/conftool/dbconfig/20220203-121216-marostegui.json
* 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20064 and previous config saved to /var/cache/conftool/dbconfig/20220203-121024-marostegui.json
* 12:10 XioNoX: eqord: push Capirca generated border-in filters
* 12:09 mlitn@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:759240{{!}}[WikibaseMediaInfo] Stop normalizing full text scores (T296631)]] (duration: 00m 52s)
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20063 and previous config saved to /var/cache/conftool/dbconfig/20220203-120832-marostegui.json
* 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20062 and previous config saved to /var/cache/conftool/dbconfig/20220203-120825-marostegui.json
* 11:57 kart_: Updated cxserver to 2022-02-03-112745-production, this should unbreak Flores MT!
* 11:57 XioNoX: ulsfo: push Capirca generated border-in filters
* 11:55 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
* 11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20061 and previous config saved to /var/cache/conftool/dbconfig/20220203-115519-marostegui.json
* 11:53 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
* 11:53 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20060 and previous config saved to /var/cache/conftool/dbconfig/20220203-115320-marostegui.json
* 11:51 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
* 11:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
* 11:49 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
* 11:47 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
* 11:46 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
* 11:46 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
* 11:45 moritzm: installing openjdk-11 security updates
* 11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20059 and previous config saved to /var/cache/conftool/dbconfig/20220203-114015-marostegui.json
* 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20058 and previous config saved to /var/cache/conftool/dbconfig/20220203-113907-marostegui.json
* 11:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 11:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20057 and previous config saved to /var/cache/conftool/dbconfig/20220203-113859-marostegui.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20056 and previous config saved to /var/cache/conftool/dbconfig/20220203-113815-marostegui.json
* 11:36 arturo: reprepro changes @ apt1001 after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/758050
* 11:33 moritzm: draining ganeti1020 for eventual reimage
* 11:26 vgutierrez: rolling varnish-fe restart to catch the new listen_depth config value
* 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20055 and previous config saved to /var/cache/conftool/dbconfig/20220203-112355-marostegui.json
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20054 and previous config saved to /var/cache/conftool/dbconfig/20220203-112311-marostegui.json
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20053 and previous config saved to /var/cache/conftool/dbconfig/20220203-111921-marostegui.json
* 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20052 and previous config saved to /var/cache/conftool/dbconfig/20220203-111908-marostegui.json
* 11:15 topranks: Adding BGP peering to lsw1-f1-eqiad on cr2-eqiad.  [[phab:T299758|T299758]].
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20051 and previous config saved to /var/cache/conftool/dbconfig/20220203-110850-marostegui.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20050 and previous config saved to /var/cache/conftool/dbconfig/20220203-110403-marostegui.json
* 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20049 and previous config saved to /var/cache/conftool/dbconfig/20220203-105345-marostegui.json
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20048 and previous config saved to /var/cache/conftool/dbconfig/20220203-105238-marostegui.json
* 10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20047 and previous config saved to /var/cache/conftool/dbconfig/20220203-105230-marostegui.json
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20046 and previous config saved to /var/cache/conftool/dbconfig/20220203-104858-marostegui.json
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20045 and previous config saved to /var/cache/conftool/dbconfig/20220203-103725-marostegui.json
* 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20044 and previous config saved to /var/cache/conftool/dbconfig/20220203-103354-marostegui.json
* 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20043 and previous config saved to /var/cache/conftool/dbconfig/20220203-103008-marostegui.json
* 10:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 10:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20042 and previous config saved to /var/cache/conftool/dbconfig/20220203-103001-marostegui.json
* 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20041 and previous config saved to /var/cache/conftool/dbconfig/20220203-102221-marostegui.json
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20040 and previous config saved to /var/cache/conftool/dbconfig/20220203-101456-marostegui.json
* 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1015.eqiad.wmnet
* 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1014.eqiad.wmnet
* 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20039 and previous config saved to /var/cache/conftool/dbconfig/20220203-100716-marostegui.json
* 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1013.eqiad.wmnet
* 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1012.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1010.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1015.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1014.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1013.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1012.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1011.eqiad.wmnet
* 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1010.eqiad.wmnet
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20038 and previous config saved to /var/cache/conftool/dbconfig/20220203-095952-marostegui.json
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20037 and previous config saved to /var/cache/conftool/dbconfig/20220203-095907-marostegui.json
* 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20036 and previous config saved to /var/cache/conftool/dbconfig/20220203-095859-marostegui.json
* 09:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1183.eqiad.wmnet with OS bullseye
* 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20034 and previous config saved to /var/cache/conftool/dbconfig/20220203-094447-marostegui.json
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20033 and previous config saved to /var/cache/conftool/dbconfig/20220203-094354-marostegui.json
* 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20032 and previous config saved to /var/cache/conftool/dbconfig/20220203-094107-marostegui.json
* 09:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 09:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20031 and previous config saved to /var/cache/conftool/dbconfig/20220203-094059-marostegui.json
* 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1183.eqiad.wmnet with OS bullseye
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20030 and previous config saved to /var/cache/conftool/dbconfig/20220203-092850-marostegui.json
* 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20029 and previous config saved to /var/cache/conftool/dbconfig/20220203-092554-marostegui.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20028 and previous config saved to /var/cache/conftool/dbconfig/20220203-091345-marostegui.json
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20027 and previous config saved to /var/cache/conftool/dbconfig/20220203-091237-marostegui.json
* 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20026 and previous config saved to /var/cache/conftool/dbconfig/20220203-091224-marostegui.json
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20025 and previous config saved to /var/cache/conftool/dbconfig/20220203-091050-marostegui.json
* 09:00 marostegui: Failover m2 from db1183 to db1159 - [[phab:T300329|T300329]]
* 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20024 and previous config saved to /var/cache/conftool/dbconfig/20220203-085720-marostegui.json
* 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20023 and previous config saved to /var/cache/conftool/dbconfig/20220203-085545-marostegui.json
* 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20022 and previous config saved to /var/cache/conftool/dbconfig/20220203-085159-marostegui.json
* 08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 08:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20021 and previous config saved to /var/cache/conftool/dbconfig/20220203-085151-marostegui.json
* 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20020 and previous config saved to /var/cache/conftool/dbconfig/20220203-084215-marostegui.json
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20019 and previous config saved to /var/cache/conftool/dbconfig/20220203-083647-marostegui.json
* 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20018 and previous config saved to /var/cache/conftool/dbconfig/20220203-082710-marostegui.json
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20017 and previous config saved to /var/cache/conftool/dbconfig/20220203-082302-marostegui.json
* 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 08:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20016 and previous config saved to /var/cache/conftool/dbconfig/20220203-082249-marostegui.json
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20015 and previous config saved to /var/cache/conftool/dbconfig/20220203-082142-marostegui.json
* 08:10 dcausse: restarting blazegraph on wdqs1013 (jvm stuck for 5hours)
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20014 and previous config saved to /var/cache/conftool/dbconfig/20220203-080745-marostegui.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20013 and previous config saved to /var/cache/conftool/dbconfig/20220203-080637-marostegui.json
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20012 and previous config saved to /var/cache/conftool/dbconfig/20220203-080254-marostegui.json
* 08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 08:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20011 and previous config saved to /var/cache/conftool/dbconfig/20220203-080247-marostegui.json
* 07:55 _joe_: restarted php-fpm on wtp1029, segfaulting
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20010 and previous config saved to /var/cache/conftool/dbconfig/20220203-075240-marostegui.json
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20009 and previous config saved to /var/cache/conftool/dbconfig/20220203-074742-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20008 and previous config saved to /var/cache/conftool/dbconfig/20220203-073735-marostegui.json
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20007 and previous config saved to /var/cache/conftool/dbconfig/20220203-073237-marostegui.json
* 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20006 and previous config saved to /var/cache/conftool/dbconfig/20220203-073129-marostegui.json
* 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 07:23 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 [[phab:T300329|T300329]]
* 07:23 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 [[phab:T300329|T300329]]
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20005 and previous config saved to /var/cache/conftool/dbconfig/20220203-071732-marostegui.json
* 07:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 07:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20004 and previous config saved to /var/cache/conftool/dbconfig/20220203-071348-marostegui.json
* 07:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 07:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 07:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P20003 and previous config saved to /var/cache/conftool/dbconfig/20220203-071141-marostegui.json
* 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P20002 and previous config saved to /var/cache/conftool/dbconfig/20220203-071111-marostegui.json
* 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20001 and previous config saved to /var/cache/conftool/dbconfig/20220203-065636-marostegui.json
* 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20000 and previous config saved to /var/cache/conftool/dbconfig/20220203-065606-marostegui.json
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19999 and previous config saved to /var/cache/conftool/dbconfig/20220203-064131-marostegui.json
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P19998 and previous config saved to /var/cache/conftool/dbconfig/20220203-064101-marostegui.json
* 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19997 and previous config saved to /var/cache/conftool/dbconfig/20220203-062627-marostegui.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19996 and previous config saved to /var/cache/conftool/dbconfig/20220203-062556-marostegui.json
* 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19995 and previous config saved to /var/cache/conftool/dbconfig/20220203-062243-marostegui.json
* 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19994 and previous config saved to /var/cache/conftool/dbconfig/20220203-061703-marostegui.json
* 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 01:12 brennen: UTC late backport window finished
* 01:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti2029.codfw.wmnet with OS buster
* 01:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:09 brennen@deploy1002: Finished scap: Backports: [[gerrit:759308{{!}}Changes the labels of the Vector skins (T299927)]] and [[gerrit:759309{{!}}Pass skin name to Hooks::isSkinLegacy (T299971)]] (duration: 24m 48s)
* 01:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:44 brennen@deploy1002: Started scap: Backports: [[gerrit:759308{{!}}Changes the labels of the Vector skins (T299927)]] and [[gerrit:759309{{!}}Pass skin name to Hooks::isSkinLegacy (T299971)]]
* 00:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster


== 2022-02-02 ==
== 2023-06-02 ==
* 22:26 mutante: gitlab - introducing parameter to fetch TLS certs either with acmechief or certbot (if in cloud). Boolean $use_acmechief = lookup('profile::gitlab::use_acmechief'), confirmed noop in prod on gitlab1001.wikimedia.org ( [[phab:T297411|T297411]])
* 20:16 apergos: rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'
* 21:36 ejegg: updated CiviCRM from {{Gerrit|2bd5fb5e}} to {{Gerrit|7dcdc017}}
* 18:42 bblack: dns*: puppets are all re-enabled, ntp restarts are done, etc
* 20:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 20:04 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.20  refs [[phab:T293961|T293961]] (duration: 00m 49s)
* 17:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
* 20:03 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.20  refs [[phab:T293961|T293961]]
* 17:27 bblack: dns*: disabling puppet to control rollout of NTP config fixups
* 19:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:03 bblack: dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed
* 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:35 sukhe: restart ntp.service on dns1002
* 19:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:26 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 19:49 dancy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:759307{{!}}cowikimedia: Allow bureaucrats to remove sysop and bureaucrat flags (T300779)]] (duration: 00m 50s)
* 13:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 19:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:25 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 19:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:25 ottomata: deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479
* 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:24 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:24 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 19:42 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: [[gerrit:622408{{!}}multiversion: Improve error message if wikiversions.php has wrong format]] (duration: 00m 49s)
* 13:24 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 19:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:24 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 19:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|62b2acb}}: Migration mode enabled everywhere ([[phab:T299927|T299927]]) (duration: 00m 49s)
* 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:03 moritzm: installing at-spi2-core bugfix updates from Bullseye point release
* 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:35 moritzm: installing texlive-security updates on buster
* 19:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:18 akosiaris: update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) [[phab:T337836|T337836]]. Reload systemd for unit changes to take effect
* 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet
* 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:52 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet
* 19:27 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/SkinVector.php: {{Gerrit|bdc20ddb357e6a993cc4a3d9ddbecac964843744}}: Fix the opt in URl ([[phab:T300097|T300097]]) (duration: 00m 49s)
* 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet
* 19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet
* 19:24 urbanecm@deploy1002: Synchronized wmf-config/: {{Gerrit|a48f8bd}}: Migrate calls of wmf* constants to wmg* constants ([[phab:T45956|T45956]]) (duration: 00m 51s)
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet
* 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet
* 19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19993 and previous config saved to /var/cache/conftool/dbconfig/20220202-191918-marostegui.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet
* 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet
* 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts:
* 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 0 hosts:
* 19:14 urbanecm@deploy1002: Synchronized multiversion/buildConfigCache.php: {{Gerrit|83f1f6a}}: Consistently write to $wmgRealm the same value as to $wmfRealm ([[phab:T45956|T45956]]) (duration: 00m 49s)
* 08:42 moritzm: installing traceroute bugfix updates from Bullseye point release
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org
* 19:10 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/<nowiki>{</nowiki>kywiki,kywiki-1.5x,kywiki-2x<nowiki>}</nowiki>.png ([[phab:T300241|T300241]])
* 07:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org
* 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org
* 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org
* 19:09 topranks: Running homer to enable interface et-1/0/2 on cr1-eqiad (towards lsw1-e1-eqiad) to test connectivity.
* 07:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 19:09 urbanecm@deploy1002: Synchronized logos/config.yaml: {{Gerrit|335cbee}}: kywiki: update logo (3/3; [[phab:T300241|T300241]]) (duration: 00m 49s)
* 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:22 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 19:08 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|335cbee}}: kywiki: update logo (2/3; [[phab:T300241|T300241]]) (duration: 00m 53s)
* 07:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
* 19:07 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|335cbee}}: kywiki: update logo (1/3; [[phab:T300241|T300241]]) (duration: 00m 50s)
* 01:53 ejegg: fundraising python tools upgraded from {{Gerrit|759d4c89}} to {{Gerrit|2ca83336}}
* 19:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19992 and previous config saved to /var/cache/conftool/dbconfig/20220202-190414-marostegui.json
* 01:22 cstone: civicrm upgraded from {{Gerrit|3819d6d1}} to {{Gerrit|bcc8fccc}}
* 18:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19991 and previous config saved to /var/cache/conftool/dbconfig/20220202-184909-marostegui.json
* 18:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19990 and previous config saved to /var/cache/conftool/dbconfig/20220202-183404-marostegui.json
* 18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19989 and previous config saved to /var/cache/conftool/dbconfig/20220202-183034-marostegui.json
* 18:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 18:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19988 and previous config saved to /var/cache/conftool/dbconfig/20220202-183027-marostegui.json
* 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 18:16 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/filerepo/file/ForeignAPIFile.php: Backport: [[gerrit:758925{{!}}Revert "Support audio on filepage in InstantCommons" (T300751)]] (duration: 00m 51s)
* 18:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19987 and previous config saved to /var/cache/conftool/dbconfig/20220202-181522-marostegui.json
* 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19986 and previous config saved to /var/cache/conftool/dbconfig/20220202-180018-marostegui.json
* 17:45 cwhite: end logstash upgrade (codfw) [[phab:T299168|T299168]]
* 17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19985 and previous config saved to /var/cache/conftool/dbconfig/20220202-174513-marostegui.json
* 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19984 and previous config saved to /var/cache/conftool/dbconfig/20220202-174138-marostegui.json
* 17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19983 and previous config saved to /var/cache/conftool/dbconfig/20220202-174125-marostegui.json
* 17:32 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 17:26 cwhite: begin logstash upgrade (codfw) [[phab:T299168|T299168]]
* 17:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19982 and previous config saved to /var/cache/conftool/dbconfig/20220202-172620-marostegui.json
* 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19981 and previous config saved to /var/cache/conftool/dbconfig/20220202-171115-marostegui.json
* 16:59 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
* 16:59 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 16:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19979 and previous config saved to /var/cache/conftool/dbconfig/20220202-165611-marostegui.json
* 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
* 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
* 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
* 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2012.codfw.wmnet
* 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2011.codfw.wmnet
* 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2010.codfw.wmnet
* 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
* 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
* 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
* 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2012.codfw.wmnet
* 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2011.codfw.wmnet
* 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2010.codfw.wmnet
* 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2005.codfw.wmnet
* 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2006.codfw.wmnet
* 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2007.codfw.wmnet
* 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2008.codfw.wmnet
* 16:41 Emperor: standardising nginx weights for codfw swift proxies to match eqiad ones [[phab:T300738|T300738]]
* 16:41 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2009.codfw.wmnet
* 16:41 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2009.codfw.wmnet
* 16:39 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
* 16:38 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
* 16:30 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:27 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
* 16:26 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
* 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19977 and previous config saved to /var/cache/conftool/dbconfig/20220202-162435-marostegui.json
* 16:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 16:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19976 and previous config saved to /var/cache/conftool/dbconfig/20220202-162428-marostegui.json
* 16:24 jbond: disable ldap email checks on mx2001
* 16:19 Emperor: rolling restart of swift frontends to bring new ones into service [[phab:T300738|T300738]]
* 16:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19975 and previous config saved to /var/cache/conftool/dbconfig/20220202-160923-marostegui.json
* 15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19974 and previous config saved to /var/cache/conftool/dbconfig/20220202-155418-marostegui.json
* 15:45 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
* 15:44 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:43 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
* 15:43 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:41 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
* 15:41 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19973 and previous config saved to /var/cache/conftool/dbconfig/20220202-153913-marostegui.json
* 15:37 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
* 15:37 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:35 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
* 15:35 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
* 15:34 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
* 15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19972 and previous config saved to /var/cache/conftool/dbconfig/20220202-153206-marostegui.json
* 15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
* 15:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 15:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
* 15:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 15:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
* 15:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 15:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19970 and previous config saved to /var/cache/conftool/dbconfig/20220202-152552-marostegui.json
* 15:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
* 15:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
* 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19969 and previous config saved to /var/cache/conftool/dbconfig/20220202-151047-marostegui.json
* 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19968 and previous config saved to /var/cache/conftool/dbconfig/20220202-150832-root.json
* 15:00 XioNoX: esams: push Capirca generated loopback filters
* 14:59 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19967 and previous config saved to /var/cache/conftool/dbconfig/20220202-145542-marostegui.json
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19966 and previous config saved to /var/cache/conftool/dbconfig/20220202-145329-root.json
* 14:47 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:44 XioNoX: codfw: push Capirca generated loopback filters
* 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19965 and previous config saved to /var/cache/conftool/dbconfig/20220202-144038-marostegui.json
* 14:39 jayme@cumin1001: START - Cookbook sre.dns.netbox
* 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19963 and previous config saved to /var/cache/conftool/dbconfig/20220202-143825-root.json
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19962 and previous config saved to /var/cache/conftool/dbconfig/20220202-143221-marostegui.json
* 14:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 14:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19961 and previous config saved to /var/cache/conftool/dbconfig/20220202-143214-marostegui.json
* 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19960 and previous config saved to /var/cache/conftool/dbconfig/20220202-142321-root.json
* 14:21 XioNoX: eqsin: push Capirca generated loopback filters
* 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19959 and previous config saved to /var/cache/conftool/dbconfig/20220202-141709-marostegui.json
* 14:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:15 XioNoX: cr2-eqdfw: push Capirca generated loopback filters
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es1020 - as it is the master', diff saved to https://phabricator.wikimedia.org/P19958 and previous config saved to /var/cache/conftool/dbconfig/20220202-141455-marostegui.json
* 14:13 vgutierrez: pool cp1087 running envoy as TLS terminator - [[phab:T271421|T271421]]
* 14:09 XioNoX: cr2-eqord: push Capirca generated loopback filters
* 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19957 and previous config saved to /var/cache/conftool/dbconfig/20220202-140818-root.json
* 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179 schema change', diff saved to https://phabricator.wikimedia.org/P19956 and previous config saved to /var/cache/conftool/dbconfig/20220202-140317-marostegui.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19955 and previous config saved to /var/cache/conftool/dbconfig/20220202-140239-root.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19954 and previous config saved to /var/cache/conftool/dbconfig/20220202-140204-marostegui.json
* 13:50 elukey: move docker on ml-serve-ctrl* nodes from device mapper to overlay2
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19953 and previous config saved to /var/cache/conftool/dbconfig/20220202-134735-root.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19952 and previous config saved to /var/cache/conftool/dbconfig/20220202-134659-marostegui.json
* 13:40 XioNoX: ULSFO routers: push Capirca generated loopback filters
* 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19951 and previous config saved to /var/cache/conftool/dbconfig/20220202-133713-marostegui.json
* 13:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 13:35 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
* 13:34 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
* 13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on production
* 13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
* 13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
* 13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
* 13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on production
* 13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on canary
* 13:32 ottomata: roll restarting eventgate-main to pick up stream-configs for rdf-streaming-updater.reconcile
* 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19949 and previous config saved to /var/cache/conftool/dbconfig/20220202-133231-root.json
* 13:31 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
* 13:31 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
* 13:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 13:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 13:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
* 13:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 13:29 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
* 13:28 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on canary
* 13:28 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync on production
* 13:25 XioNoX: rename cr3-ulsfo loopback terms in preparation of move to Capirca
* 13:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 13:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19947 and previous config saved to /var/cache/conftool/dbconfig/20220202-132510-marostegui.json
* 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19946 and previous config saved to /var/cache/conftool/dbconfig/20220202-131728-root.json
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19945 and previous config saved to /var/cache/conftool/dbconfig/20220202-131006-marostegui.json
* 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19944 and previous config saved to /var/cache/conftool/dbconfig/20220202-130224-root.json
* 12:59 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:670224{{!}}ULS: Remove unused ULSEventLogging variable (T275894)]] (duration: 00m 49s)
* 12:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19942 and previous config saved to /var/cache/conftool/dbconfig/20220202-125500-marostegui.json
* 12:54 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:570625{{!}}Clean-up decommisioned Print schema configs (T196159)]] (duration: 00m 50s)
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19941 and previous config saved to /var/cache/conftool/dbconfig/20220202-125034-root.json
* 12:43 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1087.eqiad.wmnet with OS buster
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19940 and previous config saved to /var/cache/conftool/dbconfig/20220202-124122-marostegui.json
* 12:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 12:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19939 and previous config saved to /var/cache/conftool/dbconfig/20220202-124115-marostegui.json
* 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19938 and previous config saved to /var/cache/conftool/dbconfig/20220202-123956-marostegui.json
* 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19937 and previous config saved to /var/cache/conftool/dbconfig/20220202-123531-root.json
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19936 and previous config saved to /var/cache/conftool/dbconfig/20220202-123127-marostegui.json
* 12:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
* 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19934 and previous config saved to /var/cache/conftool/dbconfig/20220202-122610-marostegui.json
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19933 and previous config saved to /var/cache/conftool/dbconfig/20220202-122112-marostegui.json
* 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
* 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 65%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19932 and previous config saved to /var/cache/conftool/dbconfig/20220202-122027-root.json
* 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:11 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:758443{{!}}prod: READ_NEW for CentralAuth hidden level migration (T289068)]] (duration: 00m 50s)
* 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19930 and previous config saved to /var/cache/conftool/dbconfig/20220202-121105-marostegui.json
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19929 and previous config saved to /var/cache/conftool/dbconfig/20220202-120608-marostegui.json
* 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19928 and previous config saved to /var/cache/conftool/dbconfig/20220202-120524-root.json
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19927 and previous config saved to /var/cache/conftool/dbconfig/20220202-115601-marostegui.json
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19926 and previous config saved to /var/cache/conftool/dbconfig/20220202-115103-marostegui.json
* 11:50 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp1087.eqiad.wmnet with OS buster
* 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19925 and previous config saved to /var/cache/conftool/dbconfig/20220202-115020-root.json
* 11:48 vgutierrez: depool cp1087 to be reimaged as cache::text_envoy - [[phab:T271421|T271421]]
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19924 and previous config saved to /var/cache/conftool/dbconfig/20220202-114639-marostegui.json
* 11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
* 11:45 _joe_: repooling thanos-fe1001 [[phab:T300119|T300119]]
* 11:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19923 and previous config saved to /var/cache/conftool/dbconfig/20220202-113558-marostegui.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19922 and previous config saved to /var/cache/conftool/dbconfig/20220202-113516-root.json
* 11:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19921 and previous config saved to /var/cache/conftool/dbconfig/20220202-113007-marostegui.json
* 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19920 and previous config saved to /var/cache/conftool/dbconfig/20220202-112849-marostegui.json
* 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:28 _joe_: depooling thanos-fe1001 for testing [[phab:T300119|T300119]]
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 15%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19919 and previous config saved to /var/cache/conftool/dbconfig/20220202-112013-root.json
* 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19918 and previous config saved to /var/cache/conftool/dbconfig/20220202-111804-marostegui.json
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19917 and previous config saved to /var/cache/conftool/dbconfig/20220202-111502-marostegui.json
* 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19916 and previous config saved to /var/cache/conftool/dbconfig/20220202-110509-root.json
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19915 and previous config saved to /var/cache/conftool/dbconfig/20220202-110259-marostegui.json
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19914 and previous config saved to /var/cache/conftool/dbconfig/20220202-105957-marostegui.json
* 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19913 and previous config saved to /var/cache/conftool/dbconfig/20220202-105006-root.json
* 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19912 and previous config saved to /var/cache/conftool/dbconfig/20220202-104755-marostegui.json
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19911 and previous config saved to /var/cache/conftool/dbconfig/20220202-104453-marostegui.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges and recentchanges groups from s4 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P19910 and previous config saved to /var/cache/conftool/dbconfig/20220202-103830-marostegui.json
* 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 2%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19909 and previous config saved to /var/cache/conftool/dbconfig/20220202-103502-root.json
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1021 after reimage', diff saved to https://phabricator.wikimedia.org/P19908 and previous config saved to /var/cache/conftool/dbconfig/20220202-103436-marostegui.json
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 1%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19907 and previous config saved to /var/cache/conftool/dbconfig/20220202-103401-root.json
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19906 and previous config saved to /var/cache/conftool/dbconfig/20220202-103250-marostegui.json
* 10:28 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 ([[phab:T298558|T298558]])', diff saved to https://phabricator.wikimedia.org/P19905 and previous config saved to /var/cache/conftool/dbconfig/20220202-102717-marostegui.json
* 10:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 10:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
* 10:23 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:22 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1008.eqiad.wmnet with OS buster
* 10:21 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:21 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:11 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1021.eqiad.wmnet with OS bullseye
* 10:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
* 10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
* 10:09 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
* 10:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 10:06 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 04s)
* 10:06 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 10:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 10:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1008.eqiad.wmnet with OS buster
* 09:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1019.eqiad.wmnet with OS buster
* 09:40 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bullseye
* 09:39 moritzm: installing apache/apache-modsecurity2 security updates
* 09:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19904 and previous config saved to /var/cache/conftool/dbconfig/20220202-093231-marostegui.json
* 09:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 09:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19903 and previous config saved to /var/cache/conftool/dbconfig/20220202-093223-marostegui.json
* 09:28 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19902 and previous config saved to /var/cache/conftool/dbconfig/20220202-091718-marostegui.json
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS buster
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1021 [[phab:T300127|T300127]]', diff saved to https://phabricator.wikimedia.org/P19901 and previous config saved to /var/cache/conftool/dbconfig/20220202-091355-marostegui.json
* 09:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
* 09:10 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:08 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
* 09:08 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 09:07 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Enable writes on es4 [[phab:T300127|T300127]] (duration: 00m 50s)
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19900 and previous config saved to /var/cache/conftool/dbconfig/20220202-090214-marostegui.json
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1020 to es4 primary and set section read-write [[phab:T300127|T300127]]', diff saved to https://phabricator.wikimedia.org/P19899 and previous config saved to /var/cache/conftool/dbconfig/20220202-090121-marostegui.json
* 09:00 marostegui: Starting es4 eqiad failover from es1021 to es1020 - [[phab:T300127|T300127]]
* 08:52 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
* 08:52 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
* 08:48 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Switchover es4 [[phab:T300127|T300127]]
* 08:48 root@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Switchover es4 [[phab:T300127|T300127]]
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19898 and previous config saved to /var/cache/conftool/dbconfig/20220202-084709-marostegui.json
* 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19897 and previous config saved to /var/cache/conftool/dbconfig/20220202-084150-marostegui.json
* 08:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 08:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19896 and previous config saved to /var/cache/conftool/dbconfig/20220202-084143-marostegui.json
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19895 and previous config saved to /var/cache/conftool/dbconfig/20220202-082638-marostegui.json
* 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19894 and previous config saved to /var/cache/conftool/dbconfig/20220202-081134-marostegui.json
* 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19893 and previous config saved to /var/cache/conftool/dbconfig/20220202-075629-marostegui.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19892 and previous config saved to /var/cache/conftool/dbconfig/20220202-075244-marostegui.json
* 07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 07:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19891 and previous config saved to /var/cache/conftool/dbconfig/20220202-075236-marostegui.json
* 07:51 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1) (duration: 04m 09s)
* 07:47 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1)
* 07:46 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 07:45 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 07:44 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (duration: 02m 19s)
* 07:42 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Set es1020 with weight 10 [[phab:T300127|T300127]]', diff saved to https://phabricator.wikimedia.org/P19890 and previous config saved to /var/cache/conftool/dbconfig/20220202-073918-root.json
* 07:38 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switchover es4 [[phab:T300127|T300127]]
* 07:38 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switchover es4 [[phab:T300127|T300127]]
* 07:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19889 and previous config saved to /var/cache/conftool/dbconfig/20220202-073731-marostegui.json
* 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:36 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Disable writes on es4 [[phab:T300127|T300127]] (duration: 00m 50s)
* 07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:30 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Disable writes on es4 [[phab:T300127|T300127]] (duration: 00m 51s)
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19888 and previous config saved to /var/cache/conftool/dbconfig/20220202-072227-marostegui.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19887 and previous config saved to /var/cache/conftool/dbconfig/20220202-070722-marostegui.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T300402|T300402]])', diff saved to https://phabricator.wikimedia.org/P19886 and previous config saved to /var/cache/conftool/dbconfig/20220202-070012-marostegui.json
* 07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 06:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 06:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
* 02:54 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS buster
* 02:19 ejegg: updated CiviCRM from {{Gerrit|0513f1b7}} to {{Gerrit|3d379e25}}
* 01:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS buster
* 01:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS buster
* 01:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
* 01:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
* 01:12 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
* 01:12 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
* 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:03 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:753788{{!}}rdf-streaming-updater: add the reconciliation stream (T279541)]] (duration: 00m 49s)
* 00:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
* 00:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:51 urbanecm: UTC late B&C window completed
* 00:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b560843182f2f8dc0b189cd80f021b60749c5c90}}: Add wgUploadNavigationUrl upload page of ptwikinews ([[phab:T300466|T300466]]) (duration: 00m 50s)
* 00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS buster
* 00:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:40 urbanecm@deploy1002: Synchronized docroot/noc/db.php: {{Gerrit|06444c16d29d78256d270564ae25ad887d3a2112}}: Start writing to some wmg* constants ([[phab:T45956|T45956]]; 2/2) (duration: 00m 49s)
* 00:39 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|06444c16d29d78256d270564ae25ad887d3a2112}}: Start writing to some wmg* constants ([[phab:T45956|T45956]]; 1/2) (duration: 00m 49s)
* 00:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:29 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b2c13c64029cba3cd34f0e6144d322508fb4afb4}}: Enable migration mode on all group 0, group 1 and desktop-improvement wikis ([[phab:T299927|T299927]]) (duration: 01m 58s)
* 00:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS buster
* 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn


== 2022-02-01 ==
== 2023-06-01 ==
* 22:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
* 21:06 samtar@deploy1002: Finished scap: Backport for [[gerrit:925858{{!}}Remove deleted config wgVectorStickyHeaderEdit (T337955)]] (duration: 08m 30s)
* 22:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2002-dev.codfw.wmnet with OS bullseye
* 20:59 samtar@deploy1002: esanders and samtar: Backport for [[gerrit:925858{{!}}Remove deleted config wgVectorStickyHeaderEdit (T337955)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
* 20:57 samtar@deploy1002: Started scap: Backport for [[gerrit:925858{{!}}Remove deleted config wgVectorStickyHeaderEdit (T337955)]]
* 22:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ml-serve2005.codfw.wmnet with OS buster
* 20:54 samtar@deploy1002: Finished scap: Backport for [[gerrit:925792{{!}}Remove config and AB test code for edit buttons in sticky header (T337955)]] (duration: 10m 29s)
* 22:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
* 20:45 samtar@deploy1002: samtar and ksarabia: Backport for [[gerrit:925792{{!}}Remove config and AB test code for
* 21:55 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2002-dev.codfw.wmnet with OS bullseye
* 21:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:14 Lucas_WMDE: Deployed patch for [[phab:T297754|T297754]]
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:42 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.17 (duration: 01m 35s)
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:40 mwdebug-deploy@deploy1002: helmfile [


==Archives==
==Archives ==
See [[Server Admin Log/Archives]].
See [[Server Admin Log/Archives]].
<noinclude>
<noinclude>

Latest revision as of 21:50, 9 June 2023

2023-06-09

  • 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
  • 21:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
  • 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
  • 20:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
  • 20:38 btullis@cumin1001: END (ERROR) - Cookbook sre.aqs.roll-restart-reboot (exit_code=97) rolling restart_daemons on A:aqs
  • 20:23 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs
  • 17:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
  • 17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1016.eqiad.wmnet with OS buster
  • 17:34 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49398 and previous config saved to /var/cache/conftool/dbconfig/20230609-173202-ladsgroup.json
  • 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49397 and previous config saved to /var/cache/conftool/dbconfig/20230609-171656-ladsgroup.json
  • 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P49396 and previous config saved to /var/cache/conftool/dbconfig/20230609-170150-ladsgroup.json
  • 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS buster
  • 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49395 and previous config saved to /var/cache/conftool/dbconfig/20230609-164644-ladsgroup.json
  • 16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T336886)', diff saved to https://phabricator.wikimedia.org/P49394 and previous config saved to /var/cache/conftool/dbconfig/20230609-163007-ladsgroup.json
  • 16:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 16:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49393 and previous config saved to /var/cache/conftool/dbconfig/20230609-162946-ladsgroup.json
  • 16:20 urandom: powercycling restbase1028
  • 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49392 and previous config saved to /var/cache/conftool/dbconfig/20230609-161440-ladsgroup.json
  • 16:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['snapshot1016']
  • 16:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['snapshot1016']
  • 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P49391 and previous config saved to /var/cache/conftool/dbconfig/20230609-155934-ladsgroup.json
  • 15:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49390 and previous config saved to /var/cache/conftool/dbconfig/20230609-154428-ladsgroup.json
  • 15:30 andrewbogott: wikitech-static: deleted everything in /srv/mediawiki/images/wikitech/archive for T338520
  • 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T336886)', diff saved to https://phabricator.wikimedia.org/P49388 and previous config saved to /var/cache/conftool/dbconfig/20230609-152845-ladsgroup.json
  • 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49387 and previous config saved to /var/cache/conftool/dbconfig/20230609-152824-ladsgroup.json
  • 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1017.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host snapshot1016.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
  • 15:22 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for snapshot101[6-7] - pt1979@cumin2002"
  • 15:17 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49386 and previous config saved to /var/cache/conftool/dbconfig/20230609-151318-ladsgroup.json
  • 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P49385 and previous config saved to /var/cache/conftool/dbconfig/20230609-145812-ladsgroup.json
  • 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49384 and previous config saved to /var/cache/conftool/dbconfig/20230609-144305-ladsgroup.json
  • 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T336886)', diff saved to https://phabricator.wikimedia.org/P49383 and previous config saved to /var/cache/conftool/dbconfig/20230609-142731-ladsgroup.json
  • 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 14:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 14:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49382 and previous config saved to /var/cache/conftool/dbconfig/20230609-142655-ladsgroup.json
  • 14:14 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet
  • 14:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49381 and previous config saved to /var/cache/conftool/dbconfig/20230609-141149-ladsgroup.json
  • 13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P49380 and previous config saved to /var/cache/conftool/dbconfig/20230609-135643-ladsgroup.json
  • 13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49379 and previous config saved to /var/cache/conftool/dbconfig/20230609-134137-ladsgroup.json
  • 13:29 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
  • 13:29 sukhe: start pybal on lvs2013
  • 13:28 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cp4037.ulsfo.wmnet with reason: Working on vk
  • 13:25 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet
  • 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49378 and previous config saved to /var/cache/conftool/dbconfig/20230609-132541-ladsgroup.json
  • 13:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 13:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49377 and previous config saved to /var/cache/conftool/dbconfig/20230609-132520-ladsgroup.json
  • 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49376 and previous config saved to /var/cache/conftool/dbconfig/20230609-131014-ladsgroup.json
  • 13:07 sukhe: stop pybal on lvs2013 to test lvs2014
  • 13:02 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs2014
  • 13:02 sukhe: sudo cumin 'A:lvs and A:codfw' 'enable-puppet "CR 928818"'
  • 13:01 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
  • 12:59 sukhe: sudo cumin 'A:lvs and A:codfw' 'disable-puppet "CR 928818"'
  • 12:57 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
  • 12:57 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
  • 12:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2014
  • 12:55 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2014
  • 12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P49373 and previous config saved to /var/cache/conftool/dbconfig/20230609-125508-ladsgroup.json
  • 12:50 krinkle@deploy1002: Finished scap: I385d28 (duration: 06m 59s)
  • 12:43 krinkle@deploy1002: Started scap: I385d28
  • 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49371 and previous config saved to /var/cache/conftool/dbconfig/20230609-124002-ladsgroup.json
  • 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:30 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
  • 12:29 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Re-add DNS for cloud-hosts-codfw vlan. - cmooney@cumin1001"
  • 12:27 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T336886)', diff saved to https://phabricator.wikimedia.org/P49370 and previous config saved to /var/cache/conftool/dbconfig/20230609-122303-ladsgroup.json
  • 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 12:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T336886)', diff saved to https://phabricator.wikimedia.org/P49369 and previous config saved to /var/cache/conftool/dbconfig/20230609-122243-ladsgroup.json
  • 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:16 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
  • 12:15 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2003-dev - aborrero@cumin2002"
  • 12:13 aborrero@cumin2002: START - Cookbook sre.dns.netbox
  • 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49368 and previous config saved to /var/cache/conftool/dbconfig/20230609-120737-ladsgroup.json
  • 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 778 hosts
  • 11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P49367 and previous config saved to /var/cache/conftool/dbconfig/20230609-115230-ladsgroup.json
  • 11:52 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 778 hosts
  • 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Fsero out of all services on: 1262 hosts
  • 11:49 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Fsero out of all services on: 1262 hosts
  • 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T336886)', diff saved to https://phabricator.wikimedia.org/P49366 and previous config saved to /var/cache/conftool/dbconfig/20230609-113724-ladsgroup.json
  • 11:27 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T336886)', diff saved to https://phabricator.wikimedia.org/P49365 and previous config saved to /var/cache/conftool/dbconfig/20230609-112250-ladsgroup.json
  • 11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T336886)', diff saved to https://phabricator.wikimedia.org/P49364 and previous config saved to /var/cache/conftool/dbconfig/20230609-112229-ladsgroup.json
  • 11:20 sukhe: pcc-db1001: sudo systemctl start pcc_facts_processor.service
  • 11:14 sukhe: sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080
  • 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49363 and previous config saved to /var/cache/conftool/dbconfig/20230609-110723-ladsgroup.json
  • 11:02 sukhe: homer "cr*-codfw*" commit "Gerrit: 928113 add new LVS host lvs2014
  • 10:53 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2014.codfw.wmnet with OS bullseye
  • 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P49362 and previous config saved to /var/cache/conftool/dbconfig/20230609-105217-ladsgroup.json
  • 10:40 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
  • 10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T336886)', diff saved to https://phabricator.wikimedia.org/P49361 and previous config saved to /var/cache/conftool/dbconfig/20230609-103711-ladsgroup.json
  • 10:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
  • 10:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T336886)', diff saved to https://phabricator.wikimedia.org/P49360 and previous config saved to /var/cache/conftool/dbconfig/20230609-102217-ladsgroup.json
  • 10:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T336886)', diff saved to https://phabricator.wikimedia.org/P49359 and previous config saved to /var/cache/conftool/dbconfig/20230609-102156-ladsgroup.json
  • 10:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
  • 10:12 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
  • 10:12 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop: sync
  • 10:09 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
  • 10:08 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
  • 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49358 and previous config saved to /var/cache/conftool/dbconfig/20230609-100650-ladsgroup.json
  • 09:57 elukey: increase {eqiad,codfw}.change-prop.transcludes.resource-change topic partitions (3->5) on kafka main clusters - T338357
  • 09:56 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 09:54 moritzm: installing jupyter-core security updates on bullseye
  • 09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P49357 and previous config saved to /var/cache/conftool/dbconfig/20230609-095144-ladsgroup.json
  • 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T336886)', diff saved to https://phabricator.wikimedia.org/P49356 and previous config saved to /var/cache/conftool/dbconfig/20230609-093638-ladsgroup.json
  • 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T336886)', diff saved to https://phabricator.wikimedia.org/P49355 and previous config saved to /var/cache/conftool/dbconfig/20230609-092141-ladsgroup.json
  • 09:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 09:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 09:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T336886)', diff saved to https://phabricator.wikimedia.org/P49354 and previous config saved to /var/cache/conftool/dbconfig/20230609-090829-ladsgroup.json
  • 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49353 and previous config saved to /var/cache/conftool/dbconfig/20230609-085322-ladsgroup.json
  • 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P49352 and previous config saved to /var/cache/conftool/dbconfig/20230609-083816-ladsgroup.json
  • 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T336886)', diff saved to https://phabricator.wikimedia.org/P49351 and previous config saved to /var/cache/conftool/dbconfig/20230609-082310-ladsgroup.json
  • 08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T336886)', diff saved to https://phabricator.wikimedia.org/P49350 and previous config saved to /var/cache/conftool/dbconfig/20230609-080708-ladsgroup.json
  • 08:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T336886)', diff saved to https://phabricator.wikimedia.org/P49349 and previous config saved to /var/cache/conftool/dbconfig/20230609-080637-ladsgroup.json
  • 07:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49348 and previous config saved to /var/cache/conftool/dbconfig/20230609-075130-ladsgroup.json
  • 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P49347 and previous config saved to /var/cache/conftool/dbconfig/20230609-073624-ladsgroup.json
  • 07:33 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1492.eqiad.wmnet
  • 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T336886)', diff saved to https://phabricator.wikimedia.org/P49346 and previous config saved to /var/cache/conftool/dbconfig/20230609-072118-ladsgroup.json
  • 07:19 moritzm: powercycling restbase2018 (kernel hung following what looks like I/O errors)
  • 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T336886)', diff saved to https://phabricator.wikimedia.org/P49345 and previous config saved to /var/cache/conftool/dbconfig/20230609-070520-ladsgroup.json
  • 07:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 07:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T336886)', diff saved to https://phabricator.wikimedia.org/P49344 and previous config saved to /var/cache/conftool/dbconfig/20230609-070459-ladsgroup.json
  • 06:50 moritzm: installing wireshark security updates
  • 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49343 and previous config saved to /var/cache/conftool/dbconfig/20230609-064953-ladsgroup.json
  • 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster2005.codfw.wmnet
  • 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster2005.codfw.wmnet
  • 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: puppetmaster1005.eqiad.wmnet
  • 06:49 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: puppetmaster1005.eqiad.wmnet
  • 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: prometheus3001.esams.wmnet
  • 06:48 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: prometheus3001.esams.wmnet
  • 06:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
  • 06:44 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
  • 06:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P49342 and previous config saved to /var/cache/conftool/dbconfig/20230609-063447-ladsgroup.json
  • 06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T336886)', diff saved to https://phabricator.wikimedia.org/P49341 and previous config saved to /var/cache/conftool/dbconfig/20230609-061941-ladsgroup.json
  • 06:06 eileen: config 97c57848 -> 6f4a9d19 restart jobs
  • 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T336886)', diff saved to https://phabricator.wikimedia.org/P49340 and previous config saved to /var/cache/conftool/dbconfig/20230609-060438-ladsgroup.json
  • 06:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 06:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 05:53 eileen: civicrm upgraded from 158896cc to 5bbed553
  • 05:52 eileen: config revision changed from 8b71fa7a to 97c57848
  • 05:50 moritzm: installing cpio security updates
  • 05:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 05:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 05:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 05:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T336886)', diff saved to https://phabricator.wikimedia.org/P49339 and previous config saved to /var/cache/conftool/dbconfig/20230609-052315-ladsgroup.json
  • 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49338 and previous config saved to /var/cache/conftool/dbconfig/20230609-050809-ladsgroup.json
  • 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P49337 and previous config saved to /var/cache/conftool/dbconfig/20230609-045302-ladsgroup.json
  • 04:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T336886)', diff saved to https://phabricator.wikimedia.org/P49336 and previous config saved to /var/cache/conftool/dbconfig/20230609-043756-ladsgroup.json
  • 04:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1219 (T336886)', diff saved to https://phabricator.wikimedia.org/P49335 and previous config saved to /var/cache/conftool/dbconfig/20230609-042306-ladsgroup.json
  • 04:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 04:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 04:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T336886)', diff saved to https://phabricator.wikimedia.org/P49334 and previous config saved to /var/cache/conftool/dbconfig/20230609-042246-ladsgroup.json
  • 04:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49333 and previous config saved to /var/cache/conftool/dbconfig/20230609-040739-ladsgroup.json
  • 03:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P49332 and previous config saved to /var/cache/conftool/dbconfig/20230609-035233-ladsgroup.json
  • 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T336886)', diff saved to https://phabricator.wikimedia.org/P49331 and previous config saved to /var/cache/conftool/dbconfig/20230609-033727-ladsgroup.json
  • 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1218 (T336886)', diff saved to https://phabricator.wikimedia.org/P49330 and previous config saved to /var/cache/conftool/dbconfig/20230609-032127-ladsgroup.json
  • 03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 03:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T336886)', diff saved to https://phabricator.wikimedia.org/P49329 and previous config saved to /var/cache/conftool/dbconfig/20230609-032106-ladsgroup.json
  • 03:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49328 and previous config saved to /var/cache/conftool/dbconfig/20230609-030600-ladsgroup.json
  • 02:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P49327 and previous config saved to /var/cache/conftool/dbconfig/20230609-025054-ladsgroup.json
  • 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T336886)', diff saved to https://phabricator.wikimedia.org/P49326 and previous config saved to /var/cache/conftool/dbconfig/20230609-023548-ladsgroup.json
  • 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1207 (T336886)', diff saved to https://phabricator.wikimedia.org/P49325 and previous config saved to /var/cache/conftool/dbconfig/20230609-022054-ladsgroup.json
  • 02:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 02:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T336886)', diff saved to https://phabricator.wikimedia.org/P49324 and previous config saved to /var/cache/conftool/dbconfig/20230609-022034-ladsgroup.json
  • 02:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49323 and previous config saved to /var/cache/conftool/dbconfig/20230609-020528-ladsgroup.json
  • 02:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
  • 02:01 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
  • 02:00 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P49322 and previous config saved to /var/cache/conftool/dbconfig/20230609-015021-ladsgroup.json
  • 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
  • 01:48 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
  • 01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T336886)', diff saved to https://phabricator.wikimedia.org/P49321 and previous config saved to /var/cache/conftool/dbconfig/20230609-013515-ladsgroup.json
  • 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS bullseye
  • 01:29 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1206 (T336886)', diff saved to https://phabricator.wikimedia.org/P49320 and previous config saved to /var/cache/conftool/dbconfig/20230609-011945-ladsgroup.json
  • 01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T336886)', diff saved to https://phabricator.wikimedia.org/P49319 and previous config saved to /var/cache/conftool/dbconfig/20230609-011924-ladsgroup.json
  • 01:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49318 and previous config saved to /var/cache/conftool/dbconfig/20230609-010418-ladsgroup.json
  • 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
  • 00:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
  • 00:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
  • 00:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1011.eqiad.wmnet with OS bullseye
  • 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P49317 and previous config saved to /var/cache/conftool/dbconfig/20230609-004912-ladsgroup.json
  • 00:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
  • 00:47 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
  • 00:34 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS bullseye
  • 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T336886)', diff saved to https://phabricator.wikimedia.org/P49316 and previous config saved to /var/cache/conftool/dbconfig/20230609-003406-ladsgroup.json
  • 00:31 eileen: civicrm upgraded from 6f64e77d to 158896cc
  • 00:25 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['pki-root1002']
  • 00:25 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002']
  • 00:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['pki-root1002']
  • 00:24 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pki-root1002']
  • 00:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 (T336886)', diff saved to https://phabricator.wikimedia.org/P49315 and previous config saved to /var/cache/conftool/dbconfig/20230609-001821-ladsgroup.json
  • 00:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 00:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 00:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 00:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T336886)', diff saved to https://phabricator.wikimedia.org/P49314 and previous config saved to /var/cache/conftool/dbconfig/20230609-001732-ladsgroup.json
  • 00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49313 and previous config saved to /var/cache/conftool/dbconfig/20230609-000226-ladsgroup.json

2023-06-08

  • 23:55 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1011.eqiad.wmnet with OS bullseye
  • 23:54 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup1010.eqiad.wmnet with OS bullseye
  • 23:54 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
  • 23:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host backup1010.eqiad.wmnet with OS bullseye
  • 23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P49312 and previous config saved to /var/cache/conftool/dbconfig/20230608-234720-ladsgroup.json
  • 23:42 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:41 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for pki-root - pt1979@cumin2002"
  • 23:40 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for pki-root - pt1979@cumin2002"
  • 23:38 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T336886)', diff saved to https://phabricator.wikimedia.org/P49311 and previous config saved to /var/cache/conftool/dbconfig/20230608-233214-ladsgroup.json
  • 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T336886)', diff saved to https://phabricator.wikimedia.org/P49310 and previous config saved to /var/cache/conftool/dbconfig/20230608-231650-ladsgroup.json
  • 23:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 23:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T336886)', diff saved to https://phabricator.wikimedia.org/P49309 and previous config saved to /var/cache/conftool/dbconfig/20230608-231629-ladsgroup.json
  • 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P49308 and previous config saved to /var/cache/conftool/dbconfig/20230608-230123-ladsgroup.json
  • 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P49307 and previous config saved to /var/cache/conftool/dbconfig/20230608-224617-ladsgroup.json
  • 22:39 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on gerrit1001.wikimedia.org with reason: decom
  • 22:39 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on gerrit1001.wikimedia.org with reason: decom
  • 22:37 mutante: gerrit1001 - rmdir /etc/ssh/userkeys/gerrit.d which leads to puppet warnings because it cant remove empty dir
  • 22:35 mutante: removing gerrit role from former gerrit prod machine gerrit1001, removes firewall rules, shell access, monitoring..etc
  • 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T336886)', diff saved to https://phabricator.wikimedia.org/P49306 and previous config saved to /var/cache/conftool/dbconfig/20230608-223111-ladsgroup.json
  • 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T336886)', diff saved to https://phabricator.wikimedia.org/P49305 and previous config saved to /var/cache/conftool/dbconfig/20230608-221536-ladsgroup.json
  • 22:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 22:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 22:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T336886)', diff saved to https://phabricator.wikimedia.org/P49304 and previous config saved to /var/cache/conftool/dbconfig/20230608-221515-ladsgroup.json
  • 22:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P49303 and previous config saved to /var/cache/conftool/dbconfig/20230608-220009-ladsgroup.json
  • 21:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P49302 and previous config saved to /var/cache/conftool/dbconfig/20230608-214503-ladsgroup.json
  • 21:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T336886)', diff saved to https://phabricator.wikimedia.org/P49301 and previous config saved to /var/cache/conftool/dbconfig/20230608-212957-ladsgroup.json
  • 21:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T336886)', diff saved to https://phabricator.wikimedia.org/P49300 and previous config saved to /var/cache/conftool/dbconfig/20230608-211419-ladsgroup.json
  • 21:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 21:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 21:08 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup1011.eqiad.wmnet']
  • 21:07 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1011.eqiad.wmnet']
  • 21:07 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup1011.eqiad.wmnet']
  • 21:07 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup1010.eqiad.wmnet']
  • 21:07 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1010.eqiad.wmnet']
  • 21:06 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup1010.eqiad.wmnet']
  • 21:06 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1011.eqiad.wmnet']
  • 21:05 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup1010.eqiad.wmnet']
  • 21:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 21:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T336886)', diff saved to https://phabricator.wikimedia.org/P49298 and previous config saved to /var/cache/conftool/dbconfig/20230608-204722-ladsgroup.json
  • 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P49297 and previous config saved to /var/cache/conftool/dbconfig/20230608-203216-ladsgroup.json
  • 20:31 ladsgroup@deploy1002: Finished scap: Backport for Externallinks: Make port part of the index (T337149) (duration: 10m 10s)
  • 20:22 ladsgroup@deploy1002: ladsgroup: Backport for Externallinks: Make port part of the index (T337149) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 20:21 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1028.eqiad.wmnet with OS bullseye
  • 20:20 ladsgroup@deploy1002: Started scap: Backport for Externallinks: Make port part of the index (T337149)
  • 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P49296 and previous config saved to /var/cache/conftool/dbconfig/20230608-201710-ladsgroup.json
  • 20:12 ladsgroup@deploy1002: Finished scap: Backport for Remove VectorLimitedWidthIndicator (T336197) (duration: 07m 32s)
  • 20:06 ladsgroup@deploy1002: ladsgroup and ksarabia: Backport for Remove VectorLimitedWidthIndicator (T336197) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:05 ladsgroup@deploy1002: Started scap: Backport for Remove VectorLimitedWidthIndicator (T336197)
  • 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T336886)', diff saved to https://phabricator.wikimedia.org/P49295 and previous config saved to /var/cache/conftool/dbconfig/20230608-200204-ladsgroup.json
  • 20:01 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1028.eqiad.wmnet with reason: host reimage
  • 19:54 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1028.eqiad.wmnet with reason: host reimage
  • 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T336886)', diff saved to https://phabricator.wikimedia.org/P49294 and previous config saved to /var/cache/conftool/dbconfig/20230608-194555-ladsgroup.json
  • 19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 19:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T336886)', diff saved to https://phabricator.wikimedia.org/P49293 and previous config saved to /var/cache/conftool/dbconfig/20230608-194534-ladsgroup.json
  • 19:40 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1028.eqiad.wmnet with OS bullseye
  • 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P49292 and previous config saved to /var/cache/conftool/dbconfig/20230608-193028-ladsgroup.json
  • 19:22 jclark@cumin1001: START - Cookbook sre.hosts.provision for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P49291 and previous config saved to /var/cache/conftool/dbconfig/20230608-191522-ladsgroup.json
  • 19:08 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup1011.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T336886)', diff saved to https://phabricator.wikimedia.org/P49290 and previous config saved to /var/cache/conftool/dbconfig/20230608-190016-ladsgroup.json
  • 18:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 (T336886)', diff saved to https://phabricator.wikimedia.org/P49289 and previous config saved to /var/cache/conftool/dbconfig/20230608-184312-ladsgroup.json
  • 18:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 18:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49288 and previous config saved to /var/cache/conftool/dbconfig/20230608-184251-ladsgroup.json
  • 18:36 jclark@cumin1001: START - Cookbook sre.hosts.provision for host backup1011.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:36 jclark@cumin1001: START - Cookbook sre.hosts.provision for host backup1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P49287 and previous config saved to /var/cache/conftool/dbconfig/20230608-182745-ladsgroup.json
  • 18:24 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool sessionstore in eqiad: maintenance
  • 18:19 eevans@cumin1001: START - Cookbook sre.discovery.service-route pool sessionstore in eqiad: maintenance
  • 18:18 urandom: (Re)pooling sessionstore/eqiad — T337426
  • 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P49286 and previous config saved to /var/cache/conftool/dbconfig/20230608-181238-ladsgroup.json
  • 18:09 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.12 refs T337526
  • 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49285 and previous config saved to /var/cache/conftool/dbconfig/20230608-175732-ladsgroup.json
  • 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49284 and previous config saved to /var/cache/conftool/dbconfig/20230608-174135-ladsgroup.json
  • 17:41 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 17:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:36 stevemunene@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 17:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:31 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:31 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:30 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:30 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:28 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 17:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T336886)', diff saved to https://phabricator.wikimedia.org/P49283 and previous config saved to /var/cache/conftool/dbconfig/20230608-172746-ladsgroup.json
  • 17:24 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:14 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P49282 and previous config saved to /var/cache/conftool/dbconfig/20230608-171240-ladsgroup.json
  • 17:10 stevemunene@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 17:05 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetmaster1006.eqiad.wmnet with OS bullseye
  • 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 16:58 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 16:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P49281 and previous config saved to /var/cache/conftool/dbconfig/20230608-165734-ladsgroup.json
  • 16:56 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:aqs
  • 16:46 urandom: Starting traffic test against sessionstore.svc.eqiad.wmnet — T337426
  • 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetmaster1006.eqiad.wmnet with reason: host reimage
  • 16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T336886)', diff saved to https://phabricator.wikimedia.org/P49280 and previous config saved to /var/cache/conftool/dbconfig/20230608-164228-ladsgroup.json
  • 16:41 urandom: Upgrading Cassandra to 4.1.1, sessionstore1003 — T337426
  • 16:39 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetmaster1006.eqiad.wmnet with reason: host reimage
  • 16:38 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetmaster1006.eqiad.wmnet with OS bullseye
  • 16:36 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host puppetmaster1006.eqiad.wmnet with OS bullseye
  • 16:35 urandom: Upgrading Cassandra to 4.1.1, sessionstore1002 — T337426
  • 16:34 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs
  • 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 (T336886)', diff saved to https://phabricator.wikimedia.org/P49279 and previous config saved to /var/cache/conftool/dbconfig/20230608-162650-ladsgroup.json
  • 16:26 urandom: Upgrading Cassandra to 4.1.1, sessionstore1001 — T337426
  • 16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 16:22 urandom: creating pre-upgrade Cassandra snapshots, sessionstore/eqiad — T337426
  • 16:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 16:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 16:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 16:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 16:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 16:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 16:11 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool sessionstore in eqiad: maintenance
  • 16:06 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs2014.codfw.wmnet with OS bullseye
  • 16:06 eevans@cumin1001: START - Cookbook sre.discovery.service-route depool sessionstore in eqiad: maintenance
  • 16:06 urandom: depooling eqiad sessionstore for Cassandra upgrade — T337426
  • 16:00 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 15:58 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
  • 15:58 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs2014.codfw.wmnet with OS bullseye
  • 15:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:23 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
  • 15:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetmaster1006.eqiad.wmnet with OS bullseye
  • 15:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['puppetmaster1006']
  • 15:13 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['puppetmaster1006']
  • 15:09 moritzm: installing c-ares security updates on bullseye
  • 14:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:42 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 14:41 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 14:41 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 14:41 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 14:36 moritzm: installing libwep security updates on buster
  • 14:29 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003']
  • 14:28 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003']
  • 14:28 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 14:28 jclark@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 14:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host puppetmaster1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:19 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:19 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
  • 14:17 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
  • 14:15 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 14:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2014.codfw.wmnet with OS bullseye
  • 14:14 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 14:13 XioNoX: cloudsw2-c8-eqiad> request system zeroize - T338459
  • 14:13 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 14:11 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:10 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:10 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:09 XioNoX: decom cloudsw2-c8-eqiad - T338459
  • 14:08 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:07 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1001"
  • 14:07 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:07 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:06 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 14:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:04 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
  • 14:02 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
  • 14:01 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 14:00 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 13:59 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 13:58 ladsgroup@deploy1002: Finished scap: Backport for Remove svwiktionary, svwiki and dawiki from legacy encoding (T128156 T128152 T128153) (duration: 09m 13s)
  • 13:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
  • 13:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2014.codfw.wmnet with reason: host reimage
  • 13:52 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
  • 13:52 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 13:51 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 13:51 ladsgroup@deploy1002: ladsgroup: Backport for Remove svwiktionary, svwiki and dawiki from legacy encoding (T128156 T128152 T128153) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 13:51 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 13:50 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 13:49 ladsgroup@deploy1002: Started scap: Backport for Remove svwiktionary, svwiki and dawiki from legacy encoding (T128156 T128152 T128153)
  • 13:49 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host puppetmaster1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:48 jclark@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1002.eqiad.wmnet with reason: host reimage
  • 13:44 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:44 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
  • 13:43 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 13:43 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 13:43 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse for new ns-recursor.openstack.codfw1dev.wikimediacloud.org IP. - cmooney@cumin1001"
  • 13:41 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 13:40 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 13:39 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 13:36 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 13:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
  • 13:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:29 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 13:06 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 13:06 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 13:05 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 13:05 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 12:57 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 12:57 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 12:36 cmooney@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937 (duration: 17m 22s)
  • 12:19 topranks: De-pooling lvs1017 to move link to lsw1-e1-eqiad to ssw1-e1-eqiad T322937
  • 12:18 cmooney@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937
  • 12:12 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:11 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 12:03 vgutierrez: restore cp4052 HAProxy configuration - T317799
  • 11:51 vgutierrez: repooling cp4052 - T317799
  • 11:40 vgutierrez: depooling cp4052 for some HAProxy tests - T317799
  • 11:28 Amir1: mwscript maintenance/storage/moveToExternal.php --wiki=nlwiki --iconv DB cluster26 (T128154)
  • 11:03 Amir1: mwscript maintenance/storage/moveToExternal.php --wiki=dawiki --iconv DB cluster27 (T128153)
  • 10:49 Amir1: mwscript maintenance/storage/moveToExternal.php --wiki=svwiki --iconv DB cluster27 (T128153)
  • 10:22 jiji@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:21 jiji@cumin1001: START - Cookbook sre.dns.netbox
  • 09:58 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@bb7526e]: (no justification provided) (duration: 00m 08s)
  • 09:57 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@bb7526e]: (no justification provided)
  • 09:40 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetserver2001.codfw.wmnet with OS bookworm
  • 09:40 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin2002"
  • 09:24 vgutierrez: updated to HAProxy 2.7.9 on cp4052 and cp5032
  • 09:22 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp5032.eqsin.wmnet,cp4052.ulsfo.wmnet} and A:cp
  • 09:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 09:18 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 09:17 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp5032.eqsin.wmnet,cp4052.ulsfo.wmnet} and A:cp
  • 09:10 vgutierrez: fetch HAProxy 2.7.9 for thirdparty/haproxy27 bullseye (apt.wm.o)
  • 08:54 apergos: UTC morning backport and config training window done
  • 08:38 ariel@deploy1002: Finished scap: Backport for [ruwiki] Add an editautoreviewprotected level protecion (T337430) (duration: 08m 25s)
  • 08:31 ariel@deploy1002: ariel and superpes: Backport for [ruwiki] Add an editautoreviewprotected level protecion (T337430) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 08:30 ariel@deploy1002: Started scap: Backport for [ruwiki] Add an editautoreviewprotected level protecion (T337430)
  • 08:25 ariel@deploy1002: Finished scap: Backport for [fiwiki] Limitate the use of the ContentTranslation tool (T337412) (duration: 09m 16s)
  • 08:17 ariel@deploy1002: superpes and ariel: Backport for [fiwiki] Limitate the use of the ContentTranslation tool (T337412) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 08:16 ariel@deploy1002: Started scap: Backport for [fiwiki] Limitate the use of the ContentTranslation tool (T337412)
  • 08:12 ariel@deploy1002: Finished scap: Backport for [itwiktionary] Add a tagline (T337688) (duration: 08m 07s)
  • 08:06 ariel@deploy1002: ariel and superpes: Backport for [itwiktionary] Add a tagline (T337688) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 08:04 ariel@deploy1002: Started scap: Backport for [itwiktionary] Add a tagline (T337688)
  • 07:49 ariel@deploy1002: Finished scap: Backport for [kaawiki] Change the logo with an HD version and the tagline (T337641) (duration: 09m 09s)
  • 07:41 ariel@deploy1002: ariel and superpes: Backport for [kaawiki] Change the logo with an HD version and the tagline (T337641) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 07:40 ariel@deploy1002: Started scap: Backport for [kaawiki] Change the logo with an HD version and the tagline (T337641)
  • 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T336886)', diff saved to https://phabricator.wikimedia.org/P49271 and previous config saved to /var/cache/conftool/dbconfig/20230608-073524-ladsgroup.json
  • 07:27 kartik@deploy1002: Finished scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337834) (duration: 09m 19s)
  • 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P49270 and previous config saved to /var/cache/conftool/dbconfig/20230608-072018-ladsgroup.json
  • 07:19 kartik@deploy1002: kartik: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337834) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 07:17 kartik@deploy1002: Started scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337834)
  • 07:14 elukey: delete pod kask-production-7dfdfc7cbc-2vw5q in wikikube codfw, since it was scheduled on a non dedicated node
  • 07:14 kartik@deploy1002: Finished scap: Backport for Enable Content and Section Translation for 9 Wikipedia (T337290) (duration: 09m 52s)
  • 07:06 kartik@deploy1002: kartik: Backport for Enable Content and Section Translation for 9 Wikipedia (T337290) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P49268 and previous config saved to /var/cache/conftool/dbconfig/20230608-070512-ladsgroup.json
  • 07:04 kartik@deploy1002: Started scap: Backport for Enable Content and Section Translation for 9 Wikipedia (T337290)
  • 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T336886)', diff saved to https://phabricator.wikimedia.org/P49267 and previous config saved to /var/cache/conftool/dbconfig/20230608-065006-ladsgroup.json
  • 06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T336886)', diff saved to https://phabricator.wikimedia.org/P49266 and previous config saved to /var/cache/conftool/dbconfig/20230608-064508-ladsgroup.json
  • 06:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 06:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 06:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T336886)', diff saved to https://phabricator.wikimedia.org/P49265 and previous config saved to /var/cache/conftool/dbconfig/20230608-064447-ladsgroup.json
  • 06:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P49264 and previous config saved to /var/cache/conftool/dbconfig/20230608-062941-ladsgroup.json
  • 06:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P49263 and previous config saved to /var/cache/conftool/dbconfig/20230608-061435-ladsgroup.json
  • 06:10 elukey: kill remaining processes for `andyrussg` on stat100x nodes to unblock puppet
  • 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T336886)', diff saved to https://phabricator.wikimedia.org/P49262 and previous config saved to /var/cache/conftool/dbconfig/20230608-055929-ladsgroup.json
  • 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T336886)', diff saved to https://phabricator.wikimedia.org/P49261 and previous config saved to /var/cache/conftool/dbconfig/20230608-055432-ladsgroup.json
  • 05:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T336886)', diff saved to https://phabricator.wikimedia.org/P49260 and previous config saved to /var/cache/conftool/dbconfig/20230608-055411-ladsgroup.json
  • 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P49259 and previous config saved to /var/cache/conftool/dbconfig/20230608-053904-ladsgroup.json
  • 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P49258 and previous config saved to /var/cache/conftool/dbconfig/20230608-052358-ladsgroup.json
  • 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T336886)', diff saved to https://phabricator.wikimedia.org/P49257 and previous config saved to /var/cache/conftool/dbconfig/20230608-050852-ladsgroup.json
  • 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T336886)', diff saved to https://phabricator.wikimedia.org/P49256 and previous config saved to /var/cache/conftool/dbconfig/20230608-050353-ladsgroup.json
  • 05:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 05:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 05:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 05:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49255 and previous config saved to /var/cache/conftool/dbconfig/20230608-050328-ladsgroup.json
  • 04:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P49254 and previous config saved to /var/cache/conftool/dbconfig/20230608-044821-ladsgroup.json
  • 04:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P49253 and previous config saved to /var/cache/conftool/dbconfig/20230608-043315-ladsgroup.json
  • 04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49252 and previous config saved to /var/cache/conftool/dbconfig/20230608-041809-ladsgroup.json
  • 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49251 and previous config saved to /var/cache/conftool/dbconfig/20230608-041311-ladsgroup.json
  • 04:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 04:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 04:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 04:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49250 and previous config saved to /var/cache/conftool/dbconfig/20230608-040935-ladsgroup.json
  • 03:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P49249 and previous config saved to /var/cache/conftool/dbconfig/20230608-035428-ladsgroup.json
  • 03:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P49248 and previous config saved to /var/cache/conftool/dbconfig/20230608-033922-ladsgroup.json
  • 03:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49247 and previous config saved to /var/cache/conftool/dbconfig/20230608-032416-ladsgroup.json
  • 03:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49246 and previous config saved to /var/cache/conftool/dbconfig/20230608-031911-ladsgroup.json
  • 03:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 03:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 03:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49245 and previous config saved to /var/cache/conftool/dbconfig/20230608-031901-ladsgroup.json
  • 03:11 eileen: civicrm upgraded from 066095b8 to 6f64e77d
  • 03:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P49244 and previous config saved to /var/cache/conftool/dbconfig/20230608-030355-ladsgroup.json
  • 02:54 samtar@deploy1002: Finished scap: Backport for Remove additional v1 suffix when computing internalRestbaseURL (T334842 T338381) (duration: 09m 50s)
  • 02:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P49243 and previous config saved to /var/cache/conftool/dbconfig/20230608-024849-ladsgroup.json
  • 02:46 samtar@deploy1002: samtar: Backport for Remove additional v1 suffix when computing internalRestbaseURL (T334842 T338381) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 02:44 samtar@deploy1002: Started scap: Backport for Remove additional v1 suffix when computing internalRestbaseURL (T334842 T338381)
  • 02:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49242 and previous config saved to /var/cache/conftool/dbconfig/20230608-023343-ladsgroup.json
  • 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49241 and previous config saved to /var/cache/conftool/dbconfig/20230608-022842-ladsgroup.json
  • 02:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 02:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T336886)', diff saved to https://phabricator.wikimedia.org/P49240 and previous config saved to /var/cache/conftool/dbconfig/20230608-022821-ladsgroup.json
  • 02:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P49239 and previous config saved to /var/cache/conftool/dbconfig/20230608-021315-ladsgroup.json
  • 01:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P49238 and previous config saved to /var/cache/conftool/dbconfig/20230608-015809-ladsgroup.json
  • 01:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T336886)', diff saved to https://phabricator.wikimedia.org/P49237 and previous config saved to /var/cache/conftool/dbconfig/20230608-014303-ladsgroup.json
  • 01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T336886)', diff saved to https://phabricator.wikimedia.org/P49236 and previous config saved to /var/cache/conftool/dbconfig/20230608-013808-ladsgroup.json
  • 01:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 01:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 01:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T336886)', diff saved to https://phabricator.wikimedia.org/P49235 and previous config saved to /var/cache/conftool/dbconfig/20230608-013736-ladsgroup.json
  • 01:23 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 01:23 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 01:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P49234 and previous config saved to /var/cache/conftool/dbconfig/20230608-012230-ladsgroup.json
  • 01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49233 and previous config saved to /var/cache/conftool/dbconfig/20230608-010853-ladsgroup.json
  • 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P49232 and previous config saved to /var/cache/conftool/dbconfig/20230608-010724-ladsgroup.json
  • 00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P49231 and previous config saved to /var/cache/conftool/dbconfig/20230608-005347-ladsgroup.json
  • 00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T336886)', diff saved to https://phabricator.wikimedia.org/P49230 and previous config saved to /var/cache/conftool/dbconfig/20230608-005218-ladsgroup.json
  • 00:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T336886)', diff saved to https://phabricator.wikimedia.org/P49229 and previous config saved to /var/cache/conftool/dbconfig/20230608-004713-ladsgroup.json
  • 00:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T336886)', diff saved to https://phabricator.wikimedia.org/P49228 and previous config saved to /var/cache/conftool/dbconfig/20230608-004653-ladsgroup.json
  • 00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P49227 and previous config saved to /var/cache/conftool/dbconfig/20230608-003841-ladsgroup.json
  • 00:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P49226 and previous config saved to /var/cache/conftool/dbconfig/20230608-003146-ladsgroup.json
  • 00:28 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
  • 00:28 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-cluster
  • 00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49225 and previous config saved to /var/cache/conftool/dbconfig/20230608-002335-ladsgroup.json
  • 00:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P49224 and previous config saved to /var/cache/conftool/dbconfig/20230608-001640-ladsgroup.json
  • 00:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49223 and previous config saved to /var/cache/conftool/dbconfig/20230608-001555-ladsgroup.json
  • 00:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 00:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 00:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49222 and previous config saved to /var/cache/conftool/dbconfig/20230608-001534-ladsgroup.json
  • 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T336886)', diff saved to https://phabricator.wikimedia.org/P49221 and previous config saved to /var/cache/conftool/dbconfig/20230608-000134-ladsgroup.json
  • 00:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P49220 and previous config saved to /var/cache/conftool/dbconfig/20230608-000028-ladsgroup.json

2023-06-07

  • 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T336886)', diff saved to https://phabricator.wikimedia.org/P49219 and previous config saved to /var/cache/conftool/dbconfig/20230607-235624-ladsgroup.json
  • 23:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 23:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T336886)', diff saved to https://phabricator.wikimedia.org/P49218 and previous config saved to /var/cache/conftool/dbconfig/20230607-235603-ladsgroup.json
  • 23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P49217 and previous config saved to /var/cache/conftool/dbconfig/20230607-234522-ladsgroup.json
  • 23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49216 and previous config saved to /var/cache/conftool/dbconfig/20230607-234057-ladsgroup.json
  • 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49215 and previous config saved to /var/cache/conftool/dbconfig/20230607-233016-ladsgroup.json
  • 23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P49214 and previous config saved to /var/cache/conftool/dbconfig/20230607-232551-ladsgroup.json
  • 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49213 and previous config saved to /var/cache/conftool/dbconfig/20230607-232223-ladsgroup.json
  • 23:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 23:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49212 and previous config saved to /var/cache/conftool/dbconfig/20230607-232203-ladsgroup.json
  • 23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T336886)', diff saved to https://phabricator.wikimedia.org/P49211 and previous config saved to /var/cache/conftool/dbconfig/20230607-231045-ladsgroup.json
  • 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P49210 and previous config saved to /var/cache/conftool/dbconfig/20230607-230657-ladsgroup.json
  • 23:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T336886)', diff saved to https://phabricator.wikimedia.org/P49209 and previous config saved to /var/cache/conftool/dbconfig/20230607-230540-ladsgroup.json
  • 23:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 23:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 23:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 23:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 22:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 22:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T336886)', diff saved to https://phabricator.wikimedia.org/P49208 and previous config saved to /var/cache/conftool/dbconfig/20230607-225926-ladsgroup.json
  • 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P49207 and previous config saved to /var/cache/conftool/dbconfig/20230607-225150-ladsgroup.json
  • 22:45 zabe@deploy1002: Finished scap: T338287 (duration: 07m 30s)
  • 22:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P49206 and previous config saved to /var/cache/conftool/dbconfig/20230607-224420-ladsgroup.json
  • 22:38 zabe@deploy1002: Started scap: T338287
  • 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49205 and previous config saved to /var/cache/conftool/dbconfig/20230607-223644-ladsgroup.json
  • 22:34 zabe@deploy1002: Sync cancelled.
  • 22:34 zabe@deploy1002: zabe: Backport for Use cuc_timestamp as index field when reading old (T338287) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 22:32 zabe@deploy1002: Started scap: Backport for Use cuc_timestamp as index field when reading old (T338287)
  • 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P49204 and previous config saved to /var/cache/conftool/dbconfig/20230607-222914-ladsgroup.json
  • 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49203 and previous config saved to /var/cache/conftool/dbconfig/20230607-222905-ladsgroup.json
  • 22:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 22:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49202 and previous config saved to /var/cache/conftool/dbconfig/20230607-222844-ladsgroup.json
  • 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T336886)', diff saved to https://phabricator.wikimedia.org/P49201 and previous config saved to /var/cache/conftool/dbconfig/20230607-221408-ladsgroup.json
  • 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P49200 and previous config saved to /var/cache/conftool/dbconfig/20230607-221338-ladsgroup.json
  • 22:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1221 (T336886)', diff saved to https://phabricator.wikimedia.org/P49199 and previous config saved to /var/cache/conftool/dbconfig/20230607-220859-ladsgroup.json
  • 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T336886)', diff saved to https://phabricator.wikimedia.org/P49198 and previous config saved to /var/cache/conftool/dbconfig/20230607-220821-ladsgroup.json
  • 22:05 eileen: civicrm upgraded from bcc8fccc to 066095b8
  • 22:05 zabe@deploy1002: Finished scap: Backport for Use cuc_timestamp as index field when reading old (T338287) (duration: 11m 48s)
  • 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P49197 and previous config saved to /var/cache/conftool/dbconfig/20230607-215831-ladsgroup.json
  • 21:55 zabe@deploy1002: dreamyjazz and zabe: Backport for Use cuc_timestamp as index field when reading old (T338287) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 21:53 zabe@deploy1002: Started scap: Backport for Use cuc_timestamp as index field when reading old (T338287)
  • 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P49196 and previous config saved to /var/cache/conftool/dbconfig/20230607-215315-ladsgroup.json
  • 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49195 and previous config saved to /var/cache/conftool/dbconfig/20230607-214325-ladsgroup.json
  • 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P49194 and previous config saved to /var/cache/conftool/dbconfig/20230607-213809-ladsgroup.json
  • 21:36 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
  • 21:36 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet
  • 21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49193 and previous config saved to /var/cache/conftool/dbconfig/20230607-213530-ladsgroup.json
  • 21:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T336886)', diff saved to https://phabricator.wikimedia.org/P49192 and previous config saved to /var/cache/conftool/dbconfig/20230607-213509-ladsgroup.json
  • 21:33 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 21:32 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 21:32 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs1016.eqiad.wmnet
  • 21:32 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs1016.eqiad.wmnet
  • 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T336886)', diff saved to https://phabricator.wikimedia.org/P49191 and previous config saved to /var/cache/conftool/dbconfig/20230607-212303-ladsgroup.json
  • 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P49190 and previous config saved to /var/cache/conftool/dbconfig/20230607-212003-ladsgroup.json
  • 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T336886)', diff saved to https://phabricator.wikimedia.org/P49189 and previous config saved to /var/cache/conftool/dbconfig/20230607-211807-ladsgroup.json
  • 21:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 21:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 21:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T336886)', diff saved to https://phabricator.wikimedia.org/P49188 and previous config saved to /var/cache/conftool/dbconfig/20230607-211746-ladsgroup.json
  • 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P49187 and previous config saved to /var/cache/conftool/dbconfig/20230607-210457-ladsgroup.json
  • 21:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P49186 and previous config saved to /var/cache/conftool/dbconfig/20230607-210240-ladsgroup.json
  • 20:50 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1023.eqiad.wmnet with OS bullseye
  • 20:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T336886)', diff saved to https://phabricator.wikimedia.org/P49185 and previous config saved to /var/cache/conftool/dbconfig/20230607-204951-ladsgroup.json
  • 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P49184 and previous config saved to /var/cache/conftool/dbconfig/20230607-204734-ladsgroup.json
  • 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T336886)', diff saved to https://phabricator.wikimedia.org/P49183 and previous config saved to /var/cache/conftool/dbconfig/20230607-204728-ladsgroup.json
  • 20:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 20:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 20:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T336886)', diff saved to https://phabricator.wikimedia.org/P49182 and previous config saved to /var/cache/conftool/dbconfig/20230607-204652-ladsgroup.json
  • 20:45 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 20:35 catrope@deploy1002: Finished scap: Backport for Link to translations of CC BY-SA 4.0 where possible (T319064) (duration: 12m 12s)
  • 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T336886)', diff saved to https://phabricator.wikimedia.org/P49181 and previous config saved to /var/cache/conftool/dbconfig/20230607-203228-ladsgroup.json
  • 20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P49180 and previous config saved to /var/cache/conftool/dbconfig/20230607-203146-ladsgroup.json
  • 20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T336886)', diff saved to https://phabricator.wikimedia.org/P49179 and previous config saved to /var/cache/conftool/dbconfig/20230607-202733-ladsgroup.json
  • 20:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 20:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 20:24 catrope@deploy1002: catrope: Backport for Link to translations of CC BY-SA 4.0 where possible (T319064) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 20:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T336886)', diff saved to https://phabricator.wikimedia.org/P49178 and previous config saved to /var/cache/conftool/dbconfig/20230607-202408-ladsgroup.json
  • 20:23 catrope@deploy1002: Started scap: Backport for Link to translations of CC BY-SA 4.0 where possible (T319064)
  • 20:18 catrope@deploy1002: Finished scap: Backport for Deploy GDI safety survey to JA and RU wikis. (T337728) (duration: 10m 53s)
  • 20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P49177 and previous config saved to /var/cache/conftool/dbconfig/20230607-201640-ladsgroup.json
  • 20:15 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
  • 20:15 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
  • 20:15 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
  • 20:14 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: attempting WDQS stack on bullseye
  • 20:11 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
  • 20:11 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet
  • 20:09 catrope@deploy1002: catrope and essexigyan: Backport for Deploy GDI safety survey to JA and RU wikis. (T337728) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P49176 and previous config saved to /var/cache/conftool/dbconfig/20230607-200902-ladsgroup.json
  • 20:07 catrope@deploy1002: Started scap: Backport for Deploy GDI safety survey to JA and RU wikis. (T337728)
  • 20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T336886)', diff saved to https://phabricator.wikimedia.org/P49175 and previous config saved to /var/cache/conftool/dbconfig/20230607-200134-ladsgroup.json
  • 19:54 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1023.eqiad.wmnet with OS bullseye
  • 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P49174 and previous config saved to /var/cache/conftool/dbconfig/20230607-195356-ladsgroup.json
  • 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T336886)', diff saved to https://phabricator.wikimedia.org/P49173 and previous config saved to /var/cache/conftool/dbconfig/20230607-195316-ladsgroup.json
  • 19:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 19:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T336886)', diff saved to https://phabricator.wikimedia.org/P49172 and previous config saved to /var/cache/conftool/dbconfig/20230607-195255-ladsgroup.json
  • 19:51 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 19:41 taavi: manually created 3 global accounts T338197
  • 19:40 bblack: cp*: disabling puppet temporarily out of an abundance of caution
  • 19:40 eevans@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
  • 19:40 eevans@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
  • 19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T336886)', diff saved to https://phabricator.wikimedia.org/P49171 and previous config saved to /var/cache/conftool/dbconfig/20230607-193850-ladsgroup.json
  • 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P49170 and previous config saved to /var/cache/conftool/dbconfig/20230607-193749-ladsgroup.json
  • 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T336886)', diff saved to https://phabricator.wikimedia.org/P49169 and previous config saved to /var/cache/conftool/dbconfig/20230607-193357-ladsgroup.json
  • 19:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 19:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49168 and previous config saved to /var/cache/conftool/dbconfig/20230607-193326-ladsgroup.json
  • 19:23 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 19:23 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P49167 and previous config saved to /var/cache/conftool/dbconfig/20230607-192243-ladsgroup.json
  • 19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P49166 and previous config saved to /var/cache/conftool/dbconfig/20230607-191820-ladsgroup.json
  • 19:16 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool sessionstore in codfw: maintenance
  • 19:11 eevans@cumin1001: START - Cookbook sre.discovery.service-route pool sessionstore in codfw: maintenance
  • 19:11 urandom: (Re)pooling codfw sessionstore — T337426
  • 19:09 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore2001.codfw.wmnet
  • 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T336886)', diff saved to https://phabricator.wikimedia.org/P49165 and previous config saved to /var/cache/conftool/dbconfig/20230607-190737-ladsgroup.json
  • 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T336886)', diff saved to https://phabricator.wikimedia.org/P49164 and previous config saved to /var/cache/conftool/dbconfig/20230607-190514-ladsgroup.json
  • 19:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P49163 and previous config saved to /var/cache/conftool/dbconfig/20230607-190314-ladsgroup.json
  • 19:02 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore2001.codfw.wmnet
  • 18:59 pt1979@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 18:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 18:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 18:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 18:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 18:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49162 and previous config saved to /var/cache/conftool/dbconfig/20230607-184808-ladsgroup.json
  • 18:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T336886)', diff saved to https://phabricator.wikimedia.org/P49161 and previous config saved to /var/cache/conftool/dbconfig/20230607-184712-ladsgroup.json
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T336886)', diff saved to https://phabricator.wikimedia.org/P49160 and previous config saved to /var/cache/conftool/dbconfig/20230607-184411-ladsgroup.json
  • 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 18:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49159 and previous config saved to /var/cache/conftool/dbconfig/20230607-184351-ladsgroup.json
  • 18:41 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P49158 and previous config saved to /var/cache/conftool/dbconfig/20230607-183206-ladsgroup.json
  • 18:31 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3052.esams.wmnet
  • 18:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P49157 and previous config saved to /var/cache/conftool/dbconfig/20230607-182845-ladsgroup.json
  • 18:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1135.eqiad.wmnet with reason: T338354
  • 18:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1135.eqiad.wmnet with reason: T338354
  • 18:22 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp3052.esams.wmnet
  • 18:20 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.12 refs T337526 (duration: 06m 05s)
  • 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P49156 and previous config saved to /var/cache/conftool/dbconfig/20230607-181700-ladsgroup.json
  • 18:14 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.12 refs T337526
  • 18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P49155 and previous config saved to /var/cache/conftool/dbconfig/20230607-181339-ladsgroup.json
  • 18:08 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@d90d5c8]: (no justification provided) (duration: 00m 33s)
  • 18:07 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@d90d5c8]: (no justification provided)
  • 18:04 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs2014.codfw.wmnet with OS bullseye
  • 18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T336886)', diff saved to https://phabricator.wikimedia.org/P49154 and previous config saved to /var/cache/conftool/dbconfig/20230607-180154-ladsgroup.json
  • 17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49153 and previous config saved to /var/cache/conftool/dbconfig/20230607-175833-ladsgroup.json
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1222 (T336886)', diff saved to https://phabricator.wikimedia.org/P49152 and previous config saved to /var/cache/conftool/dbconfig/20230607-175347-ladsgroup.json
  • 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T336886)', diff saved to https://phabricator.wikimedia.org/P49151 and previous config saved to /var/cache/conftool/dbconfig/20230607-175337-ladsgroup.json
  • 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T336886)', diff saved to https://phabricator.wikimedia.org/P49150 and previous config saved to /var/cache/conftool/dbconfig/20230607-175327-ladsgroup.json
  • 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49149 and previous config saved to /var/cache/conftool/dbconfig/20230607-175316-ladsgroup.json
  • 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3050.esams.wmnet,service=ats-be
  • 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3050.esams.wmnet,service=cdn
  • 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3051.esams.wmnet,service=ats-be
  • 17:50 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3051.esams.wmnet,service=cdn
  • 17:46 inflatador: bking@wdqs depool wdqs2012 T321605
  • 17:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3051.esams.wmnet
  • 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P49148 and previous config saved to /var/cache/conftool/dbconfig/20230607-173821-ladsgroup.json
  • 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P49147 and previous config saved to /var/cache/conftool/dbconfig/20230607-173810-ladsgroup.json
  • 17:34 cwhite@cumin2002: dbctl commit (dc=all): 'depool db1135', diff saved to https://phabricator.wikimedia.org/P49146 and previous config saved to /var/cache/conftool/dbconfig/20230607-173453-cwhite.json
  • 17:33 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp3051.esams.wmnet
  • 17:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P49145 and previous config saved to /var/cache/conftool/dbconfig/20230607-172315-ladsgroup.json
  • 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P49144 and previous config saved to /var/cache/conftool/dbconfig/20230607-172304-ladsgroup.json
  • 17:13 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:13 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:12 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 17:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T336886)', diff saved to https://phabricator.wikimedia.org/P49143 and previous config saved to /var/cache/conftool/dbconfig/20230607-170808-ladsgroup.json
  • 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49142 and previous config saved to /var/cache/conftool/dbconfig/20230607-170758-ladsgroup.json
  • 17:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2014.codfw.wmnet with OS bullseye
  • 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T336886)', diff saved to https://phabricator.wikimedia.org/P49141 and previous config saved to /var/cache/conftool/dbconfig/20230607-170551-ladsgroup.json
  • 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T336886)', diff saved to https://phabricator.wikimedia.org/P49140 and previous config saved to /var/cache/conftool/dbconfig/20230607-170530-ladsgroup.json
  • 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49139 and previous config saved to /var/cache/conftool/dbconfig/20230607-170252-ladsgroup.json
  • 17:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 17:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49138 and previous config saved to /var/cache/conftool/dbconfig/20230607-165934-ladsgroup.json
  • 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P49137 and previous config saved to /var/cache/conftool/dbconfig/20230607-165024-ladsgroup.json
  • 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P49135 and previous config saved to /var/cache/conftool/dbconfig/20230607-164428-ladsgroup.json
  • 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P49134 and previous config saved to /var/cache/conftool/dbconfig/20230607-163518-ladsgroup.json
  • 16:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P49133 and previous config saved to /var/cache/conftool/dbconfig/20230607-162922-ladsgroup.json
  • 16:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:29 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:23 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3050.esams.wmnet
  • 16:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs2014']
  • 16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T336886)', diff saved to https://phabricator.wikimedia.org/P49132 and previous config saved to /var/cache/conftool/dbconfig/20230607-162012-ladsgroup.json
  • 16:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T336886)', diff saved to https://phabricator.wikimedia.org/P49131 and previous config saved to /var/cache/conftool/dbconfig/20230607-161800-ladsgroup.json
  • 16:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 16:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 16:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T336886)', diff saved to https://phabricator.wikimedia.org/P49130 and previous config saved to /var/cache/conftool/dbconfig/20230607-161740-ladsgroup.json
  • 16:15 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp3050.esams.wmnet
  • 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49129 and previous config saved to /var/cache/conftool/dbconfig/20230607-161416-ladsgroup.json
  • 16:13 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:12 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2014']
  • 16:11 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2014']
  • 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T336886)', diff saved to https://phabricator.wikimedia.org/P49128 and previous config saved to /var/cache/conftool/dbconfig/20230607-160912-ladsgroup.json
  • 16:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T336886)', diff saved to https://phabricator.wikimedia.org/P49127 and previous config saved to /var/cache/conftool/dbconfig/20230607-160851-ladsgroup.json
  • 16:07 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 16:04 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin2002"
  • 16:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2014.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P49126 and previous config saved to /var/cache/conftool/dbconfig/20230607-160234-ladsgroup.json
  • 16:00 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lists1003.wikimedia.org
  • 15:57 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 15:56 urandom: Beginning (3 hour) generated traffic testing of sessionstore.svc.codfw.wmnet — T337426
  • 15:56 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P49125 and previous config saved to /var/cache/conftool/dbconfig/20230607-155345-ladsgroup.json
  • 15:52 urandom: Upgrading Cassandra to 4.1.1, sessionstore2003 — T337426
  • 15:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists1003.wikimedia.org
  • 15:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
  • 15:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P49124 and previous config saved to /var/cache/conftool/dbconfig/20230607-154727-ladsgroup.json
  • 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
  • 15:44 urandom: Upgrading Cassandra to 4.1.1, sessionstore2002 — T337426
  • 15:43 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host lvs2014.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:42 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:42 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for lvs2014 - pt1979@cumin2002"
  • 15:41 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entry for lvs2014 - pt1979@cumin2002"
  • 15:40 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetserver2001.codfw.wmnet with reason: host reimage
  • 15:39 moritzm: installing isc-dhcp bugfixes updates from Bullseye 11.7 point release
  • 15:38 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P49123 and previous config saved to /var/cache/conftool/dbconfig/20230607-153839-ladsgroup.json
  • 15:37 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetserver2001.codfw.wmnet with reason: host reimage
  • 15:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 15:33 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 15:33 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T336886)', diff saved to https://phabricator.wikimedia.org/P49122 and previous config saved to /var/cache/conftool/dbconfig/20230607-153221-ladsgroup.json
  • 15:26 moritzm: rolling restart of FPM on mw canaries to pick up libwebp security updates
  • 15:26 pt1979@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 15:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T336886)', diff saved to https://phabricator.wikimedia.org/P49121 and previous config saved to /var/cache/conftool/dbconfig/20230607-152456-ladsgroup.json
  • 15:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49120 and previous config saved to /var/cache/conftool/dbconfig/20230607-152425-ladsgroup.json
  • 15:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T336886)', diff saved to https://phabricator.wikimedia.org/P49119 and previous config saved to /var/cache/conftool/dbconfig/20230607-152333-ladsgroup.json
  • 15:23 elukey: all varnishkafka instances on caching nodes are getting restarted due to https://gerrit.wikimedia.org/r/c/operations/puppet/+/928087 - T337825
  • 15:22 jiji@deploy1002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 15:22 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:22 elukey: re-enable puppet on caching nodes
  • 15:22 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:21 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:21 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:21 claime: Bumping prewarmparsoid concurrency to 45 in changeprop-jobqueue - T320534
  • 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T336886)', diff saved to https://phabricator.wikimedia.org/P49118 and previous config saved to /var/cache/conftool/dbconfig/20230607-151835-ladsgroup.json
  • 15:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T336886)', diff saved to https://phabricator.wikimedia.org/P49117 and previous config saved to /var/cache/conftool/dbconfig/20230607-151815-ladsgroup.json
  • 15:17 moritzm: installing libwebp security updates on buster
  • 15:17 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2001.codfw.wmnet with OS bookworm
  • 15:17 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetserver2001.codfw.wmnet with OS bookworm
  • 15:14 urandom: Upgrading Cassandra to 4.1.1, sessionstore2001 — T337426
  • 15:14 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 15:10 elukey: disable puppet on all caching nodes to rollout a varnishakfka change (ref: https://gerrit.wikimedia.org/r/c/operations/puppet/+/928087)
  • 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P49116 and previous config saved to /var/cache/conftool/dbconfig/20230607-150919-ladsgroup.json
  • 15:08 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2001.codfw.wmnet with OS bookworm
  • 15:07 eevans@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool sessionstore in codfw: maintenance
  • 15:06 jbond@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetserver2001.mgmt.codfw.wmnet on all recursors
  • 15:06 jbond@cumin2002: START - Cookbook sre.dns.wipe-cache puppetserver2001.mgmt.codfw.wmnet on all recursors
  • 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P49115 and previous config saved to /var/cache/conftool/dbconfig/20230607-150309-ladsgroup.json
  • 15:02 eevans@cumin1001: START - Cookbook sre.discovery.service-route depool sessionstore in codfw: maintenance
  • 15:02 urandom: de-pooling sessionstore/codfw — T337426
  • 14:56 sukhe: homer "cr*-codfw*" commit "Gerrit: 928068 remove decommissioned host lvs2010"
  • 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetserver1001.eqiad.wmnet with OS bookworm
  • 14:54 jbond@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin1001"
  • 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P49114 and previous config saved to /var/cache/conftool/dbconfig/20230607-145413-ladsgroup.json
  • 14:54 moritzm: installing postgresql 13 security updates (clients/libs, server instances all updated already)
  • 14:53 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jbond@cumin1001"
  • 14:51 jbond@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:50 jbond@cumin2002: START - Cookbook sre.dns.netbox
  • 14:49 jbond@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs2010.codfw.wmnet
  • 14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:48 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 14:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P49112 and previous config saved to /var/cache/conftool/dbconfig/20230607-144803-ladsgroup.json
  • 14:43 jbond@cumin2002: START - Cookbook sre.dns.netbox
  • 14:40 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetserver1001.eqiad.wmnet with reason: host reimage
  • 14:40 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_eqiad and A:cp
  • 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49111 and previous config saved to /var/cache/conftool/dbconfig/20230607-143907-ladsgroup.json
  • 14:39 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs2010.codfw.wmnet
  • 14:37 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetserver1001.eqiad.wmnet with reason: host reimage
  • 14:36 cgoubert@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:33 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:33 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_eqiad and A:cp
  • 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T336886)', diff saved to https://phabricator.wikimedia.org/P49110 and previous config saved to /var/cache/conftool/dbconfig/20230607-143256-ladsgroup.json
  • 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49109 and previous config saved to /var/cache/conftool/dbconfig/20230607-143235-ladsgroup.json
  • 14:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 14:32 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T336886)', diff saved to https://phabricator.wikimedia.org/P49108 and previous config saved to /var/cache/conftool/dbconfig/20230607-143215-ladsgroup.json
  • 14:32 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:31 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T336886)', diff saved to https://phabricator.wikimedia.org/P49107 and previous config saved to /var/cache/conftool/dbconfig/20230607-142756-ladsgroup.json
  • 14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 14:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T336886)', diff saved to https://phabricator.wikimedia.org/P49106 and previous config saved to /var/cache/conftool/dbconfig/20230607-142736-ladsgroup.json
  • 14:26 cgoubert@deploy1002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:25 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetserver1001.eqiad.wmnet with OS bookworm
  • 14:24 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetserver1001.eqiad.wmnet with OS bookworm
  • 14:23 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P49104 and previous config saved to /var/cache/conftool/dbconfig/20230607-141709-ladsgroup.json
  • 14:17 aborrero@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudnet2006-dev
  • 14:16 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2006-dev
  • 14:14 aborrero@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudnet2005-dev
  • 14:14 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2005-dev
  • 14:14 aborrero@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudnet2006-dev
  • 14:13 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2006-dev
  • 14:13 aborrero@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudnet2005-dev
  • 14:13 aborrero@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet2005-dev
  • 14:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P49103 and previous config saved to /var/cache/conftool/dbconfig/20230607-141230-ladsgroup.json
  • 14:10 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Enable 'multi-line' mode in preg_match() for wikitextToHTML regex (T338264) (duration: 09m 16s)
  • 14:05 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetserver1001.eqiad.wmnet with OS bookworm
  • 14:03 lucaswerkmeister-wmde@deploy1002: d3r1ck01 and lucaswerkmeister-wmde: Backport for Enable 'multi-line' mode in preg_match() for wikitextToHTML regex (T338264) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
  • 14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P49102 and previous config saved to /var/cache/conftool/dbconfig/20230607-140203-ladsgroup.json
  • 14:01 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Enable 'multi-line' mode in preg_match() for wikitextToHTML regex (T338264)
  • 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P49101 and previous config saved to /var/cache/conftool/dbconfig/20230607-135724-ladsgroup.json
  • 13:47 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Enable cache warming jobs for parsoid per default. (T329366) (duration: 10m 27s)
  • 13:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T336886)', diff saved to https://phabricator.wikimedia.org/P49100 and previous config saved to /var/cache/conftool/dbconfig/20230607-134656-ladsgroup.json
  • 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T336886)', diff saved to https://phabricator.wikimedia.org/P49099 and previous config saved to /var/cache/conftool/dbconfig/20230607-134218-ladsgroup.json
  • 13:40 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
  • 13:39 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
  • 13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T336886)', diff saved to https://phabricator.wikimedia.org/P49098 and previous config saved to /var/cache/conftool/dbconfig/20230607-133933-ladsgroup.json
  • 13:39 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
  • 13:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 13:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49097 and previous config saved to /var/cache/conftool/dbconfig/20230607-133854-ladsgroup.json
  • 13:38 lucaswerkmeister-wmde@deploy1002: daniel and lucaswerkmeister-wmde: Backport for Enable cache warming jobs for parsoid per default. (T329366) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 13:38 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1027.eqiad.wmnet']
  • 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T336886)', diff saved to https://phabricator.wikimedia.org/P49096 and previous config saved to /var/cache/conftool/dbconfig/20230607-133725-ladsgroup.json
  • 13:37 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Enable cache warming jobs for parsoid per default. (T329366)
  • 13:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 13:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T336886)', diff saved to https://phabricator.wikimedia.org/P49095 and previous config saved to /var/cache/conftool/dbconfig/20230607-133704-ladsgroup.json
  • 13:30 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 13:29 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1024.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:29 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 13:28 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1024.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P49093 and previous config saved to /var/cache/conftool/dbconfig/20230607-132348-ladsgroup.json
  • 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P49092 and previous config saved to /var/cache/conftool/dbconfig/20230607-132158-ladsgroup.json
  • 13:20 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:20 topranks: removing remote vlan configuration from lsw1-f1-eqiad T322937
  • 13:19 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:10 ladsgroup@deploy1002: Finished scap: Backport for Revert "Revert "Remove legacy encoding option from dawiktionary"" (duration: 07m 11s)
  • 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P49090 and previous config saved to /var/cache/conftool/dbconfig/20230607-130841-ladsgroup.json
  • 13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P49089 and previous config saved to /var/cache/conftool/dbconfig/20230607-130651-ladsgroup.json
  • 13:04 ladsgroup@deploy1002: ladsgroup: Backport for Revert "Revert "Remove legacy encoding option from dawiktionary"" synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 13:03 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 13:03 ladsgroup@deploy1002: Started scap: Backport for Revert "Revert "Remove legacy encoding option from dawiktionary""
  • 13:02 cmooney@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937 (duration: 11m 45s)
  • 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49088 and previous config saved to /var/cache/conftool/dbconfig/20230607-125335-ladsgroup.json
  • 12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T336886)', diff saved to https://phabricator.wikimedia.org/P49087 and previous config saved to /var/cache/conftool/dbconfig/20230607-125145-ladsgroup.json
  • 12:51 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetserver1001.eqiad.wmnet with OS bookworm
  • 12:50 topranks: Depooling lvs1019 to move link from lsw1-f1-eqiad to ssw1-f1-eqiad
  • 12:50 cmooney@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in eqiad, blocking deploys T322937
  • 12:46 Amir1: mwscript maintenance/storage/moveToExternal.php --iconv DB cluster27 on dawiktionary and svwiktionary (T128155 and T128156)
  • 12:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T336886)', diff saved to https://phabricator.wikimedia.org/P49086 and previous config saved to /var/cache/conftool/dbconfig/20230607-124543-ladsgroup.json
  • 12:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 12:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 12:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T336886)', diff saved to https://phabricator.wikimedia.org/P49085 and previous config saved to /var/cache/conftool/dbconfig/20230607-123926-ladsgroup.json
  • 12:37 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:37 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet - aborrero@cumin2002"
  • 12:36 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet - aborrero@cumin2002"
  • 12:33 aborrero@cumin2002: START - Cookbook sre.dns.netbox
  • 12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T336886)', diff saved to https://phabricator.wikimedia.org/P49084 and previous config saved to /var/cache/conftool/dbconfig/20230607-123002-ladsgroup.json
  • 12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P49083 and previous config saved to /var/cache/conftool/dbconfig/20230607-122420-ladsgroup.json
  • 12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P49082 and previous config saved to /var/cache/conftool/dbconfig/20230607-121456-ladsgroup.json
  • 12:13 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetserver1001.eqiad.wmnet with OS bookworm
  • 12:12 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetserver1001.eqiad.wmnet on all recursors
  • 12:12 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache puppetserver1001.eqiad.wmnet on all recursors
  • 12:11 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetserver.eqiad.wmnet on all recursors
  • 12:11 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache puppetserver.eqiad.wmnet on all recursors
  • 12:11 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:10 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
  • 12:09 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
  • 12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P49081 and previous config saved to /var/cache/conftool/dbconfig/20230607-120914-ladsgroup.json
  • 12:07 jbond@cumin1001: START - Cookbook sre.dns.netbox
  • 12:07 jbond@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host puppetserver1001
  • 12:06 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver1001
  • 12:06 jbond@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host puppetserver2001
  • 12:04 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver2001
  • 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P49080 and previous config saved to /var/cache/conftool/dbconfig/20230607-115950-ladsgroup.json
  • 11:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T336886)', diff saved to https://phabricator.wikimedia.org/P49079 and previous config saved to /var/cache/conftool/dbconfig/20230607-115408-ladsgroup.json
  • 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T336886)', diff saved to https://phabricator.wikimedia.org/P49078 and previous config saved to /var/cache/conftool/dbconfig/20230607-115156-ladsgroup.json
  • 11:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 11:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 (T336886)', diff saved to https://phabricator.wikimedia.org/P49077 and previous config saved to /var/cache/conftool/dbconfig/20230607-115124-ladsgroup.json
  • 11:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1138.eqiad.wmnet with reason: Maintenance
  • 11:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1138.eqiad.wmnet with reason: Maintenance
  • 11:48 jbond@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host puppetserver2001
  • 11:46 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver2001
  • 11:46 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:46 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
  • 11:45 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rename puppetmaster1005 -> puppetserver1001 - jbond@cumin1001"
  • 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T336886)', diff saved to https://phabricator.wikimedia.org/P49076 and previous config saved to /var/cache/conftool/dbconfig/20230607-114444-ladsgroup.json
  • 11:44 jbond@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host puppetserver1001
  • 11:43 jbond@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host puppetserver1001
  • 11:43 jbond@cumin1001: START - Cookbook sre.dns.netbox
  • 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T336886)', diff saved to https://phabricator.wikimedia.org/P49075 and previous config saved to /var/cache/conftool/dbconfig/20230607-114120-ladsgroup.json
  • 11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 11:41 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49074 and previous config saved to /var/cache/conftool/dbconfig/20230607-114059-ladsgroup.json
  • 11:40 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 11:35 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 11:35 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 11:30 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster2005
  • 11:30 jbond@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:30 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster1005
  • 11:30 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:30 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1005 decommissioned, removing all IPs except the asset tag one - jbond@cumin1001"
  • 11:29 jbond@cumin2002: START - Cookbook sre.dns.netbox
  • 11:27 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1005 decommissioned, removing all IPs except the asset tag one - jbond@cumin1001"
  • 11:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P49073 and previous config saved to /var/cache/conftool/dbconfig/20230607-112553-ladsgroup.json
  • 11:24 jbond@cumin1001: START - Cookbook sre.dns.netbox
  • 11:24 jbond@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster2005
  • 11:23 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts puppetmaster1005
  • 11:22 jbond@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster1005
  • 11:17 jbond@cumin1001: START - Cookbook sre.hosts.decommission for hosts puppetmaster1005
  • 11:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P49072 and previous config saved to /var/cache/conftool/dbconfig/20230607-111047-ladsgroup.json
  • 10:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49071 and previous config saved to /var/cache/conftool/dbconfig/20230607-105541-ladsgroup.json
  • 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49070 and previous config saved to /var/cache/conftool/dbconfig/20230607-105215-ladsgroup.json
  • 10:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49069 and previous config saved to /var/cache/conftool/dbconfig/20230607-105154-ladsgroup.json
  • 10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P49068 and previous config saved to /var/cache/conftool/dbconfig/20230607-103648-ladsgroup.json
  • 10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P49066 and previous config saved to /var/cache/conftool/dbconfig/20230607-102141-ladsgroup.json
  • 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49065 and previous config saved to /var/cache/conftool/dbconfig/20230607-100635-ladsgroup.json
  • 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T336886)', diff saved to https://phabricator.wikimedia.org/P49064 and previous config saved to /var/cache/conftool/dbconfig/20230607-100307-ladsgroup.json
  • 10:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 10:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T336886)', diff saved to https://phabricator.wikimedia.org/P49063 and previous config saved to /var/cache/conftool/dbconfig/20230607-100247-ladsgroup.json
  • 09:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P49062 and previous config saved to /var/cache/conftool/dbconfig/20230607-094740-ladsgroup.json
  • 09:33 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir
  • 09:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P49061 and previous config saved to /var/cache/conftool/dbconfig/20230607-093234-ladsgroup.json
  • 09:21 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T336886)', diff saved to https://phabricator.wikimedia.org/P49060 and previous config saved to /var/cache/conftool/dbconfig/20230607-091728-ladsgroup.json
  • 09:17 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T336886)', diff saved to https://phabricator.wikimedia.org/P49059 and previous config saved to /var/cache/conftool/dbconfig/20230607-091402-ladsgroup.json
  • 09:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 09:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T336886)', diff saved to https://phabricator.wikimedia.org/P49058 and previous config saved to /var/cache/conftool/dbconfig/20230607-091341-ladsgroup.json
  • 09:07 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 09:06 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 09:00 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 08:59 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_eqiad and A:cp
  • 08:59 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_eqiad and A:cp
  • 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P49057 and previous config saved to /var/cache/conftool/dbconfig/20230607-085835-ladsgroup.json
  • 08:49 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P49056 and previous config saved to /var/cache/conftool/dbconfig/20230607-084329-ladsgroup.json
  • 08:34 fabfur: disable puppet on A:cp-eqiad for varnish <-> haproxy port 80 swap
  • 08:29 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir
  • 08:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T336886)', diff saved to https://phabricator.wikimedia.org/P49055 and previous config saved to /var/cache/conftool/dbconfig/20230607-082823-ladsgroup.json
  • 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T336886)', diff saved to https://phabricator.wikimedia.org/P49054 and previous config saved to /var/cache/conftool/dbconfig/20230607-082500-ladsgroup.json
  • 08:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 08:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 08:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 08:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 08:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T336886)', diff saved to https://phabricator.wikimedia.org/P49053 and previous config saved to /var/cache/conftool/dbconfig/20230607-082434-ladsgroup.json
  • 08:22 moritzm: uploaded ruby 2.5.5-3+deb10u5+wmf1 to apt.wikimedia.org, unbreaking Puppet runs after latest Ruby update for Buster T338294
  • 08:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P49052 and previous config saved to /var/cache/conftool/dbconfig/20230607-080928-ladsgroup.json
  • 07:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P49051 and previous config saved to /var/cache/conftool/dbconfig/20230607-075422-ladsgroup.json
  • 07:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T336886)', diff saved to https://phabricator.wikimedia.org/P49050 and previous config saved to /var/cache/conftool/dbconfig/20230607-073916-ladsgroup.json
  • 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T336886)', diff saved to https://phabricator.wikimedia.org/P49049 and previous config saved to /var/cache/conftool/dbconfig/20230607-073554-ladsgroup.json
  • 07:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 07:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T336886)', diff saved to https://phabricator.wikimedia.org/P49048 and previous config saved to /var/cache/conftool/dbconfig/20230607-073533-ladsgroup.json
  • 07:22 kartik@deploy1002: Finished scap: Backport for Use direct Parsoid in Small and Medium Wikis for Content Translation (T337922) (duration: 18m 06s)
  • 07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P49047 and previous config saved to /var/cache/conftool/dbconfig/20230607-072027-ladsgroup.json
  • 07:06 kartik@deploy1002: kartik: Backport for Use direct Parsoid in Small and Medium Wikis for Content Translation (T337922) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P49046 and previous config saved to /var/cache/conftool/dbconfig/20230607-070521-ladsgroup.json
  • 07:04 kartik@deploy1002: Started scap: Backport for Use direct Parsoid in Small and Medium Wikis for Content Translation (T337922)
  • 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T336886)', diff saved to https://phabricator.wikimedia.org/P49045 and previous config saved to /var/cache/conftool/dbconfig/20230607-065015-ladsgroup.json
  • 06:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T336886)', diff saved to https://phabricator.wikimedia.org/P49044 and previous config saved to /var/cache/conftool/dbconfig/20230607-064652-ladsgroup.json
  • 06:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 06:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 06:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T336886)', diff saved to https://phabricator.wikimedia.org/P49043 and previous config saved to /var/cache/conftool/dbconfig/20230607-064631-ladsgroup.json
  • 06:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T336886)', diff saved to https://phabricator.wikimedia.org/P49042 and previous config saved to /var/cache/conftool/dbconfig/20230607-064215-ladsgroup.json
  • 06:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P49041 and previous config saved to /var/cache/conftool/dbconfig/20230607-063125-ladsgroup.json
  • 06:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P49040 and previous config saved to /var/cache/conftool/dbconfig/20230607-062709-ladsgroup.json
  • 06:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P49039 and previous config saved to /var/cache/conftool/dbconfig/20230607-061618-ladsgroup.json
  • 06:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P49038 and previous config saved to /var/cache/conftool/dbconfig/20230607-061203-ladsgroup.json
  • 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T336886)', diff saved to https://phabricator.wikimedia.org/P49037 and previous config saved to /var/cache/conftool/dbconfig/20230607-060112-ladsgroup.json
  • 05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2161 (T336886)', diff saved to https://phabricator.wikimedia.org/P49036 and previous config saved to /var/cache/conftool/dbconfig/20230607-055746-ladsgroup.json
  • 05:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 05:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T336886)', diff saved to https://phabricator.wikimedia.org/P49035 and previous config saved to /var/cache/conftool/dbconfig/20230607-055726-ladsgroup.json
  • 05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T336886)', diff saved to https://phabricator.wikimedia.org/P49034 and previous config saved to /var/cache/conftool/dbconfig/20230607-055655-ladsgroup.json
  • 05:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T336886)', diff saved to https://phabricator.wikimedia.org/P49033 and previous config saved to /var/cache/conftool/dbconfig/20230607-055320-ladsgroup.json
  • 05:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 05:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 05:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49032 and previous config saved to /var/cache/conftool/dbconfig/20230607-055259-ladsgroup.json
  • 05:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P49031 and previous config saved to /var/cache/conftool/dbconfig/20230607-054220-ladsgroup.json
  • 05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P49030 and previous config saved to /var/cache/conftool/dbconfig/20230607-053753-ladsgroup.json
  • 05:28 kart_: Updated cxserver to 2023-06-07-044025-production (T337290, T337669, T337834)
  • 05:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P49029 and previous config saved to /var/cache/conftool/dbconfig/20230607-052713-ladsgroup.json
  • 05:25 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:25 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:22 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P49028 and previous config saved to /var/cache/conftool/dbconfig/20230607-052247-ladsgroup.json
  • 05:22 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:17 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:17 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T336886)', diff saved to https://phabricator.wikimedia.org/P49027 and previous config saved to /var/cache/conftool/dbconfig/20230607-051207-ladsgroup.json
  • 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T336886)', diff saved to https://phabricator.wikimedia.org/P49026 and previous config saved to /var/cache/conftool/dbconfig/20230607-050844-ladsgroup.json
  • 05:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 05:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T336886)', diff saved to https://phabricator.wikimedia.org/P49025 and previous config saved to /var/cache/conftool/dbconfig/20230607-050823-ladsgroup.json
  • 05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49024 and previous config saved to /var/cache/conftool/dbconfig/20230607-050740-ladsgroup.json
  • 05:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49023 and previous config saved to /var/cache/conftool/dbconfig/20230607-050258-ladsgroup.json
  • 05:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 05:02 kart_: Updated MinT to 2023-06-06-120533-production (T337910, T337686, T337708)
  • 05:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 05:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T336886)', diff saved to https://phabricator.wikimedia.org/P49022 and previous config saved to /var/cache/conftool/dbconfig/20230607-050237-ladsgroup.json
  • 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P49021 and previous config saved to /var/cache/conftool/dbconfig/20230607-045317-ladsgroup.json
  • 04:51 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 04:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P49020 and previous config saved to /var/cache/conftool/dbconfig/20230607-044731-ladsgroup.json
  • 04:45 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 04:39 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 04:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P49019 and previous config saved to /var/cache/conftool/dbconfig/20230607-043810-ladsgroup.json
  • 04:36 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 04:32 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 04:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P49018 and previous config saved to /var/cache/conftool/dbconfig/20230607-043225-ladsgroup.json
  • 04:31 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 04:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T336886)', diff saved to https://phabricator.wikimedia.org/P49017 and previous config saved to /var/cache/conftool/dbconfig/20230607-042304-ladsgroup.json
  • 04:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T336886)', diff saved to https://phabricator.wikimedia.org/P49016 and previous config saved to /var/cache/conftool/dbconfig/20230607-042040-ladsgroup.json
  • 04:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 04:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 04:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 04:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 04:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T336886)', diff saved to https://phabricator.wikimedia.org/P49015 and previous config saved to /var/cache/conftool/dbconfig/20230607-041719-ladsgroup.json
  • 04:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 04:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 04:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 04:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T336886)', diff saved to https://phabricator.wikimedia.org/P49014 and previous config saved to /var/cache/conftool/dbconfig/20230607-041357-ladsgroup.json
  • 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T336886)', diff saved to https://phabricator.wikimedia.org/P49013 and previous config saved to /var/cache/conftool/dbconfig/20230607-041347-ladsgroup.json
  • 04:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 04:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49012 and previous config saved to /var/cache/conftool/dbconfig/20230607-041326-ladsgroup.json
  • 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P49011 and previous config saved to /var/cache/conftool/dbconfig/20230607-035851-ladsgroup.json
  • 03:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P49010 and previous config saved to /var/cache/conftool/dbconfig/20230607-035820-ladsgroup.json
  • 03:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P49009 and previous config saved to /var/cache/conftool/dbconfig/20230607-034345-ladsgroup.json
  • 03:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P49008 and previous config saved to /var/cache/conftool/dbconfig/20230607-034314-ladsgroup.json
  • 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T336886)', diff saved to https://phabricator.wikimedia.org/P49007 and previous config saved to /var/cache/conftool/dbconfig/20230607-032839-ladsgroup.json
  • 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49006 and previous config saved to /var/cache/conftool/dbconfig/20230607-032808-ladsgroup.json
  • 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1214 (T336886)', diff saved to https://phabricator.wikimedia.org/P49005 and previous config saved to /var/cache/conftool/dbconfig/20230607-032522-ladsgroup.json
  • 03:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 03:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T336886)', diff saved to https://phabricator.wikimedia.org/P49004 and previous config saved to /var/cache/conftool/dbconfig/20230607-032501-ladsgroup.json
  • 03:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P49003 and previous config saved to /var/cache/conftool/dbconfig/20230607-032428-ladsgroup.json
  • 03:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 03:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 03:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T336886)', diff saved to https://phabricator.wikimedia.org/P49002 and previous config saved to /var/cache/conftool/dbconfig/20230607-032407-ladsgroup.json
  • 03:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P49001 and previous config saved to /var/cache/conftool/dbconfig/20230607-030955-ladsgroup.json
  • 03:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P49000 and previous config saved to /var/cache/conftool/dbconfig/20230607-030901-ladsgroup.json
  • 02:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P48999 and previous config saved to /var/cache/conftool/dbconfig/20230607-025449-ladsgroup.json
  • 02:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P48998 and previous config saved to /var/cache/conftool/dbconfig/20230607-025355-ladsgroup.json
  • 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T336886)', diff saved to https://phabricator.wikimedia.org/P48997 and previous config saved to /var/cache/conftool/dbconfig/20230607-023943-ladsgroup.json
  • 02:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T336886)', diff saved to https://phabricator.wikimedia.org/P48996 and previous config saved to /var/cache/conftool/dbconfig/20230607-023848-ladsgroup.json
  • 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1211 (T336886)', diff saved to https://phabricator.wikimedia.org/P48995 and previous config saved to /var/cache/conftool/dbconfig/20230607-023624-ladsgroup.json
  • 02:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T336886)', diff saved to https://phabricator.wikimedia.org/P48994 and previous config saved to /var/cache/conftool/dbconfig/20230607-023613-ladsgroup.json
  • 02:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 02:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T336886)', diff saved to https://phabricator.wikimedia.org/P48993 and previous config saved to /var/cache/conftool/dbconfig/20230607-023603-ladsgroup.json
  • 02:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 02:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 02:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T336886)', diff saved to https://phabricator.wikimedia.org/P48992 and previous config saved to /var/cache/conftool/dbconfig/20230607-023537-ladsgroup.json
  • 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P48991 and previous config saved to /var/cache/conftool/dbconfig/20230607-022057-ladsgroup.json
  • 02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P48990 and previous config saved to /var/cache/conftool/dbconfig/20230607-022031-ladsgroup.json
  • 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P48989 and previous config saved to /var/cache/conftool/dbconfig/20230607-020550-ladsgroup.json
  • 02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P48988 and previous config saved to /var/cache/conftool/dbconfig/20230607-020518-ladsgroup.json
  • 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T336886)', diff saved to https://phabricator.wikimedia.org/P48987 and previous config saved to /var/cache/conftool/dbconfig/20230607-015043-ladsgroup.json
  • 01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T336886)', diff saved to https://phabricator.wikimedia.org/P48986 and previous config saved to /var/cache/conftool/dbconfig/20230607-015012-ladsgroup.json
  • 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T336886)', diff saved to https://phabricator.wikimedia.org/P48985 and previous config saved to /var/cache/conftool/dbconfig/20230607-014635-ladsgroup.json
  • 01:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1209 (T336886)', diff saved to https://phabricator.wikimedia.org/P48984 and previous config saved to /var/cache/conftool/dbconfig/20230607-014626-ladsgroup.json
  • 01:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 01:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T336886)', diff saved to https://phabricator.wikimedia.org/P48983 and previous config saved to /var/cache/conftool/dbconfig/20230607-014614-ladsgroup.json
  • 01:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T336886)', diff saved to https://phabricator.wikimedia.org/P48982 and previous config saved to /var/cache/conftool/dbconfig/20230607-014605-ladsgroup.json
  • 01:39 sukhe: run authdns-update: T338280
  • 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P48981 and previous config saved to /var/cache/conftool/dbconfig/20230607-013108-ladsgroup.json
  • 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P48980 and previous config saved to /var/cache/conftool/dbconfig/20230607-013059-ladsgroup.json
  • 01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P48979 and previous config saved to /var/cache/conftool/dbconfig/20230607-011602-ladsgroup.json
  • 01:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P48978 and previous config saved to /var/cache/conftool/dbconfig/20230607-011553-ladsgroup.json
  • 01:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T336886)', diff saved to https://phabricator.wikimedia.org/P48977 and previous config saved to /var/cache/conftool/dbconfig/20230607-010055-ladsgroup.json
  • 01:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T336886)', diff saved to https://phabricator.wikimedia.org/P48976 and previous config saved to /var/cache/conftool/dbconfig/20230607-010047-ladsgroup.json
  • 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1203 (T336886)', diff saved to https://phabricator.wikimedia.org/P48975 and previous config saved to /var/cache/conftool/dbconfig/20230607-005722-ladsgroup.json
  • 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T336886)', diff saved to https://phabricator.wikimedia.org/P48974 and previous config saved to /var/cache/conftool/dbconfig/20230607-005713-ladsgroup.json
  • 00:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 00:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 00:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T336886)', diff saved to https://phabricator.wikimedia.org/P48973 and previous config saved to /var/cache/conftool/dbconfig/20230607-005654-ladsgroup.json
  • 00:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 00:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 00:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 00:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 00:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 00:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48972 and previous config saved to /var/cache/conftool/dbconfig/20230607-005155-ladsgroup.json
  • 00:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P48971 and previous config saved to /var/cache/conftool/dbconfig/20230607-004148-ladsgroup.json
  • 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P48970 and previous config saved to /var/cache/conftool/dbconfig/20230607-003649-ladsgroup.json
  • 00:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P48969 and previous config saved to /var/cache/conftool/dbconfig/20230607-002642-ladsgroup.json
  • 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P48968 and previous config saved to /var/cache/conftool/dbconfig/20230607-002143-ladsgroup.json
  • 00:14 urbanecm:: Deployed security patch for T338276
  • 00:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T336886)', diff saved to https://phabricator.wikimedia.org/P48967 and previous config saved to /var/cache/conftool/dbconfig/20230607-001136-ladsgroup.json
  • 00:08 urbanecm:: Deployed security patch for T338276
  • 00:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1193 (T336886)', diff saved to https://phabricator.wikimedia.org/P48966 and previous config saved to /var/cache/conftool/dbconfig/20230607-000814-ladsgroup.json
  • 00:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 00:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T336886)', diff saved to https://phabricator.wikimedia.org/P48965 and previous config saved to /var/cache/conftool/dbconfig/20230607-000754-ladsgroup.json
  • 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48964 and previous config saved to /var/cache/conftool/dbconfig/20230607-000637-ladsgroup.json
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1213:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48963 and previous config saved to /var/cache/conftool/dbconfig/20230607-000337-ladsgroup.json
  • 00:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 00:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T336886)', diff saved to https://phabricator.wikimedia.org/P48962 and previous config saved to /var/cache/conftool/dbconfig/20230607-000316-ladsgroup.json
  • 00:01 urbanecm: Deploying security patch for T338276

2023-06-06

  • 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48961 and previous config saved to /var/cache/conftool/dbconfig/20230606-235248-ladsgroup.json
  • 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48960 and previous config saved to /var/cache/conftool/dbconfig/20230606-234810-ladsgroup.json
  • 23:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-a1-codfw.mgmt.codfw.wmnet
  • 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P48959 and previous config saved to /var/cache/conftool/dbconfig/20230606-233742-ladsgroup.json
  • 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P48958 and previous config saved to /var/cache/conftool/dbconfig/20230606-233304-ladsgroup.json
  • 23:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T336886)', diff saved to https://phabricator.wikimedia.org/P48955 and previous config saved to /var/cache/conftool/dbconfig/20230606-232235-ladsgroup.json
  • 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-a1-codfw - pt1979@cumin2002"
  • 23:19 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-a1-codfw - pt1979@cumin2002"
  • 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1192 (T336886)', diff saved to https://phabricator.wikimedia.org/P48954 and previous config saved to /var/cache/conftool/dbconfig/20230606-231913-ladsgroup.json
  • 23:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 23:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T336886)', diff saved to https://phabricator.wikimedia.org/P48953 and previous config saved to /var/cache/conftool/dbconfig/20230606-231853-ladsgroup.json
  • 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T336886)', diff saved to https://phabricator.wikimedia.org/P48952 and previous config saved to /var/cache/conftool/dbconfig/20230606-231758-ladsgroup.json
  • 23:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:16 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-a1-codfw.mgmt.codfw.wmnet
  • 23:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 23:16 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:16 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 23:15 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1210 (T336886)', diff saved to https://phabricator.wikimedia.org/P48951 and previous config saved to /var/cache/conftool/dbconfig/20230606-231408-ladsgroup.json
  • 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 23:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 23:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T336886)', diff saved to https://phabricator.wikimedia.org/P48950 and previous config saved to /var/cache/conftool/dbconfig/20230606-231347-ladsgroup.json
  • 23:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P48949 and previous config saved to /var/cache/conftool/dbconfig/20230606-230347-ladsgroup.json
  • 22:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P48948 and previous config saved to /var/cache/conftool/dbconfig/20230606-225841-ladsgroup.json
  • 22:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 22:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 22:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 22:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P48947 and previous config saved to /var/cache/conftool/dbconfig/20230606-224841-ladsgroup.json
  • 22:48 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P48946 and previous config saved to /var/cache/conftool/dbconfig/20230606-224334-ladsgroup.json
  • 22:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T336886)', diff saved to https://phabricator.wikimedia.org/P48945 and previous config saved to /var/cache/conftool/dbconfig/20230606-223335-ladsgroup.json
  • 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T336886)', diff saved to https://phabricator.wikimedia.org/P48944 and previous config saved to /var/cache/conftool/dbconfig/20230606-223011-ladsgroup.json
  • 22:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 22:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48943 and previous config saved to /var/cache/conftool/dbconfig/20230606-222950-ladsgroup.json
  • 22:29 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T336886)', diff saved to https://phabricator.wikimedia.org/P48942 and previous config saved to /var/cache/conftool/dbconfig/20230606-222828-ladsgroup.json
  • 22:27 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp everywhere (T299954) (duration: 07m 33s)
  • 22:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1200 (T336886)', diff saved to https://phabricator.wikimedia.org/P48941 and previous config saved to /var/cache/conftool/dbconfig/20230606-222534-ladsgroup.json
  • 22:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 22:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 22:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T336886)', diff saved to https://phabricator.wikimedia.org/P48940 and previous config saved to /var/cache/conftool/dbconfig/20230606-222513-ladsgroup.json
  • 22:21 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp everywhere (T299954) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 22:19 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp everywhere (T299954)
  • 22:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P48939 and previous config saved to /var/cache/conftool/dbconfig/20230606-221444-ladsgroup.json
  • 22:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P48938 and previous config saved to /var/cache/conftool/dbconfig/20230606-221007-ladsgroup.json
  • 21:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P48937 and previous config saved to /var/cache/conftool/dbconfig/20230606-215938-ladsgroup.json
  • 21:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P48936 and previous config saved to /var/cache/conftool/dbconfig/20230606-215501-ladsgroup.json
  • 21:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48935 and previous config saved to /var/cache/conftool/dbconfig/20230606-214432-ladsgroup.json
  • 21:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48934 and previous config saved to /var/cache/conftool/dbconfig/20230606-214109-ladsgroup.json
  • 21:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T336886)', diff saved to https://phabricator.wikimedia.org/P48933 and previous config saved to /var/cache/conftool/dbconfig/20230606-214048-ladsgroup.json
  • 21:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T336886)', diff saved to https://phabricator.wikimedia.org/P48932 and previous config saved to /var/cache/conftool/dbconfig/20230606-213954-ladsgroup.json
  • 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1185 (T336886)', diff saved to https://phabricator.wikimedia.org/P48931 and previous config saved to /var/cache/conftool/dbconfig/20230606-213702-ladsgroup.json
  • 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T336886)', diff saved to https://phabricator.wikimedia.org/P48930 and previous config saved to /var/cache/conftool/dbconfig/20230606-213641-ladsgroup.json
  • 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P48929 and previous config saved to /var/cache/conftool/dbconfig/20230606-212542-ladsgroup.json
  • 21:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P48928 and previous config saved to /var/cache/conftool/dbconfig/20230606-212135-ladsgroup.json
  • 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P48927 and previous config saved to /var/cache/conftool/dbconfig/20230606-211036-ladsgroup.json
  • 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P48926 and previous config saved to /var/cache/conftool/dbconfig/20230606-210629-ladsgroup.json
  • 21:03 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 21:03 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1026.eqiad.wmnet with OS bullseye
  • 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T336886)', diff saved to https://phabricator.wikimedia.org/P48925 and previous config saved to /var/cache/conftool/dbconfig/20230606-205530-ladsgroup.json
  • 20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T336886)', diff saved to https://phabricator.wikimedia.org/P48924 and previous config saved to /var/cache/conftool/dbconfig/20230606-205206-ladsgroup.json
  • 20:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 20:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 20:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T336886)', diff saved to https://phabricator.wikimedia.org/P48923 and previous config saved to /var/cache/conftool/dbconfig/20230606-205123-ladsgroup.json
  • 20:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 20:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P48922 and previous config saved to /var/cache/conftool/dbconfig/20230606-205002-ladsgroup.json
  • 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1183 (T336886)', diff saved to https://phabricator.wikimedia.org/P48921 and previous config saved to /var/cache/conftool/dbconfig/20230606-204527-ladsgroup.json
  • 20:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 20:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T336886)', diff saved to https://phabricator.wikimedia.org/P48920 and previous config saved to /var/cache/conftool/dbconfig/20230606-204506-ladsgroup.json
  • 20:41 urbanecm@deploy1002: Finished scap: Backport for PersonalizedPraiseLogger: Only include mentee_id if not null (T338078), PersonalizedPraiseLogger: Only include mentee_id if not null (T338078) (duration: 07m 23s)
  • 20:35 urbanecm@deploy1002: urbanecm: Backport for PersonalizedPraiseLogger: Only include mentee_id if not null (T338078), PersonalizedPraiseLogger: Only include mentee_id if not null (T338078) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48919 and previous config saved to /var/cache/conftool/dbconfig/20230606-203456-ladsgroup.json
  • 20:34 urbanecm@deploy1002: Started scap: Backport for PersonalizedPraiseLogger: Only include mentee_id if not null (T338078), PersonalizedPraiseLogger: Only include mentee_id if not null (T338078)
  • 20:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P48917 and previous config saved to /var/cache/conftool/dbconfig/20230606-203000-ladsgroup.json
  • 20:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48916 and previous config saved to /var/cache/conftool/dbconfig/20230606-201950-ladsgroup.json
  • 20:16 mutante: miscweb1003, miscweb2003 - rm -rf /srv/org/wikimedia/sitemaps after removing httpd virtual host T338064
  • 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P48915 and previous config saved to /var/cache/conftool/dbconfig/20230606-201454-ladsgroup.json
  • 20:09 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 20:09 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1026.eqiad.wmnet with OS bullseye
  • 20:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P48914 and previous config saved to /var/cache/conftool/dbconfig/20230606-200444-ladsgroup.json
  • 19:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T336886)', diff saved to https://phabricator.wikimedia.org/P48913 and previous config saved to /var/cache/conftool/dbconfig/20230606-195948-ladsgroup.json
  • 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T336886)', diff saved to https://phabricator.wikimedia.org/P48912 and previous config saved to /var/cache/conftool/dbconfig/20230606-195557-ladsgroup.json
  • 19:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 19:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 19:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 19:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48911 and previous config saved to /var/cache/conftool/dbconfig/20230606-195320-ladsgroup.json
  • 19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P48910 and previous config saved to /var/cache/conftool/dbconfig/20230606-193814-ladsgroup.json
  • 19:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P48909 and previous config saved to /var/cache/conftool/dbconfig/20230606-192308-ladsgroup.json
  • 19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48908 and previous config saved to /var/cache/conftool/dbconfig/20230606-190802-ladsgroup.json
  • 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P48907 and previous config saved to /var/cache/conftool/dbconfig/20230606-190420-ladsgroup.json
  • 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 19:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T336886)', diff saved to https://phabricator.wikimedia.org/P48906 and previous config saved to /var/cache/conftool/dbconfig/20230606-190402-ladsgroup.json
  • 19:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 19:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 18:10 mutante: disabling https://sitemaps.wikimedia.org - T338064 T332101
  • 18:10 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.12 refs T337526
  • 18:01 sukhe: cumin 'A:cp-text' 'enable-puppet "CR 926611" && run-puppet-agent -q'
  • 18:01 sukhe: re-enable puppet on A:cp-text and force puppet run: T338064
  • 17:54 sukhe: enable puppet on cp4037 to test CR 926611
  • 17:50 sukhe: disable puppet on A:cp-text to roll out CR 926611
  • 17:39 sukhe: sudo cumin 'P:ntp' 'enable-puppet "testing CR 926598" && run-puppet-agent'
  • 17:27 sukhe: sudo cumin 'P:ntp' 'disable-puppet "testing CR 926598"'
  • 17:05 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 17:04 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 17:04 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 17:01 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 16:51 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 16:41 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 16:40 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:40 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:39 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 16:37 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 16:37 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:36 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:30 sukhe: low-traffic/codfw: set routing-options static route 10.2.1.0/24 next-hop 10.192.32.14
  • 16:27 sukhe: restart pybal on lvs2013 to remove bgp-med override
  • 16:23 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 16:12 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 16:12 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 16:06 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 16:03 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T336886)', diff saved to https://phabricator.wikimedia.org/P48904 and previous config saved to /var/cache/conftool/dbconfig/20230606-160151-ladsgroup.json
  • 15:54 jbond@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
  • 15:53 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:52 jbond@cumin1001: START - Cookbook sre.postgresql.postgres-init
  • 15:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P48902 and previous config saved to /var/cache/conftool/dbconfig/20230606-154645-ladsgroup.json
  • 15:46 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:46 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:40 cdanis@deploy1002: Finished scap: Backport for Revert "EventStreamConfig - development.network.probe- disable canary events and hadoop ingestion" (duration: 08m 13s)
  • 15:38 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:37 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:35 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 15:35 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:34 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 15:34 cdanis@deploy1002: cdanis and otto: Backport for Revert "EventStreamConfig - development.network.probe- disable canary events and hadoop ingestion" synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 15:32 zabe: purge wikimaniawiki logos # T337044
  • 15:32 cdanis@deploy1002: Started scap: Backport for Revert "EventStreamConfig - development.network.probe- disable canary events and hadoop ingestion"
  • 15:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P48901 and previous config saved to /var/cache/conftool/dbconfig/20230606-153139-ladsgroup.json
  • 15:30 zabe@deploy1002: Finished scap: Backport for Change project logo for Wikimania to Wikimania 2023 version (T337044) (duration: 08m 02s)
  • 15:26 sukhe: homer "cr*-codfw*" commit "Gerrit: 927725 add new LVS host lvs2013" : T326767
  • 15:24 zabe@deploy1002: robertsky and zabe: Backport for Change project logo for Wikimania to Wikimania 2023 version (T337044) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 15:22 zabe@deploy1002: Started scap: Backport for Change project logo for Wikimania to Wikimania 2023 version (T337044)
  • 15:21 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs2013
  • 15:21 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2013
  • 15:20 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 15:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:19 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T336886)', diff saved to https://phabricator.wikimedia.org/P48900 and previous config saved to /var/cache/conftool/dbconfig/20230606-151633-ladsgroup.json
  • 15:12 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_esams and A:cp
  • 15:08 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1007.eqiad.wmnet
  • 15:07 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 15:06 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:06 mforns@deploy1002: Finished deploy [airflow-dags/analytics@72d9b87]: (no justification provided) (duration: 00m 10s)
  • 15:06 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:06 mforns@deploy1002: Started deploy [airflow-dags/analytics@72d9b87]: (no justification provided)
  • 15:03 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 15:02 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:02 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T336886)', diff saved to https://phabricator.wikimedia.org/P48899 and previous config saved to /var/cache/conftool/dbconfig/20230606-150141-ladsgroup.json
  • 15:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 15:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48898 and previous config saved to /var/cache/conftool/dbconfig/20230606-150120-ladsgroup.json
  • 15:00 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host dumpsdata1007.eqiad.wmnet
  • 14:57 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1026.eqiad.wmnet with OS bullseye
  • 14:57 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 14:56 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 14:53 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 14:53 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 14:53 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:53 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:53 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:53 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:53 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:53 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change entries for moved links eqiad row e f switches - cmooney@cumin1001"
  • 14:51 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change entries for moved links eqiad row e f switches - cmooney@cumin1001"
  • 14:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2013.codfw.wmnet with OS bullseye
  • 14:49 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 14:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P48897 and previous config saved to /var/cache/conftool/dbconfig/20230606-144614-ladsgroup.json
  • 14:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
  • 14:31 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
  • 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P48896 and previous config saved to /var/cache/conftool/dbconfig/20230606-143107-ladsgroup.json
  • 14:25 oblivian@deploy1002: Finished scap: Backport for Load and enable parsoid everywhere (T334980) (duration: 15m 00s)
  • 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48895 and previous config saved to /var/cache/conftool/dbconfig/20230606-141601-ladsgroup.json
  • 14:16 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2013.codfw.wmnet with OS bullseye
  • 14:15 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 14:12 oblivian@deploy1002: oblivian: Backport for Load and enable parsoid everywhere (T334980) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 14:10 oblivian@deploy1002: Started scap: Backport for Load and enable parsoid everywhere (T334980)
  • 14:08 eoghan@cumin1001: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 14:06 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1,3]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
  • 14:06 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1,3]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
  • 14:03 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1026.eqiad.wmnet with OS bullseye
  • 14:03 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1027.eqiad.wmnet with OS bullseye
  • 14:01 oblivian@deploy1002: Finished scap: Backport for Enable parser cache warming jobs for parsoid on enwiki (T329366) (duration: 07m 57s)
  • 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48894 and previous config saved to /var/cache/conftool/dbconfig/20230606-140051-ladsgroup.json
  • 14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48893 and previous config saved to /var/cache/conftool/dbconfig/20230606-140030-ladsgroup.json
  • 13:59 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging AndyRussG out of all services on: 780 hosts
  • 13:58 jmm@cumin2002: START - Cookbook sre.idm.logout Logging AndyRussG out of all services on: 780 hosts
  • 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging AndyRussG out of all services on: 1259 hosts
  • 13:57 jmm@cumin2002: START - Cookbook sre.idm.logout Logging AndyRussG out of all services on: 1259 hosts
  • 13:55 oblivian@deploy1002: oblivian and daniel: Backport for Enable parser cache warming jobs for parsoid on enwiki (T329366) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 13:53 oblivian@deploy1002: Started scap: Backport for Enable parser cache warming jobs for parsoid on enwiki (T329366)
  • 13:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 13:50 oblivian@deploy1002: Finished scap: Backport for Drop wmgMemoryLimitParsoid from IS.php (duration: 07m 21s)
  • 13:49 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1023.eqiad.wmnet with OS bullseye
  • 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P48891 and previous config saved to /var/cache/conftool/dbconfig/20230606-134524-ladsgroup.json
  • 13:45 oblivian@deploy1002: oblivian: Backport for Drop wmgMemoryLimitParsoid from IS.php synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 13:43 oblivian@deploy1002: Started scap: Backport for Drop wmgMemoryLimitParsoid from IS.php
  • 13:41 oblivian@deploy1002: Finished scap: Backport for Raise memory limit to match parsoid (T334980) (duration: 07m 53s)
  • 13:41 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
  • 13:41 elukey@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: sync
  • 13:35 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1-2]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
  • 13:35 oblivian@deploy1002: oblivian: Backport for Raise memory limit to match parsoid (T334980) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 13:34 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e1-eqiad.mgmt,lsw1-f[1-2]-eqiad.mgmt with reason: Migrate lsw1-f2-eqiad uplinks to spine
  • 13:33 oblivian@deploy1002: Started scap: Backport for Raise memory limit to match parsoid (T334980)
  • 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P48890 and previous config saved to /var/cache/conftool/dbconfig/20230606-133018-ladsgroup.json
  • 13:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48889 and previous config saved to /var/cache/conftool/dbconfig/20230606-131512-ladsgroup.json
  • 13:11 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 13:06 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: EventStreamConfig - Disable canary events and hadoop ingestion for development.network.probe - T332024 (duration: 07m 17s)
  • 13:00 eoghan@cumin1001: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48888 and previous config saved to /var/cache/conftool/dbconfig/20230606-125944-ladsgroup.json
  • 12:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T336886)', diff saved to https://phabricator.wikimedia.org/P48887 and previous config saved to /var/cache/conftool/dbconfig/20230606-125923-ladsgroup.json
  • 12:56 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_esams and A:cp
  • 12:55 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
  • 12:53 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1023.eqiad.wmnet with OS bullseye
  • 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P48886 and previous config saved to /var/cache/conftool/dbconfig/20230606-124417-ladsgroup.json
  • 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P48885 and previous config saved to /var/cache/conftool/dbconfig/20230606-122911-ladsgroup.json
  • 12:21 cgoubert@deploy1002: Finished scap: (no justification provided) (duration: 02m 10s)
  • 12:19 cgoubert@deploy1002: Started scap: (no justification provided)
  • 12:19 claime: redeploying 927218 to mw-on-k8s - T338121
  • 12:15 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T336886)', diff saved to https://phabricator.wikimedia.org/P48884 and previous config saved to /var/cache/conftool/dbconfig/20230606-121405-ladsgroup.json
  • 12:09 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 12:00 kamila@deploy1002: Finished scap: Backport for OAuthRateLimiter: Add rate limiting class for WME using LiftWing (T338121) (duration: 08m 54s)
  • 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T336886)', diff saved to https://phabricator.wikimedia.org/P48881 and previous config saved to /var/cache/conftool/dbconfig/20230606-115911-ladsgroup.json
  • 11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 11:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T336886)', diff saved to https://phabricator.wikimedia.org/P48880 and previous config saved to /var/cache/conftool/dbconfig/20230606-115833-ladsgroup.json
  • 11:53 kamila@deploy1002: kamila and klausman: Backport for OAuthRateLimiter: Add rate limiting class for WME using LiftWing (T338121) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 11:51 kamila@deploy1002: Started scap: Backport for OAuthRateLimiter: Add rate limiting class for WME using LiftWing (T338121)
  • 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P48879 and previous config saved to /var/cache/conftool/dbconfig/20230606-114327-ladsgroup.json
  • 11:38 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 11:37 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 11:31 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 11:31 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P48878 and previous config saved to /var/cache/conftool/dbconfig/20230606-112819-ladsgroup.json
  • 11:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T336886)', diff saved to https://phabricator.wikimedia.org/P48877 and previous config saved to /var/cache/conftool/dbconfig/20230606-111313-ladsgroup.json
  • 11:03 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrading Gitlab to 15.10.8
  • 10:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T336886)', diff saved to https://phabricator.wikimedia.org/P48876 and previous config saved to /var/cache/conftool/dbconfig/20230606-105756-ladsgroup.json
  • 10:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 10:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 10:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T336886)', diff saved to https://phabricator.wikimedia.org/P48875 and previous config saved to /var/cache/conftool/dbconfig/20230606-105724-ladsgroup.json
  • 10:53 urbanecm@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
  • 10:53 urbanecm@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
  • 10:52 urbanecm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
  • 10:51 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp in group1 wikis (T299954) (duration: 07m 03s)
  • 10:51 urbanecm@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
  • 10:50 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 10:50 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 10:50 urbanecm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
  • 10:50 urbanecm@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
  • 10:46 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp in group1 wikis (T299954) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 10:44 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp in group1 wikis (T299954)
  • 10:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P48874 and previous config saved to /var/cache/conftool/dbconfig/20230606-104218-ladsgroup.json
  • 10:30 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 10:30 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 10:28 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:28 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P48873 and previous config saved to /var/cache/conftool/dbconfig/20230606-102712-ladsgroup.json
  • 10:20 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 10:20 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 10:20 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:20 mwpresync@deploy1002: Pruned MediaWiki: 1.41.0-wmf.10 (duration: 02m 18s)
  • 10:20 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:18 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:18 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:18 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:17 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.41.0-wmf.12 refs T337526 (duration: 56m 25s)
  • 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T336886)', diff saved to https://phabricator.wikimedia.org/P48872 and previous config saved to /var/cache/conftool/dbconfig/20230606-101205-ladsgroup.json
  • 10:07 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:07 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:02 urbanecm@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
  • 10:01 urbanecm@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
  • 10:00 urbanecm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
  • 09:59 urbanecm@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
  • 09:58 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 09:58 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 09:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T336886)', diff saved to https://phabricator.wikimedia.org/P48871 and previous config saved to /var/cache/conftool/dbconfig/20230606-095512-ladsgroup.json
  • 09:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 09:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T336886)', diff saved to https://phabricator.wikimedia.org/P48870 and previous config saved to /var/cache/conftool/dbconfig/20230606-095451-ladsgroup.json
  • 09:41 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 09:41 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 09:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P48869 and previous config saved to /var/cache/conftool/dbconfig/20230606-093945-ladsgroup.json
  • 09:34 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_esams and A:cp
  • 09:31 fabfur@cumin1001: END (FAIL) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=1) rolling custom on A:cp-text_esams and A:cp
  • 09:27 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_esams and A:cp
  • 09:27 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 09:26 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 09:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P48867 and previous config saved to /var/cache/conftool/dbconfig/20230606-092439-ladsgroup.json
  • 09:21 mwpresync@deploy1002: Started scap: testwikis wikis to 1.41.0-wmf.12 refs T337526
  • 09:18 jynus: running systemctl start train-presync
  • 09:16 vgutierrez: restarting acme-chief and nginx on acme-chief instances
  • 09:11 claime: Building production images - T338014
  • 09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T336886)', diff saved to https://phabricator.wikimedia.org/P48866 and previous config saved to /var/cache/conftool/dbconfig/20230606-090933-ladsgroup.json
  • 08:59 urbanecm: deploy1002: run /usr/local/sbin/fix-staging-perms (T338205)
  • 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2002.codfw.wmnet
  • 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2002.codfw.wmnet
  • 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T336886)', diff saved to https://phabricator.wikimedia.org/P48865 and previous config saved to /var/cache/conftool/dbconfig/20230606-085337-ladsgroup.json
  • 08:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T336886)', diff saved to https://phabricator.wikimedia.org/P48864 and previous config saved to /var/cache/conftool/dbconfig/20230606-085317-ladsgroup.json
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1002.eqiad.wmnet
  • 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1002.eqiad.wmnet
  • 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P48863 and previous config saved to /var/cache/conftool/dbconfig/20230606-083810-ladsgroup.json
  • 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P48861 and previous config saved to /var/cache/conftool/dbconfig/20230606-082304-ladsgroup.json
  • 08:15 moritzm: installing openssl security updates on bullseye
  • 08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T336886)', diff saved to https://phabricator.wikimedia.org/P48860 and previous config saved to /var/cache/conftool/dbconfig/20230606-080758-ladsgroup.json
  • 07:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T336886)', diff saved to https://phabricator.wikimedia.org/P48859 and previous config saved to /var/cache/conftool/dbconfig/20230606-075210-ladsgroup.json
  • 07:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 07:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 07:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T336886)', diff saved to https://phabricator.wikimedia.org/P48858 and previous config saved to /var/cache/conftool/dbconfig/20230606-075149-ladsgroup.json
  • 07:47 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_esams and A:cp
  • 07:42 dcausse@deploy1002: Finished scap: Backport for ttm: use new config option to separate readable and writable services (T322284) (duration: 15m 20s)
  • 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P48857 and previous config saved to /var/cache/conftool/dbconfig/20230606-073643-ladsgroup.json
  • 07:28 dcausse@deploy1002: dcausse: Backport for ttm: use new config option to separate readable and writable services (T322284) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 07:27 dcausse@deploy1002: Started scap: Backport for ttm: use new config option to separate readable and writable services (T322284)
  • 07:22 kharlan@deploy1002: Finished scap: Backport for checkuser: Disable client hints feature by default (T337944) (duration: 08m 14s)
  • 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P48856 and previous config saved to /var/cache/conftool/dbconfig/20230606-072137-ladsgroup.json
  • 07:16 kharlan@deploy1002: kharlan: Backport for checkuser: Disable client hints feature by default (T337944) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 07:14 kharlan@deploy1002: Started scap: Backport for checkuser: Disable client hints feature by default (T337944)
  • 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T336886)', diff saved to https://phabricator.wikimedia.org/P48855 and previous config saved to /var/cache/conftool/dbconfig/20230606-070631-ladsgroup.json
  • 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T336886)', diff saved to https://phabricator.wikimedia.org/P48854 and previous config saved to /var/cache/conftool/dbconfig/20230606-065057-ladsgroup.json
  • 06:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 06:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 06:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 06:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 06:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 06:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 06:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 06:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 06:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T336886)', diff saved to https://phabricator.wikimedia.org/P48853 and previous config saved to /var/cache/conftool/dbconfig/20230606-060807-ladsgroup.json
  • 05:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P48852 and previous config saved to /var/cache/conftool/dbconfig/20230606-055301-ladsgroup.json
  • 05:50 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'configure' for AS: 2518
  • 05:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 2518
  • 05:49 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 2518
  • 05:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 2518
  • 05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P48851 and previous config saved to /var/cache/conftool/dbconfig/20230606-053755-ladsgroup.json
  • 05:34 Amir1: ladsgroup@clouddb1021:/srv/sqldata.s1$ sudo rm db1196* (T337961)
  • 05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T336886)', diff saved to https://phabricator.wikimedia.org/P48850 and previous config saved to /var/cache/conftool/dbconfig/20230606-052249-ladsgroup.json
  • 05:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T336886)', diff saved to https://phabricator.wikimedia.org/P48849 and previous config saved to /var/cache/conftool/dbconfig/20230606-051938-ladsgroup.json
  • 05:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 05:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 05:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T336886)', diff saved to https://phabricator.wikimedia.org/P48848 and previous config saved to /var/cache/conftool/dbconfig/20230606-051918-ladsgroup.json
  • 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P48847 and previous config saved to /var/cache/conftool/dbconfig/20230606-050410-ladsgroup.json
  • 04:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P48846 and previous config saved to /var/cache/conftool/dbconfig/20230606-044904-ladsgroup.json
  • 04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T336886)', diff saved to https://phabricator.wikimedia.org/P48845 and previous config saved to /var/cache/conftool/dbconfig/20230606-043358-ladsgroup.json
  • 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T336886)', diff saved to https://phabricator.wikimedia.org/P48844 and previous config saved to /var/cache/conftool/dbconfig/20230606-043047-ladsgroup.json
  • 04:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 04:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T336886)', diff saved to https://phabricator.wikimedia.org/P48843 and previous config saved to /var/cache/conftool/dbconfig/20230606-043026-ladsgroup.json
  • 04:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P48842 and previous config saved to /var/cache/conftool/dbconfig/20230606-041520-ladsgroup.json
  • 04:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P48841 and previous config saved to /var/cache/conftool/dbconfig/20230606-040013-ladsgroup.json
  • 03:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T336886)', diff saved to https://phabricator.wikimedia.org/P48840 and previous config saved to /var/cache/conftool/dbconfig/20230606-034506-ladsgroup.json
  • 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T336886)', diff saved to https://phabricator.wikimedia.org/P48839 and previous config saved to /var/cache/conftool/dbconfig/20230606-034256-ladsgroup.json
  • 03:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 03:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T336886)', diff saved to https://phabricator.wikimedia.org/P48838 and previous config saved to /var/cache/conftool/dbconfig/20230606-034235-ladsgroup.json
  • 03:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 03:32 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:32 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 03:31 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 03:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P48837 and previous config saved to /var/cache/conftool/dbconfig/20230606-032729-ladsgroup.json
  • 03:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P48836 and previous config saved to /var/cache/conftool/dbconfig/20230606-031223-ladsgroup.json
  • 02:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T336886)', diff saved to https://phabricator.wikimedia.org/P48835 and previous config saved to /var/cache/conftool/dbconfig/20230606-025717-ladsgroup.json
  • 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T336886)', diff saved to https://phabricator.wikimedia.org/P48834 and previous config saved to /var/cache/conftool/dbconfig/20230606-025507-ladsgroup.json
  • 02:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 02:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 02:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48833 and previous config saved to /var/cache/conftool/dbconfig/20230606-021622-ladsgroup.json
  • 02:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 02:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48832 and previous config saved to /var/cache/conftool/dbconfig/20230606-020616-ladsgroup.json
  • 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P48831 and previous config saved to /var/cache/conftool/dbconfig/20230606-020116-ladsgroup.json
  • 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P48830 and previous config saved to /var/cache/conftool/dbconfig/20230606-015110-ladsgroup.json
  • 01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P48829 and previous config saved to /var/cache/conftool/dbconfig/20230606-014610-ladsgroup.json
  • 01:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P48828 and previous config saved to /var/cache/conftool/dbconfig/20230606-013604-ladsgroup.json
  • 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48827 and previous config saved to /var/cache/conftool/dbconfig/20230606-013104-ladsgroup.json
  • 01:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48826 and previous config saved to /var/cache/conftool/dbconfig/20230606-012058-ladsgroup.json
  • 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48825 and previous config saved to /var/cache/conftool/dbconfig/20230606-010704-ladsgroup.json
  • 01:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 01:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 01:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48824 and previous config saved to /var/cache/conftool/dbconfig/20230606-010643-ladsgroup.json
  • 00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T336886)', diff saved to https://phabricator.wikimedia.org/P48823 and previous config saved to /var/cache/conftool/dbconfig/20230606-005357-ladsgroup.json
  • 00:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T336886)', diff saved to https://phabricator.wikimedia.org/P48822 and previous config saved to /var/cache/conftool/dbconfig/20230606-005336-ladsgroup.json
  • 00:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P48821 and previous config saved to /var/cache/conftool/dbconfig/20230606-005137-ladsgroup.json
  • 00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P48820 and previous config saved to /var/cache/conftool/dbconfig/20230606-003830-ladsgroup.json
  • 00:37 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P48819 and previous config saved to /var/cache/conftool/dbconfig/20230606-003631-ladsgroup.json
  • 00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P48818 and previous config saved to /var/cache/conftool/dbconfig/20230606-002324-ladsgroup.json
  • 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48817 and previous config saved to /var/cache/conftool/dbconfig/20230606-002125-ladsgroup.json
  • 00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48816 and previous config saved to /var/cache/conftool/dbconfig/20230606-001914-ladsgroup.json
  • 00:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 00:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 00:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 00:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T336886)', diff saved to https://phabricator.wikimedia.org/P48815 and previous config saved to /var/cache/conftool/dbconfig/20230606-001836-ladsgroup.json
  • 00:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T336886)', diff saved to https://phabricator.wikimedia.org/P48814 and previous config saved to /var/cache/conftool/dbconfig/20230606-000818-ladsgroup.json
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P48813 and previous config saved to /var/cache/conftool/dbconfig/20230606-000330-ladsgroup.json

2023-06-05

  • 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T336886)', diff saved to https://phabricator.wikimedia.org/P48812 and previous config saved to /var/cache/conftool/dbconfig/20230605-235346-ladsgroup.json
  • 23:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 23:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 23:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 23:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T336886)', diff saved to https://phabricator.wikimedia.org/P48811 and previous config saved to /var/cache/conftool/dbconfig/20230605-235310-ladsgroup.json
  • 23:49 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp in group0 wikis (T299954) (duration: 07m 02s)
  • 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P48810 and previous config saved to /var/cache/conftool/dbconfig/20230605-234824-ladsgroup.json
  • 23:43 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp in group0 wikis (T299954) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 23:42 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp in group0 wikis (T299954)
  • 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P48809 and previous config saved to /var/cache/conftool/dbconfig/20230605-233804-ladsgroup.json
  • 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T336886)', diff saved to https://phabricator.wikimedia.org/P48808 and previous config saved to /var/cache/conftool/dbconfig/20230605-233318-ladsgroup.json
  • 23:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T336886)', diff saved to https://phabricator.wikimedia.org/P48807 and previous config saved to /var/cache/conftool/dbconfig/20230605-233107-ladsgroup.json
  • 23:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 23:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48806 and previous config saved to /var/cache/conftool/dbconfig/20230605-233046-ladsgroup.json
  • 23:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:25 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 23:24 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P48805 and previous config saved to /var/cache/conftool/dbconfig/20230605-232258-ladsgroup.json
  • 23:22 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:22 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 23:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=93) for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P48804 and previous config saved to /var/cache/conftool/dbconfig/20230605-231540-ladsgroup.json
  • 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002"
  • 23:14 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove mgmt DNS for ssw1-a1 for testing - pt1979@cumin2002"
  • 23:12 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:11 jforrester@deploy1002: Finished deploy [integration/docroot@6eefe56]: I5c1b92 for T334492 (duration: 00m 05s)
  • 23:10 jforrester@deploy1002: Started deploy [integration/docroot@6eefe56]: I5c1b92 for T334492
  • 23:09 jforrester@deploy1002: Finished deploy [integration/docroot@ab77611]: Idf6c7a (duration: 00m 08s)
  • 23:09 jforrester@deploy1002: Started deploy [integration/docroot@ab77611]: Idf6c7a
  • 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T336886)', diff saved to https://phabricator.wikimedia.org/P48803 and previous config saved to /var/cache/conftool/dbconfig/20230605-230752-ladsgroup.json
  • 23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P48802 and previous config saved to /var/cache/conftool/dbconfig/20230605-230034-ladsgroup.json
  • 22:57 mutante: contint2001 - sudo systemctl restart apache2
  • 22:57 mutante: contint2001 - sudo apt-get remove --purge libapache2-mod-php7.3 php7.3-cli php7.3-common php7.3-json php7.3-opcache php7.3-readline
  • 22:55 jforrester@deploy1002: Finished deploy [integration/docroot@8255d99]: I6c7575 for T337425 (duration: 00m 13s)
  • 22:55 jforrester@deploy1002: Started deploy [integration/docroot@8255d99]: I6c7575 for T337425
  • 22:53 mutante: contint2001 (prod main CI server) - upgrading PHP 7.3 to 7.4
  • 22:49 zabe@deploy1002: Finished scap: Backport for Stop writing to revision_comment_temp in testwiki (T299954) (duration: 09m 13s)
  • 22:46 mutante: contint2002, contint1002 - upgrading PHP from 7.3 to 7.4
  • 22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48801 and previous config saved to /var/cache/conftool/dbconfig/20230605-224528-ladsgroup.json
  • 22:41 zabe@deploy1002: zabe: Backport for Stop writing to revision_comment_temp in testwiki (T299954) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 22:40 zabe@deploy1002: Started scap: Backport for Stop writing to revision_comment_temp in testwiki (T299954)
  • 22:37 ladsgroup@deploy1002: Finished scap: Backport for moveToExternal: Actually convert encoding of cur_text (T337700) (duration: 09m 04s)
  • 22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T336886)', diff saved to https://phabricator.wikimedia.org/P48800 and previous config saved to /var/cache/conftool/dbconfig/20230605-223035-ladsgroup.json
  • 22:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 22:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 22:29 ladsgroup@deploy1002: ladsgroup: Backport for moveToExternal: Actually convert encoding of cur_text (T337700) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 22:28 ladsgroup@deploy1002: Started scap: Backport for moveToExternal: Actually convert encoding of cur_text (T337700)
  • 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48799 and previous config saved to /var/cache/conftool/dbconfig/20230605-222745-ladsgroup.json
  • 22:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 22:24 ladsgroup@deploy1002: Finished scap: Backport for Revert "Remove legacy encoding option from dawiktionary" (duration: 07m 40s)
  • 22:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T335845)', diff saved to https://phabricator.wikimedia.org/P48798 and previous config saved to /var/cache/conftool/dbconfig/20230605-222339-ladsgroup.json
  • 22:18 ladsgroup@deploy1002: ladsgroup: Backport for Revert "Remove legacy encoding option from dawiktionary" synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 22:17 ladsgroup@deploy1002: Started scap: Backport for Revert "Remove legacy encoding option from dawiktionary"
  • 22:13 ladsgroup@deploy1002: Finished scap: Backport for Help measure the impact of saneitizer jobs (T336698) (duration: 09m 48s)
  • 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48797 and previous config saved to /var/cache/conftool/dbconfig/20230605-220833-ladsgroup.json
  • 22:05 ladsgroup@deploy1002: ladsgroup: Backport for Help measure the impact of saneitizer jobs (T336698) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 22:03 ladsgroup@deploy1002: Started scap: Backport for Help measure the impact of saneitizer jobs (T336698)
  • 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs1016.eqiad.wmnet
  • 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
  • 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48796 and previous config saved to /var/cache/conftool/dbconfig/20230605-215345-ladsgroup.json
  • 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P48795 and previous config saved to /var/cache/conftool/dbconfig/20230605-215326-ladsgroup.json
  • 21:51 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
  • 21:50 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs1016.eqiad.wmnet
  • 21:42 urbanecm@deploy1002: Finished scap: Backport for NewImpact: Fix renderMode parsing for Special:Impact (T338085) (duration: 25m 38s)
  • 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P48794 and previous config saved to /var/cache/conftool/dbconfig/20230605-213839-ladsgroup.json
  • 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T335845)', diff saved to https://phabricator.wikimedia.org/P48793 and previous config saved to /var/cache/conftool/dbconfig/20230605-213819-ladsgroup.json
  • 21:35 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs1015.eqiad.wmnet
  • 21:35 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
  • 21:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T335845)', diff saved to https://phabricator.wikimedia.org/P48792 and previous config saved to /var/cache/conftool/dbconfig/20230605-213202-ladsgroup.json
  • 21:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 21:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 21:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 21:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 21:30 urbanecm@deploy1002: urbanecm: Backport for NewImpact: Fix renderMode parsing for Special:Impact (T338085) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 21:29 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 21:29 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 21:25 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
  • 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P48791 and previous config saved to /var/cache/conftool/dbconfig/20230605-212333-ladsgroup.json
  • 21:23 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs1015.eqiad.wmnet
  • 21:18 urbanecm@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 21:17 urbanecm@deploy1002: Started scap: Backport for NewImpact: Fix renderMode parsing for Special:Impact (T338085)
  • 21:16 urbanecm@deploy1002: Finished scap: Backport for Update interwiki cache (T338093) (duration: 24m 34s)
  • 21:15 urbanecm@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48790 and previous config saved to /var/cache/conftool/dbconfig/20230605-210827-ladsgroup.json
  • 21:05 urbanecm@deploy1002: urbanecm: Backport for Update interwiki cache (T338093) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 20:51 urbanecm@deploy1002: Started scap: Backport for Update interwiki cache (T338093)
  • 20:48 cjming: end of UTC late backport window
  • 20:47 urbanecm: [urbanecm@deploy1002 ~]$ sudo /usr/local/sbin/fix-staging-perms # verify T338180 fix
  • away: payments-wiki upgraded from 2b4203df to f3b229c6
  • 20:46 cjming@deploy1002: Finished scap: Backport for Revert "Revert "VisualEditorFeatureUse sampling rate to 1 everywhere"" (duration: 09m 57s)
  • 20:38 cjming@deploy1002: cjming: Backport for Revert "Revert "VisualEditorFeatureUse sampling rate to 1 everywhere"" synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:36 cjming@deploy1002: Started scap: Backport for Revert "Revert "VisualEditorFeatureUse sampling rate to 1 everywhere""
  • 20:35 cjming@deploy1002: Finished scap: Backport for Add initial stream configs for Android article events using Metrics Platform Java client library (T330355) (duration: 24m 57s)
  • 20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T336886)', diff saved to https://phabricator.wikimedia.org/P48789 and previous config saved to /var/cache/conftool/dbconfig/20230605-202916-ladsgroup.json
  • 20:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 20:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 20:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T336886)', diff saved to https://phabricator.wikimedia.org/P48788 and previous config saved to /var/cache/conftool/dbconfig/20230605-202855-ladsgroup.json
  • 20:23 cjming@deploy1002: cjming: Backport for Add initial stream configs for Android article events using Metrics Platform Java client library (T330355) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P48787 and previous config saved to /var/cache/conftool/dbconfig/20230605-201349-ladsgroup.json
  • 20:10 cjming@deploy1002: Started scap: Backport for Add initial stream configs for Android article events using Metrics Platform Java client library (T330355)
  • 20:09 urbanecm: [urbanecm@deploy1002 ~]$ sudo /usr/local/sbin/fix-staging-perms # attempt to fix permission errors when doing a backport
  • 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P48786 and previous config saved to /var/cache/conftool/dbconfig/20230605-195842-ladsgroup.json
  • 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T336886)', diff saved to https://phabricator.wikimedia.org/P48785 and previous config saved to /var/cache/conftool/dbconfig/20230605-194336-ladsgroup.json
  • 19:32 brett: Maglev LVS scheduler rollout in eqiad finished (puppet re-enabled) - T263797
  • 19:12 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs2011.codfw.wmnet
  • 19:12 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
  • 19:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 19:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T336886)', diff saved to https://phabricator.wikimedia.org/P48784 and previous config saved to /var/cache/conftool/dbconfig/20230605-190702-ladsgroup.json
  • 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T336886)', diff saved to https://phabricator.wikimedia.org/P48783 and previous config saved to /var/cache/conftool/dbconfig/20230605-190528-ladsgroup.json
  • 19:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 19:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 19:03 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
  • 18:58 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs2011.codfw.wmnet
  • 18:56 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
  • 18:52 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: revert - remove undeeded wgEventBusStreamNamesMap override setting (take 2) - T336817 (duration: 11m 54s)
  • 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P48782 and previous config saved to /var/cache/conftool/dbconfig/20230605-185156-ladsgroup.json
  • 18:48 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
  • 18:48 inflatador: bking@cumin1001 depooling wdqs2011for fw update T331297
  • 18:48 inflatador: bking@cumin1001 repooling wdqs2010 T331297
  • 18:45 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
  • 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P48781 and previous config saved to /var/cache/conftool/dbconfig/20230605-183650-ladsgroup.json
  • 18:35 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
  • 18:32 inflatador: bking@cumin1001 depooling wdqs2010 for fw update T331297
  • 18:30 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: revert - Remove unused page_change rc streams - T336817 (duration: 11m 23s)
  • 18:29 sukhe: homer "cr*-eqiad*" commit "Gerrit: 927246 remove old gerrit service IP"
  • 18:28 brett: Maglev LVS scheduler rollout in eqiad (puppet disabled) - T263797
  • 18:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T336886)', diff saved to https://phabricator.wikimedia.org/P48780 and previous config saved to /var/cache/conftool/dbconfig/20230605-182144-ladsgroup.json
  • 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1224 (T336886)', diff saved to https://phabricator.wikimedia.org/P48779 and previous config saved to /var/cache/conftool/dbconfig/20230605-181935-ladsgroup.json
  • 18:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 18:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T336886)', diff saved to https://phabricator.wikimedia.org/P48778 and previous config saved to /var/cache/conftool/dbconfig/20230605-181915-ladsgroup.json
  • 18:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T336886)', diff saved to https://phabricator.wikimedia.org/P48777 and previous config saved to /var/cache/conftool/dbconfig/20230605-181219-ladsgroup.json
  • 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P48776 and previous config saved to /var/cache/conftool/dbconfig/20230605-180408-ladsgroup.json
  • 17:58 btullis@puppetmaster1001: conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
  • 17:58 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
  • 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P48775 and previous config saved to /var/cache/conftool/dbconfig/20230605-175712-ladsgroup.json
  • 17:50 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: no-op: Remove unused page_change rc streams - T336817 (duration: 20m 11s)
  • 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P48774 and previous config saved to /var/cache/conftool/dbconfig/20230605-174902-ladsgroup.json
  • 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P48773 and previous config saved to /var/cache/conftool/dbconfig/20230605-174206-ladsgroup.json
  • 17:38 cdanis@deploy1002: Finished scap: Backport for Enable user network probe events (T332024) (duration: 10m 02s)
  • 17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T336886)', diff saved to https://phabricator.wikimedia.org/P48772 and previous config saved to /var/cache/conftool/dbconfig/20230605-173356-ladsgroup.json
  • 17:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1213:3316 (T336886)', diff saved to https://phabricator.wikimedia.org/P48771 and previous config saved to /var/cache/conftool/dbconfig/20230605-173002-ladsgroup.json
  • 17:30 cdanis@deploy1002: cdanis: Backport for Enable user network probe events (T332024) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 17:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 17:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 17:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T336886)', diff saved to https://phabricator.wikimedia.org/P48770 and previous config saved to /var/cache/conftool/dbconfig/20230605-172942-ladsgroup.json
  • 17:28 cdanis@deploy1002: Started scap: Backport for Enable user network probe events (T332024)
  • 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T336886)', diff saved to https://phabricator.wikimedia.org/P48769 and previous config saved to /var/cache/conftool/dbconfig/20230605-172700-ladsgroup.json
  • 17:26 cdanis@deploy1002: Backport cancelled.
  • 17:26 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: Remove undeeded wgEventBusStreamNamesMap override setting (take 2) - T336817 (duration: 09m 25s)
  • 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1223 (T336886)', diff saved to https://phabricator.wikimedia.org/P48768 and previous config saved to /var/cache/conftool/dbconfig/20230605-172124-ladsgroup.json
  • 17:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 17:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T336886)', diff saved to https://phabricator.wikimedia.org/P48767 and previous config saved to /var/cache/conftool/dbconfig/20230605-172103-ladsgroup.json
  • 17:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P48766 and previous config saved to /var/cache/conftool/dbconfig/20230605-171436-ladsgroup.json
  • 17:12 cdanis@deploy1002: Backport cancelled.
  • 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P48765 and previous config saved to /var/cache/conftool/dbconfig/20230605-170557-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P48764 and previous config saved to /var/cache/conftool/dbconfig/20230605-165929-ladsgroup.json
  • 16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P48763 and previous config saved to /var/cache/conftool/dbconfig/20230605-165051-ladsgroup.json
  • 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T336886)', diff saved to https://phabricator.wikimedia.org/P48762 and previous config saved to /var/cache/conftool/dbconfig/20230605-164423-ladsgroup.json
  • 16:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2013.codfw.wmnet with OS bullseye
  • 16:37 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1201 (T336886)', diff saved to https://phabricator.wikimedia.org/P48761 and previous config saved to /var/cache/conftool/dbconfig/20230605-163714-ladsgroup.json
  • 16:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T336886)', diff saved to https://phabricator.wikimedia.org/P48760 and previous config saved to /var/cache/conftool/dbconfig/20230605-163653-ladsgroup.json
  • 16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T336886)', diff saved to https://phabricator.wikimedia.org/P48759 and previous config saved to /var/cache/conftool/dbconfig/20230605-163545-ladsgroup.json
  • 16:35 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1212 (T336886)', diff saved to https://phabricator.wikimedia.org/P48758 and previous config saved to /var/cache/conftool/dbconfig/20230605-162707-ladsgroup.json
  • 16:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T336886)', diff saved to https://phabricator.wikimedia.org/P48757 and previous config saved to /var/cache/conftool/dbconfig/20230605-162629-ladsgroup.json
  • 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P48756 and previous config saved to /var/cache/conftool/dbconfig/20230605-162147-ladsgroup.json
  • 16:21 btullis@puppetmaster1001: conftool action : set/pooled=no; selector: service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
  • 16:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
  • 16:19 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
  • 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: host reimage
  • 16:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P48755 and previous config saved to /var/cache/conftool/dbconfig/20230605-161123-ladsgroup.json
  • 16:08 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P48754 and previous config saved to /var/cache/conftool/dbconfig/20230605-160640-ladsgroup.json
  • 16:06 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 16:06 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 16:06 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 16:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 16:05 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 16:05 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 15:59 bblack: mw1419: manually executing a php restart to test new safe-service-restart
  • 15:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P48753 and previous config saved to /var/cache/conftool/dbconfig/20230605-155617-ladsgroup.json
  • 15:55 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2013.codfw.wmnet with OS bullseye
  • 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T336886)', diff saved to https://phabricator.wikimedia.org/P48752 and previous config saved to /var/cache/conftool/dbconfig/20230605-155134-ladsgroup.json
  • 15:51 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs2013']
  • 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1187 (T336886)', diff saved to https://phabricator.wikimedia.org/P48751 and previous config saved to /var/cache/conftool/dbconfig/20230605-154926-ladsgroup.json
  • 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T336886)', diff saved to https://phabricator.wikimedia.org/P48750 and previous config saved to /var/cache/conftool/dbconfig/20230605-154905-ladsgroup.json
  • 15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T336886)', diff saved to https://phabricator.wikimedia.org/P48749 and previous config saved to /var/cache/conftool/dbconfig/20230605-154110-ladsgroup.json
  • 15:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2013']
  • 15:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs2013']
  • 15:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs2013']
  • 15:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T336886)', diff saved to https://phabricator.wikimedia.org/P48748 and previous config saved to /var/cache/conftool/dbconfig/20230605-153542-ladsgroup.json
  • 15:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 15:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 15:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T336886)', diff saved to https://phabricator.wikimedia.org/P48747 and previous config saved to /var/cache/conftool/dbconfig/20230605-153521-ladsgroup.json
  • 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P48746 and previous config saved to /var/cache/conftool/dbconfig/20230605-153359-ladsgroup.json
  • 15:33 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
  • 15:33 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
  • 15:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2013.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:27 Amir1: on s3 master: update `text` set old_text = 'O:18:"historyblobcurstub":1:{s:6:"mCurId";i:5532;}', old_flags = 'object' where old_id= 14484; (T337700)
  • 15:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P48745 and previous config saved to /var/cache/conftool/dbconfig/20230605-152015-ladsgroup.json
  • 15:19 moritzm: installing debian-archive-keyring updates on bullseye hosts
  • 15:19 mforns@deploy1002: Finished deploy [airflow-dags/analytics@674ec0a]: (no justification provided) (duration: 00m 17s)
  • 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P48744 and previous config saved to /var/cache/conftool/dbconfig/20230605-151853-ladsgroup.json
  • 15:18 mforns@deploy1002: Started deploy [airflow-dags/analytics@674ec0a]: (no justification provided)
  • 15:18 sukhe@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T326767 (duration: 102m 46s)
  • 15:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host lvs2013.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Setup DNS for lvs2013 - pt1979@cumin2002"
  • 15:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Setup DNS for lvs2013 - pt1979@cumin2002"
  • 15:05 moritzm: installing avahi security updates
  • 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P48742 and previous config saved to /var/cache/conftool/dbconfig/20230605-150509-ladsgroup.json
  • 15:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T336886)', diff saved to https://phabricator.wikimedia.org/P48741 and previous config saved to /var/cache/conftool/dbconfig/20230605-150347-ladsgroup.json
  • 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T336886)', diff saved to https://phabricator.wikimedia.org/P48740 and previous config saved to /var/cache/conftool/dbconfig/20230605-150138-ladsgroup.json
  • 15:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 15:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T336886)', diff saved to https://phabricator.wikimedia.org/P48739 and previous config saved to /var/cache/conftool/dbconfig/20230605-150117-ladsgroup.json
  • 14:55 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:55 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:52 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:52 otto@deploy1002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:50 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:50 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T336886)', diff saved to https://phabricator.wikimedia.org/P48738 and previous config saved to /var/cache/conftool/dbconfig/20230605-145003-ladsgroup.json
  • 14:48 sukhe: homer "cr*-codfw*" commit "Gerrit: 927208 remove decommissioned host lvs2009": T335777
  • 14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs2009.codfw.wmnet
  • 14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 14:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P48737 and previous config saved to /var/cache/conftool/dbconfig/20230605-144611-ladsgroup.json
  • 14:45 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T336886)', diff saved to https://phabricator.wikimedia.org/P48736 and previous config saved to /var/cache/conftool/dbconfig/20230605-144438-ladsgroup.json
  • 14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T336886)', diff saved to https://phabricator.wikimedia.org/P48735 and previous config saved to /var/cache/conftool/dbconfig/20230605-144417-ladsgroup.json
  • 14:42 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 14:32 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs2009.codfw.wmnet
  • 14:31 ejegg: payments-wiki upgraded from c2f9f8b5 to 2b4203df
  • 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P48734 and previous config saved to /var/cache/conftool/dbconfig/20230605-143105-ladsgroup.json
  • 14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P48733 and previous config saved to /var/cache/conftool/dbconfig/20230605-142911-ladsgroup.json
  • 14:28 sukhe: codfw low-traffic LVS: set routing-options static route 10.2.1.0/24 next-hop 10.192.49.7
  • 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T336886)', diff saved to https://phabricator.wikimedia.org/P48732 and previous config saved to /var/cache/conftool/dbconfig/20230605-141559-ladsgroup.json
  • 14:15 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:15 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T336886)', diff saved to https://phabricator.wikimedia.org/P48731 and previous config saved to /var/cache/conftool/dbconfig/20230605-141451-ladsgroup.json
  • 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T336886)', diff saved to https://phabricator.wikimedia.org/P48730 and previous config saved to /var/cache/conftool/dbconfig/20230605-141430-ladsgroup.json
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P48729 and previous config saved to /var/cache/conftool/dbconfig/20230605-141405-ladsgroup.json
  • 14:08 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:08 otto@deploy1002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P48728 and previous config saved to /var/cache/conftool/dbconfig/20230605-135924-ladsgroup.json
  • 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T336886)', diff saved to https://phabricator.wikimedia.org/P48727 and previous config saved to /var/cache/conftool/dbconfig/20230605-135859-ladsgroup.json
  • 13:57 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:56 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T336886)', diff saved to https://phabricator.wikimedia.org/P48726 and previous config saved to /var/cache/conftool/dbconfig/20230605-135332-ladsgroup.json
  • 13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T336886)', diff saved to https://phabricator.wikimedia.org/P48725 and previous config saved to /var/cache/conftool/dbconfig/20230605-135311-ladsgroup.json
  • 13:46 moritzm: installing python-ipaddress security updates
  • 13:45 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
  • 13:44 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: Host under maintenance
  • 13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P48724 and previous config saved to /var/cache/conftool/dbconfig/20230605-134418-ladsgroup.json
  • 13:44 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1002.eqiad.wmnet with reason: Host under maintenance
  • 13:43 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1002.eqiad.wmnet with reason: Host under maintenance
  • 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T335845)', diff saved to https://phabricator.wikimedia.org/P48723 and previous config saved to /var/cache/conftool/dbconfig/20230605-134313-ladsgroup.json
  • 13:41 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:41 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P48722 and previous config saved to /var/cache/conftool/dbconfig/20230605-133805-ladsgroup.json
  • 13:36 sukhe@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T326767
  • 13:35 sukhe@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T322937 (duration: 01m 06s)
  • 13:35 sukhe@deploy1002: Locking from deployment [ALL REPOSITORIES]: LVS maintenance in codfw, blocking deploys T322937
  • 13:35 bblack@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: temporary lock for LVS resarts in core DCs (duration: 05m 54s)
  • 13:32 bblack: lvs1* (eqiad) - restart pybal for T334703 IPs
  • 13:29 bblack: lvs2* (codfw) - restart pybal for T334703 IPs
  • 13:29 bblack@deploy1002: Locking from deployment [ALL REPOSITORIES]: temporary lock for LVS resarts in core DCs
  • 13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T336886)', diff saved to https://phabricator.wikimedia.org/P48721 and previous config saved to /var/cache/conftool/dbconfig/20230605-132911-ladsgroup.json
  • 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P48720 and previous config saved to /var/cache/conftool/dbconfig/20230605-132807-ladsgroup.json
  • 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T336886)', diff saved to https://phabricator.wikimedia.org/P48719 and previous config saved to /var/cache/conftool/dbconfig/20230605-132703-ladsgroup.json
  • 13:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 13:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T336886)', diff saved to https://phabricator.wikimedia.org/P48718 and previous config saved to /var/cache/conftool/dbconfig/20230605-132642-ladsgroup.json
  • 13:25 hashar: Restarted Zuul due to stall ssh connection # T309376
  • 13:25 bblack: lvs3* (esams) - restart pybal for T334703 IPs
  • 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P48717 and previous config saved to /var/cache/conftool/dbconfig/20230605-132259-ladsgroup.json
  • 13:19 bblack: lvs5* (eqsin) - restart pybal for T334703 IPs
  • 13:17 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:15 bblack: lvs6* (drmrs) - restart pybal for T334703 IPs
  • 13:14 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Make outreachwiki a multilingual Wikidata client (T171140) (duration: 10m 06s)
  • 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P48716 and previous config saved to /var/cache/conftool/dbconfig/20230605-131301-ladsgroup.json
  • 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P48715 and previous config saved to /var/cache/conftool/dbconfig/20230605-131136-ladsgroup.json
  • 13:09 bblack: lvs4* (ulsfo) - restart pybal for T334703 IPs
  • 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T336886)', diff saved to https://phabricator.wikimedia.org/P48714 and previous config saved to /var/cache/conftool/dbconfig/20230605-130753-ladsgroup.json
  • 13:05 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde: Backport for Make outreachwiki a multilingual Wikidata client (T171140) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 13:04 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Make outreachwiki a multilingual Wikidata client (T171140)
  • 13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T336886)', diff saved to https://phabricator.wikimedia.org/P48713 and previous config saved to /var/cache/conftool/dbconfig/20230605-130228-ladsgroup.json
  • 13:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 13:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T335845)', diff saved to https://phabricator.wikimedia.org/P48712 and previous config saved to /var/cache/conftool/dbconfig/20230605-125754-ladsgroup.json
  • 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P48711 and previous config saved to /var/cache/conftool/dbconfig/20230605-125630-ladsgroup.json
  • 12:52 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 12:51 Amir1: killed prioritizeFilesWithTemplate.php, stopping depool maint.
  • 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 12:44 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
  • 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T335845)', diff saved to https://phabricator.wikimedia.org/P48710 and previous config saved to /var/cache/conftool/dbconfig/20230605-124444-ladsgroup.json
  • 12:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 12:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 12:43 jmm@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe
  • 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T336886)', diff saved to https://phabricator.wikimedia.org/P48709 and previous config saved to /var/cache/conftool/dbconfig/20230605-124124-ladsgroup.json
  • 12:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T336886)', diff saved to https://phabricator.wikimedia.org/P48708 and previous config saved to /var/cache/conftool/dbconfig/20230605-123915-ladsgroup.json
  • 12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:39 jmm@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe
  • 12:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 12:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 12:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 12:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 12:17 jynus: creating a copy of db1157 binlogs on dbprov1004 T338128
  • 12:15 bblack: lvs*: disabling puppet to roll out new LVS IPs in https://gerrit.wikimedia.org/r/c/operations/puppet/+/924593 - T334703
  • 12:15 bblack: lvs*: disabling puppet to roll out new LVS IPs in https://gerrit.wikimedia.org/r/c/operations/puppet/+/924593 - T334703
  • 12:15 jbond@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=puppetboard-next
  • 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.elasticsearch.restart-nginx (exit_code=0) rolling restart_daemons on A:relforge
  • 11:45 jmm@cumin2002: START - Cookbook sre.elasticsearch.restart-nginx rolling restart_daemons on A:relforge
  • 11:39 jbond@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=puppetboard-next
  • 11:21 moritzm: restarting Exim on MXes to pick up OpenSSL updates
  • 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling restart_daemons on A:ncredir
  • 11:13 moritzm: bounced ferm on ml-serve2006 (race caused by firewall profile change)
  • 11:08 jmm@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling restart_daemons on A:ncredir
  • 10:31 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling restart_daemons on A:ldap-replicas
  • 10:29 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling restart_daemons on A:ldap-replicas
  • 10:14 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:14 aborrero@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirts - aborrero@cumin1001"
  • 10:13 aborrero@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirts - aborrero@cumin1001"
  • 10:11 moritzm: installing openssl security updates on Bullseye
  • 10:08 aborrero@cumin1001: START - Cookbook sre.dns.netbox
  • 10:06 godog: truncate xff.log and JobExecutor.log on mwlog1002 to reclaim space - T338127
  • 09:41 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 09:39 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 09:39 claime: roll-restart thumbor in eqiad - T337649
  • 09:39 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 09:38 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=thumbor.*
  • 09:38 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 09:37 claime: roll-restart thumbor in codfw - T337649
  • 08:40 claime: power-cycling restbase1027 - T338122
  • 07:54 moritzm: installing containerd security updates
  • 07:38 kartik@deploy1002: Finished scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337669) (duration: 09m 58s)
  • 07:30 kartik@deploy1002: kartik: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337669) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 07:28 kartik@deploy1002: Started scap: Backport for testwiki: Enable Section Translation for 10 Wikipedias (T337669)
  • 07:25 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 07:23 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 07:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 07:23 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 07:21 taavi@deploy1002: Finished scap: Backport for [SearchVue] Enable on Norwegian, Hungarian, Catalan, Dutch, and Ukrainian (T336870) (duration: 18m 27s)
  • 07:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1001.eqiad.wmnet
  • 07:12 taavi@deploy1002: mlitn and taavi: Backport for [SearchVue] Enable on Norwegian, Hungarian, Catalan, Dutch, and Ukrainian (T336870) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 07:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1001.eqiad.wmnet
  • 07:02 taavi@deploy1002: Started scap: Backport for [SearchVue] Enable on Norwegian, Hungarian, Catalan, Dutch, and Ukrainian (T336870)
  • 06:20 _joe_: killing a pod with consistently high haproxy queue for thumbor in codfw
  • 06:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 60427
  • 06:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 60427

2023-06-03

  • 13:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade
  • 13:41 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade
  • 13:28 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2012.codfw.wmnet
  • 13:28 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2012.codfw.wmnet

2023-06-02

  • 20:16 apergos: rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'
  • 18:42 bblack: dns*: puppets are all re-enabled, ntp restarts are done, etc
  • 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 17:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-a1-codfw - pt1979@cumin2002"
  • 17:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-a1-codfw.mgmt.codfw.wmnet
  • 17:27 bblack: dns*: disabling puppet to control rollout of NTP config fixups
  • 16:03 bblack: dns*: removed faulty authdns[12]001 lines from /etc/hosts via cumin+sed
  • 15:35 sukhe: restart ntp.service on dns1002
  • 13:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:26 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:25 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:25 ottomata: deploying flink-operator change to dse-k8s and wikikube to add ingress for health check port - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/926479
  • 13:24 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:24 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:24 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:24 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:22 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:03 moritzm: installing at-spi2-core bugfix updates from Bullseye point release
  • 09:35 moritzm: installing texlive-security updates on buster
  • 09:18 akosiaris: update kubernetes-node to 1.23.14-2 on all P:kubernetes::node hosts (88 in total) T337836. Reload systemd for unit changes to take effect
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5016.eqsin.wmnet
  • 08:52 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5016.eqsin.wmnet
  • 08:52 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5015.eqsin.wmnet
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5015.eqsin.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5014.eqsin.wmnet
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5014.eqsin.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp5013.eqsin.wmnet
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: cp5013.eqsin.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 0 hosts:
  • 08:51 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 0 hosts:
  • 08:42 moritzm: installing traceroute bugfix updates from Bullseye point release
  • 07:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org
  • 07:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org
  • 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3006.wikimedia.org
  • 07:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3006.wikimedia.org
  • 07:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
  • 07:22 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad or A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 07:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
  • 01:53 ejegg: fundraising python tools upgraded from 759d4c89 to 2ca83336
  • 01:22 cstone: civicrm upgraded from 3819d6d1 to bcc8fccc

2023-06-01

  • 21:06 samtar@deploy1002: Finished scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) (duration: 08m 30s)
  • 20:59 samtar@deploy1002: esanders and samtar: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 20:57 samtar@deploy1002: Started scap: Backport for Remove deleted config wgVectorStickyHeaderEdit (T337955)
  • 20:54 samtar@deploy1002: Finished scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) (duration: 10m 29s)
  • 20:45 samtar@deploy1002: samtar and ksarabia: Backport for Remove config and AB test code for edit buttons in sticky header (T337955) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 20:44 samtar@deploy1002: Started scap: Backport for Remove config and AB test code for edit buttons in sticky header (T337955)
  • 20:21 samtar@deploy1002: Finished scap: Backport for Deploy Research Incentive survey on enwiki (T336092) (duration: 07m 56s)
  • 20:15 samtar@deploy1002: dani and samtar: Backport for Deploy Research Incentive survey on enwiki (T336092) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:13 samtar@deploy1002: Started scap: Backport for Deploy Research Incentive survey on enwiki (T336092)
  • 20:12 samtar@deploy1002: Finished scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) (duration: 08m 20s)
  • 20:05 samtar@deploy1002: samtar and dreamyjazz: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 20:04 samtar@deploy1002: Started scap: Backport for Always collapse by default the CheckUserHelper on loginwiki (T328726)
  • 19:51 ejegg: fundraising python tools upgraded from 72570bdd to 759d4c89
  • 19:12 mforns@deploy1002: Finished deploy [airflow-dags/analytics@21e7354]: (no justification provided) (duration: 02m 42s)
  • 19:11 mforns@deploy1002: Started deploy [airflow-dags/analytics@21e7354]: (no justification provided)
  • 19:11 bblack@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work (duration: 03m 27s)
  • 19:09 bblack: lvs1* (eqiad): upgrade pybal to 1.15.13 - T334703
  • 19:08 bblack@deploy1002: Locking from deployment [ALL REPOSITORIES]: temporary lock for LVS/pybal upgrade work
  • 18:45 bblack: lvs6* (drmrs): upgrade pybal to 1.15.13 - T334703
  • 18:33 bblack: lvs3* (esams): upgrade pybal to 1.15.13 - T334703
  • 18:32 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.11 refs T337525
  • 17:50 mforns@deploy1002: Finished deploy [airflow-dags/analytics@03ca1c1]: (no justification provided) (duration: 00m 10s)
  • 17:50 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-upload_drmrs and A:cp
  • 17:50 mforns@deploy1002: Started deploy [airflow-dags/analytics@03ca1c1]: (no justification provided)
  • 17:49 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 17:48 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 17:48 fabfur@cumin1001: END (PASS) - Cookbook sre.cdn.run-puppet-restart-varnish (exit_code=0) rolling custom on A:cp-text_drmrs and A:cp
  • 17:47 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 17:47 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 17:45 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 17:45 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 17:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 17:05 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:55 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: revert: Remove undeeded wgEventBusStreamNamesMap override setting. Recent EventBus changes are not deployed yet? - T336817 (duration: 07m 24s)
  • 16:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:53 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 16:53 aborrero@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
  • 16:52 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
  • 16:44 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: no-op: Remove undeeded wgEventBusStreamNamesMap override setting - T336817 (duration: 08m 18s)
  • 16:42 bblack: lvs2* (codfw): upgrade pybal to 1.15.13 - T334703
  • 16:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:40 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1002.eqiad.wmnet with OS bullseye
  • 16:35 bblack: lvs5* (eqsin): upgrade pybal to 1.15.13 - T334703
  • 16:32 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703 (round 2!)
  • 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 16:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:10 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
  • 16:07 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2004-dev.codfw.wmnet with reason: host reimage
  • 16:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
  • 16:06 mutante: gerrit - set repo wikimedia/annualreport to readonly (from active) - T337041
  • 16:04 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudswift1001.eqiad.wmnet with reason: host reimage
  • 16:01 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 16:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 15:59 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 15:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 15:45 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:44 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:33 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:33 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 15:21 fabfur: running run-puppet-agent on cp6010.drmrs.wmnet to fix icinga check from cookbook
  • 15:15 bblack: lvs400[89]: upgrade pybal to 1.15.13 - T334703
  • 15:11 sukhe: reprepro -C component/pybal bullseye-wikimedia pybal_1.15.13_source.changes
  • 15:00 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1002.eqiad.wmnet with OS bullseye
  • 14:59 moritzm: installing python-sqlparse security updates
  • 14:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 14:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 14:55 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye
  • 14:53 moritzm: installing jackson-databind security updates
  • 14:49 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 14:45 fabfur: running run-puppet-agent on cp6009.drmrs.wmnet to fix icinga check from cookbook
  • 14:44 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
  • 14:41 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1002.eqiad.wmnet with reason: host reimage
  • 14:40 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-upload_drmrs and A:cp
  • 14:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 14:39 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 14:36 fabfur@cumin1001: START - Cookbook sre.cdn.run-puppet-restart-varnish rolling custom on A:cp-text_drmrs and A:cp
  • 14:34 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 14:29 moritzm: installing imagemagick security updates on buster
  • 14:16 herron@cumin1001: START - Cookbook sre.hosts.reimage for host mwlog1002.eqiad.wmnet with OS bullseye
  • 14:14 fabfur: Disabled puppet on A:cp-drmrs for T323557
  • 14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@3c9cc85]: (no justification provided) (duration: 00m 11s)
  • 14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@3c9cc85]: (no justification provided)
  • 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48700 and previous config saved to /var/cache/conftool/dbconfig/20230601-141317-ladsgroup.json
  • 14:11 claime: Removing obsolete mediawiki-services-function-evaluator from registry - T337505
  • 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48699 and previous config saved to /var/cache/conftool/dbconfig/20230601-135811-ladsgroup.json
  • 13:52 moritzm: installing sysstat security updates
  • 13:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:51 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:50 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:49 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:49 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P48698 and previous config saved to /var/cache/conftool/dbconfig/20230601-134304-ladsgroup.json
  • 13:29 moritzm: installing openssl security updates on bullseye
  • 13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48697 and previous config saved to /var/cache/conftool/dbconfig/20230601-132758-ladsgroup.json
  • 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T336886)', diff saved to https://phabricator.wikimedia.org/P48695 and previous config saved to /var/cache/conftool/dbconfig/20230601-132319-ladsgroup.json
  • 13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20230601-132238-ladsgroup.json
  • 13:21 claime: Removing obsolete mediawiki-services-function-orchestrator from registry - T337505
  • 13:13 urbanecm@deploy1002: Finished scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) (duration: 11m 08s)
  • 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48694 and previous config saved to /var/cache/conftool/dbconfig/20230601-130732-ladsgroup.json
  • 13:04 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 13:04 urbanecm@deploy1002: urbanecm and daimona: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 13:03 bking@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on wdqs2021.codfw.wmnet with reason: attempting WDQS stack on bullseye
  • 13:02 urbanecm@deploy1002: Started scap: Backport for beta: Stop setting unused $wgCampaignEventsUseNewTrackingToolsSchema (T336362), Set $wgCampaignEventsUseNewTrackingToolsSchema to true in prod (T336364)
  • 12:58 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 12:57 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 12:52 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 12:52 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P48693 and previous config saved to /var/cache/conftool/dbconfig/20230601-125226-ladsgroup.json
  • 12:50 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 12:49 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48692 and previous config saved to /var/cache/conftool/dbconfig/20230601-123720-ladsgroup.json
  • 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2151 (T336886)', diff saved to https://phabricator.wikimedia.org/P48691 and previous config saved to /var/cache/conftool/dbconfig/20230601-123236-ladsgroup.json
  • 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 12:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 12:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48690 and previous config saved to /var/cache/conftool/dbconfig/20230601-122900-ladsgroup.json
  • 12:17 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:17 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:16 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 12:16 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48689 and previous config saved to /var/cache/conftool/dbconfig/20230601-121354-ladsgroup.json
  • 12:03 Daimona: Creating ce_tracking_tools table for the CampaignEvents extension on testwiki, test2wiki, officewiki, and metawiki # T336365
  • 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P48688 and previous config saved to /var/cache/conftool/dbconfig/20230601-115848-ladsgroup.json
  • 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48687 and previous config saved to /var/cache/conftool/dbconfig/20230601-114342-ladsgroup.json
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T336886)', diff saved to https://phabricator.wikimedia.org/P48686 and previous config saved to /var/cache/conftool/dbconfig/20230601-113843-ladsgroup.json
  • 11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48685 and previous config saved to /var/cache/conftool/dbconfig/20230601-113822-ladsgroup.json
  • 11:28 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 11:28 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:26 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:25 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48684 and previous config saved to /var/cache/conftool/dbconfig/20230601-112316-ladsgroup.json
  • 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P48683 and previous config saved to /var/cache/conftool/dbconfig/20230601-110810-ladsgroup.json
  • 11:04 jayme: disabling puppet on all kubernestes control planes for https://gerrit.wikimedia.org/r/c/operations/puppet/+/925707
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48682 and previous config saved to /var/cache/conftool/dbconfig/20230601-105303-ladsgroup.json
  • 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T336886)', diff saved to https://phabricator.wikimedia.org/P48681 and previous config saved to /var/cache/conftool/dbconfig/20230601-104803-ladsgroup.json
  • 10:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 10:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48680 and previous config saved to /var/cache/conftool/dbconfig/20230601-104742-ladsgroup.json
  • 10:45 cmooney@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 10:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48679 and previous config saved to /var/cache/conftool/dbconfig/20230601-103236-ladsgroup.json
  • 10:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P48678 and previous config saved to /var/cache/conftool/dbconfig/20230601-101730-ladsgroup.json
  • 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:17 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
  • 10:16 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
  • 10:14 aborrero@cumin2002: START - Cookbook sre.dns.netbox
  • 10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48677 and previous config saved to /var/cache/conftool/dbconfig/20230601-100224-ladsgroup.json
  • 10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T336886)', diff saved to https://phabricator.wikimedia.org/P48676 and previous config saved to /var/cache/conftool/dbconfig/20230601-100011-ladsgroup.json
  • 10:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 09:56 moritzm: installing systemd security updates on bullseye
  • 09:53 Amir1: ladsgroup@mwmaint1002:~$ foreachwikiindblist group2 extensions/AbuseFilter/maintenance/MigrateActorsAF.php (T336224)
  • 09:52 gehel: cleaning apt archives on an-test-worker1002: `sudo apt-get clean`, recovering 14G
  • 09:49 cmooney@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 09:43 cmooney@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:36 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:36 cmooney@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:35 cmooney@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcontrol2004-dev']
  • 09:32 volans: installed spicerack v7.2.0 on cumin2002
  • 09:30 aborrero@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 09:21 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1010.eqiad.wmnet
  • 09:18 godog: remove lv prometheus-global - T288196
  • 09:17 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1010.eqiad.wmnet
  • 09:17 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1009.eqiad.wmnet
  • 09:16 volans@cumin1001: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest1001.eqiad.wmnet
  • 09:16 volans@cumin1001: START - Cookbook sre.hosts.dhcp for host sretest1001.eqiad.wmnet
  • 09:13 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1009.eqiad.wmnet
  • 09:12 volans: installed spicerack v7.2.0 on cumin1001
  • 09:11 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1008.eqiad.wmnet
  • 09:07 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1008.eqiad.wmnet
  • 09:06 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1007.eqiad.wmnet
  • 09:02 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1007.eqiad.wmnet
  • 09:01 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1006.eqiad.wmnet
  • 08:57 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1006.eqiad.wmnet
  • 08:56 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2004-dev.codfw.wmnet with OS bullseye
  • 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:53 aborrero@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
  • 08:53 aborrero@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2004-dev - aborrero@cumin1001"
  • 08:49 aborrero@cumin1001: START - Cookbook sre.dns.netbox
  • 08:48 apergos: UTC morning backport and config training window done
  • 08:30 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 08:29 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 08:28 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:28 daniel@deploy1002: Finished scap: Backport for ORES: add model versions configuration and thresholds (T319170) (duration: 10m 12s)
  • 08:28 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 08:19 daniel@deploy1002: daniel and isaranto: Backport for ORES: add model versions configuration and thresholds (T319170) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 08:18 daniel@deploy1002: Started scap: Backport for ORES: add model versions configuration and thresholds (T319170)
  • 07:55 daniel@deploy1002: Finished scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) (duration: 09m 09s)
  • 07:48 daniel@deploy1002: daniel: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 07:46 daniel@deploy1002: Started scap: Backport for Enable parser cache warming jobs for parsoid on frwiki (T329366)
  • 07:42 mlitn@deploy1002: Finished scap: Backport for Add $wgInterwikiLogoOverride (T315269) (duration: 33m 02s)
  • 07:35 moritzm: installing libssh security updates
  • 07:29 mlitn@deploy1002: mlitn: Backport for Add $wgInterwikiLogoOverride (T315269) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
  • 07:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: Setup in progress
  • 07:09 mlitn@deploy1002: Started scap: Backport for Add $wgInterwikiLogoOverride (T315269)
  • 06:16 kart_: Updated MinT to 2023-06-01-041041-production (T336525)
  • 06:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: applied
  • 05:56 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 05:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 05:46 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 05:44 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 05:42 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 05:39 kart_: Updated cxserver to 2023-06-01-041016-production (T337669)
  • 05:34 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:34 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:32 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:27 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:27 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 00:11 eileen: civicrm upgraded from 885208ca to 3819d6d1

Archives

See Server Admin Log/Archives.