Difference between revisions of "Server Admin Log"

From Wikitech-static
imported>Stashbot
(robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0))
imported>Stashbot
(jynus: forced session revocation on phab for a user T299315)
 
(384 intermediate revisions by 4 users not shown)
== 2020-11-03 ==
== 2022-01-17 ==
* 01:01 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:27 jynus: forced session revocation on phab for a user [[phab:T299315|T299315]]
* 00:59 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:48 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@27a4f7a]: (no justification provided) (duration: 00m 02s)
* 00:42 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:48 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@27a4f7a]: (no justification provided)
* 00:40 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 18:47 krinkle@deploy1002: Finished deploy [integration/docroot@1621c26]: (no justification provided) (duration: 01m 14s)
* 18:46 krinkle@deploy1002: Started deploy [integration/docroot@1621c26]: (no justification provided)
* 16:30 moritzm: installing python-virtualenv bugfix updates from bullseye 11.2 point release
* 16:21 moritzm: installing wget bugfix updates from bullseye 11.2 point release
* 16:13 moritzm: installing freeipmi bugfix updates from bullseye 11.2 point release
* 16:02 moritzm: installing curl bugfix updates from bullseye 11.2 point release
* 15:54 mutante: mw1414,mw1415,mw1416,mw1417,mw1418,mw1447,mw1448,mw1449,mw1450,mw1437,mw1438 (all canaries eqiad) - apt-get remove --purge fonts*; apt-get remove --purge xfonts* ([[phab:T294378|T294378]])
* 15:46 mutante: parse2001, parse2002, wtp1025, wtp1026 (all parsoid canaries) - apt-get remove --purge fonts*; apt-get remove --purge xfonts* ([[phab:T294378|T294378]])
* 15:40 mutante: mw2278, mw2279, mw2374, mw2376 (API and jobrunner canaries codfw) - apt-get remove --purge fonts*; apt-get remove --purge xfonts* ([[phab:T294378|T294378]])
* 15:34 mutante: mw2271, mw2272, mw2251, mw2252 (appserver and API canaries codfw) - apt-get remove --purge fonts*; apt-get remove --purge xfonts* ([[phab:T294378|T294378]])
* 15:01 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-airflow1003.eqiad.wmnet
* 14:58 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM an-airflow1003.eqiad.wmnet
* 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2132.codfw.wmnet with OS bullseye
* 14:50 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-airflow1002.eqiad.wmnet
* 14:48 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM an-airflow1002.eqiad.wmnet
* 14:45 moritzm: imported cassandra 3.11.11 to component/cassandradev for stretch-wikimedia and buster-wikimedia [[phab:T298805|T298805]]
* 14:41 moritzm: systemctl reset-failed ifup@ens5.service on an-airflow1001 [[phab:T273026|T273026]]
* 14:39 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-airflow1001.eqiad.wmnet
* 14:37 hnowlan: removing restbase2009 from cassandra configs
* 14:30 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM an-airflow1001.eqiad.wmnet
* 14:16 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2132.codfw.wmnet with OS bullseye
* 14:15 marostegui: Reimage db2132 to Bullseye [[phab:T299344|T299344]]
* 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges group from s3 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18762 and previous config saved to /var/cache/conftool/dbconfig/20220117-134520-marostegui.json
* 12:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS bullseye
* 12:19 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS bullseye
* 12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS bullseye
* 11:40 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS bullseye
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafkamon1002.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM kafkamon1002.eqiad.wmnet
* 11:08 moritzm: switching kubetcd1006 to DRBD-backed storage (required for ganeti update)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: switch to drbd storage
* 11:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1006.eqiad.wmnet with reason: switch to drbd storage
* 11:00 moritzm: systemctl reset-failed ifup@ens5.service on kubetcd1005 [[phab:T273026|T273026]]
* 10:56 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked group from s3 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18761 and previous config saved to /var/cache/conftool/dbconfig/20220117-104801-marostegui.json
* 10:47 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1152.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18760 and previous config saved to /var/cache/conftool/dbconfig/20220117-104459-marostegui.json
* 10:44 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 10:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1153.eqiad.wmnet with OS bullseye
* 10:42 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 10:32 moritzm: switching kubetcd1005 to DRBD-backed storage (required for ganeti update)
* 10:31 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifeeds: sync on staging
* 10:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: switch to drbd storage
* 10:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: switch to drbd storage
* 10:30 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply on production
* 10:30 jayme@deploy1002: helmfile [staging] START helmfile.d/services/wikifeeds: apply on staging
* 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P18759 and previous config saved to /var/cache/conftool/dbconfig/20220117-102954-marostegui.json
* 10:17 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1152.eqiad.wmnet with OS bullseye
* 10:15 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1153.eqiad.wmnet with OS bullseye
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P18758 and previous config saved to /var/cache/conftool/dbconfig/20220117-101450-marostegui.json
* 10:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS bullseye
* 10:04 moritzm: switching kubetcd1004 to DRBD-backed storage (required for ganeti update)
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: switch to drbd storage
* 10:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1004.eqiad.wmnet with reason: switch to drbd storage
* 10:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS bullseye
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18757 and previous config saved to /var/cache/conftool/dbconfig/20220117-095945-marostegui.json
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18756 and previous config saved to /var/cache/conftool/dbconfig/20220117-095837-marostegui.json
* 09:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 09:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18755 and previous config saved to /var/cache/conftool/dbconfig/20220117-095830-marostegui.json
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P18754 and previous config saved to /var/cache/conftool/dbconfig/20220117-094325-marostegui.json
* 09:30 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS bullseye
* 09:30 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bullseye
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P18753 and previous config saved to /var/cache/conftool/dbconfig/20220117-092820-marostegui.json
* 09:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1017.eqiad.wmnet with OS bullseye
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18752 and previous config saved to /var/cache/conftool/dbconfig/20220117-091316-marostegui.json
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18751 and previous config saved to /var/cache/conftool/dbconfig/20220117-091308-marostegui.json
* 09:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 09:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18750 and previous config saved to /var/cache/conftool/dbconfig/20220117-091300-marostegui.json
* 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P18749 and previous config saved to /var/cache/conftool/dbconfig/20220117-085756-marostegui.json
* 08:53 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1017.eqiad.wmnet with OS bullseye
* 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P18748 and previous config saved to /var/cache/conftool/dbconfig/20220117-084251-marostegui.json
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM schema1003.eqiad.wmnet
* 08:34 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM schema1003.eqiad.wmnet
* 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18747 and previous config saved to /var/cache/conftool/dbconfig/20220117-082746-marostegui.json
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T285149|T285149]])', diff saved to https://phabricator.wikimedia.org/P18746 and previous config saved to /var/cache/conftool/dbconfig/20220117-082638-marostegui.json
* 08:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM schema1004.eqiad.wmnet
* 08:17 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM schema1004.eqiad.wmnet
* 06:59 elukey: `systemctl reset-failed ifup@ens5.service` on an-test-client1001 and kafka-test1010
* 06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1016.eqiad.wmnet with OS bullseye
* 05:57 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1016.eqiad.wmnet with OS bullseye


== 2020-11-02 ==
== 2022-01-16 ==
* 22:19 twentyafterfour: restart php7.3-fpm on phab1001
* 08:21 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 22:03 twentyafterfour: applied {{Gerrit|113a244a66}} on phab1001 to hotfix [[phab:T240862|T240862]]
* 08:20 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply on staging
* 20:22 eileen: process-control config revision is {{Gerrit|313a36312f}} re-enable thank you
* 08:20 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply on production
* 19:56 dzahn@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:18 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
* 19:48 dzahn@cumin1001: START - Cookbook sre.dns.netbox
* 08:17 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply on staging
* 19:47 eileen: civicrm revision changed from {{Gerrit|3317d30356}} to {{Gerrit|cd13d9e30f}}, config revision is {{Gerrit|db912e3bba}}
* 08:17 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply on production
* 19:45 eileen: process-control config revision is {{Gerrit|db912e3bba}} - thankyou job off for testing
* 19:07 Urbanecm: Deployed security fix for [[phab:T205908|T205908]]
* 19:04 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 18:59 andrewbogott: added dcaro to ops and wmf ldap groups
* 18:59 mutante: decom'ing testvm1001
* 18:58 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:49 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 18:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:49 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 18:17 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:17 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 18:14 XioNoX: push new pfw policies - [[phab:T267051|T267051]]
* 16:39 ejegg: updated payments-wiki from {{Gerrit|adc3369cb3}} to {{Gerrit|1ad4ba9639}}
* 16:37 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:37 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 16:37 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:37 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 16:37 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:37 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 15:36 moritzm: imported php-excimer/php-luasandbox to component/php72 for buster-wikimedia
* 14:38 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:38 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 14:34 moritzm: rolling restart of cassandra in restbase-dev to pick up Java security updates
* 14:17 kormat: uploaded orchestrator 3.2.3-1 to apt
* 14:01 hashar@deploy1001: Synchronized wmf-config/CommonSettings.php: Remove $wgExtDistListFile, unused - [[phab:T266024|T266024]] (duration: 00m 58s)
* 13:46 elukey@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0)
* 13:40 elukey: roll restart zookeeper on an-conf* to pick up new openjdk upgrades
* 13:40 elukey@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper
* 13:03 Lucas_WMDE: EU backport&config window done
* 13:02 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/Wikibase: Backport: [[gerrit:637801{{!}}Revert JS parser commits (T266671)]] (duration: 01m 09s)
* 12:52 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:637819{{!}}Add Response namespace at otrs_wikiwiki to namespaces searched by default (T266917)]] (duration: 00m 58s)
* 12:21 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:634224{{!}}Stop defining wmgULSCompactLinksForNewAccounts and wmgULSCompactLinksEnableAnon]], 2/2 (Beta) (duration: 00m 57s)
* 12:20 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:634224{{!}}Stop defining wmgULSCompactLinksForNewAccounts and wmgULSCompactLinksEnableAnon]], 1/2 (production) (duration: 01m 02s)
* 12:15 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:638020{{!}}Stop reading wmgULSCompactLinksForNewAccounts and wmgULSCompactLinksEnableAnon]] (duration: 00m 58s)
* 12:15 volans: upgraded python3-wmflib to 0.0.4 on cumin[12]001
* 12:09 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:637778{{!}}Fix array depth for properties array (T266835)]], Beta part (prod no-op) (duration: 00m 58s)
* 12:07 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:637778{{!}}Fix array depth for properties array (T266835)]] (duration: 00m 59s)
* 12:02 volans: uploaded python3-wmflib_0.0.4 to apt.wikimedia.org buster-wikimedia
* 11:51 effie: disable puppet on thumbor1001 and thumbor1002 to test 636024
* 11:51 effie: disable thumbor on thumbor1001 and thumbor1002 to test 636024
* 11:34 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:638045{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 11:33 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:638045{{!}} Bumping portals to master (T128546)]] (duration: 01m 00s)
* 11:18 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:18 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 11:13 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 11:06 godog: upgrade thanos to 0.16.0 on prometheus hosts - [[phab:T261281|T261281]]
* 10:59 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 10:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 10:50 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 10:28 oblivian@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:28 oblivian@cumin1001: START - Cookbook sre.network.cf
* 10:28 oblivian@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:28 oblivian@cumin1001: START - Cookbook sre.network.cf
* 10:23 moritzm: installing openldap security updates on corp LDAP replicas
* 08:46 XioNoX: add uRPF strict to ulsfo office links - [[phab:T266561|T266561]]
* 08:41 moritzm: installing openldap security updates on LDAP replicas
* 08:40 godog: upgrade thanos to 0.16 in codfw/eqiad - [[phab:T261281|T261281]]
* 06:09 oblivian@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:09 oblivian@cumin1001: START - Cookbook sre.network.cf
* 06:09 oblivian@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:09 oblivian@cumin1001: START - Cookbook sre.network.cf


== 2020-11-01 ==
== 2022-01-15 ==
* 22:41 Urbanecm: mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=metawiki Turkmen # [[phab:T266976|T266976]]
* 08:55 legoktm: finished running recountCategories on s4 wikis ([[phab:T299244|T299244]])
* 09:52 ariel@deploy1001: Finished deploy [dumps/dumps@de4c823]: actually allow per run dir to be made early in the run (duration: 00m 04s)
* 07:58 legoktm: finished running recountCategories on s7 wikis ([[phab:T299244|T299244]])
* 09:52 ariel@deploy1001: Started deploy [dumps/dumps@de4c823]: actually allow per run dir to be made early in the run
* 07:51 legoktm: finished running recountCategories on s2 wikis ([[phab:T299244|T299244]])
* 09:16 ariel@deploy1001: Finished deploy [dumps/dumps@6c7d811]: create empty dir for tableinfo if needed (duration: 00m 04s)
* 06:41 <legoktm>: finished running recountCategories on s3 wikis ([[phab:T299244|T299244]])
* 09:16 ariel@deploy1001: Started deploy [dumps/dumps@6c7d811]: create empty dir for tableinfo if needed
* 06:21 <legoktm>: finished running recountCategories on s6 wikis ([[phab:T299244|T299244]])
* 01:26 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:19 <legoktm>: finished running recountCategories on s5 wikis ([[phab:T299244|T299244]])
* 01:26 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 06:18 <legoktm>: finished running recountCategories on s8 wikis ([[phab:T299244|T299244]])
* 01:16 rzl@cumin1001: dbctl commit (dc=all): 'Depool db1091', diff saved to https://phabricator.wikimedia.org/P13124 and previous config saved to /var/cache/conftool/dbconfig/20201101-011600-rzl.json
* 06:14 legoktm: running recountCategories on s3 wikis
* 05:20 legoktm: started recountCategories.php --wiki=enwiki --mode pages ([[phab:T299244|T299244]])
* 03:05 legoktm: started refreshLinks --dfn-only via systemd units for s7-s8 ([[phab:T299244|T299244]])
* 03:01 legoktm: started refreshLinks --dfn-only via systemd units for s2-s6 ([[phab:T299244|T299244]])
* 02:55 legoktm: started mwscript refreshLinks.php --wiki=commonswiki --dfn-only ([[phab:T299244|T299244]])
* 02:54 legoktm: started mwscript refreshLinks.php --wiki=enwiki --dfn-only ([[phab:T299244|T299244]])
* 02:52 legoktm: started mwscript refreshLinks.php --wiki=enwiki --dfn-only
* 01:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:04 legoktm: starting recountCategories.php --mode pages --wiki enwiki on mwmaint1002
* 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:58 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 00:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:52 dduvall@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]] (duration: 00m 52s)
* 00:51 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 00:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:46 jforrester@deploy1002: Finished scap: Revert "LinksUpdate refactor" and follow-ups for [[phab:T299244|T299244]] re. [[phab:T293958|T293958]] (duration: 03m 58s)
* 00:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:42 jforrester@deploy1002: Started scap: Revert "LinksUpdate refactor" and follow-ups for [[phab:T299244|T299244]] re. [[phab:T293958|T293958]]
* 00:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:14 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert "all/group1 wikis to 1.38.0-wmf.17"


== 2020-10-31 ==
== 2022-01-14 ==
* 00:12 mutante: removed Nuria from wmf group, she is already in nda group ([[phab:T266086|T266086]])
* 23:07 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2051.codfw.wmnet with OS stretch
* 22:26 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 18:09 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on restbase2009.codfw.wmnet with reason: not in restbase cluster, used for testing
* 18:09 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 15 days, 0:00:00 on restbase2009.codfw.wmnet with reason: not in restbase cluster, used for testing
* 17:44 bblack: drmrs asw: removed native-vlan-id from config on secondary (x-rack) interfaces of lvses to debug network issue
* 17:26 bblack: reboot lvs600[23]
* 16:55 bblack: reboot lvs6001
* 16:30 bblack: rebooting cp60xx where x is 6, 7, 8, 14, 15, 16 (downtimed)
* 16:15 dancy@deploy1002: Synchronized README: Testing php-fpm restart (duration: 03m 18s)
* 16:04 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2009.codfw.wmnet with OS buster
* 15:40 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 15:39 bblack: lvs6001 + all services downtimed
* 15:29 bblack@cumin1001: conftool action : set/pooled=yes; selector: dc=drmrs
* 15:00 bblack: silenced site=drmrs in alertmanager for one month, I think
* 15:00 bblack: silenced site=drmrs in alertmanager, I think
* 13:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2011.codfw.wmnet with OS bullseye
* 13:20 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2009.codfw.wmnet with OS buster
* 12:59 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host pc2011.codfw.wmnet with OS bullseye
* 12:53 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 12:51 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2009.codfw.wmnet with OS buster
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1024.eqiad.wmnet with OS buster
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1024.eqiad.wmnet with OS buster
* 12:20 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 12:18 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2009.codfw.wmnet with OS buster
* 11:51 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 11:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on restbase2009.codfw.wmnet with reason: not in restbase cluster, used for testing
* 11:48 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on restbase2009.codfw.wmnet with reason: not in restbase cluster, used for testing
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1023.eqiad.wmnet with OS buster
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1023.eqiad.wmnet with OS buster
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM archiva1002.wikimedia.org
* 11:00 moritzm: systemctl reset-failed ifup@ens5.service on archiva1002 [[phab:T273026|T273026]]
* 10:56 moritzm: rebooting archiva1002 (running archiva.wikimedia.org)
* 10:56 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM archiva1002.wikimedia.org
* 10:55 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS stretch
* 10:50 moritzm: systemctl reset-failed ifup@ens5.service on an-test-ui1001 [[phab:T273026|T273026]]
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-test-ui1001.eqiad.wmnet
* 10:42 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-test-ui1001.eqiad.wmnet
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-test-presto1001.eqiad.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-test-presto1001.eqiad.wmnet
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM matomo1002.eqiad.wmnet
* 10:05 moritzm: rebooting matomo1002 (running piwik.wikimedia.org)
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM matomo1002.eqiad.wmnet
* 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-test-druid1001.eqiad.wmnet
* 09:55 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-test-druid1001.eqiad.wmnet
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM apt1001.wikimedia.org
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM apt1001.wikimedia.org
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM install1003.wikimedia.org
* 09:28 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM install1003.wikimedia.org
* 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-test-client1001.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-test-client1001.eqiad.wmnet
* 09:11 marostegui: Move pc1014 from pc1 to pc2 [[phab:T299046|T299046]]
* 09:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2013.codfw.wmnet with OS bullseye
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-tool1009.eqiad.wmnet
* 09:01 moritzm: rebooting an-tool1009 (running hue.wikimedia.org)
* 09:01 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-tool1009.eqiad.wmnet
* 09:00 moritzm: systemctl reset-failed ifup@ens5.service on an-tool1005 [[phab:T273026|T273026]]
* 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-tool1008.eqiad.wmnet
* 08:58 moritzm: rebooting an-tool1008 (running yarn.wikimedia.org)
* 08:58 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-tool1008.eqiad.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-tool1007.eqiad.wmnet
* 08:55 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-tool1007.eqiad.wmnet
* 08:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM an-tool1005.eqiad.wmnet
* 08:51 moritzm: rebooting an-tool1007 (running turnilo.wikimedia.org)
* 08:50 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM an-tool1005.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cuminunpriv1001.eqiad.wmnet
* 08:34 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM cuminunpriv1001.eqiad.wmnet
* 08:33 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host pc2013.codfw.wmnet with OS bullseye
* 07:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2012.codfw.wmnet with OS bullseye
* 07:05 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host pc2012.codfw.wmnet with OS bullseye
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s3 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18735 and previous config saved to /var/cache/conftool/dbconfig/20220114-063554-marostegui.json
* 06:15 marostegui: Failover m5 proxy from dbproxy1017 to dbproxy1021 [[phab:T298586|T298586]]
* 05:16 legoktm: manually restarted discard_held_messages service on lists1001, failed with a spurious sqlalchemy issue about packets being out of order
* 00:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:23 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 00:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:15 dduvall@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]] (duration: 01m 06s)
* 00:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:13 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:09 dduvall@deploy1002: Synchronized php-1.38.0-wmf.17/includes/content/WikitextContentHandler.php: Backport: [[gerrit:753828{{!}}In WikitextContentHandler always use getFreshParser() (T299149)]] (duration: 01m 07s)


== 2020-10-30 ==
== 2022-01-13 ==
* 23:35 foks: removing two files for legal compliance
* 22:40 WFan: Updating payment-wiki, revision changed from {{Gerrit|8497eae9}} to {{Gerrit|5cc9d5e0}}
* 23:32 mutante: adding query.wikidata.org to TLS cert for webserver-misc-apps.discovery.wmnet [[phab:T266702|T266702]]
* 22:18 dzahn@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=miscweb
* 23:05 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:00 dzahn@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=miscweb
* 23:04 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:48 mutante: running puppet on cp-ulsfo
* 23:04 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:03 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:03 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:03 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:03 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:02 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:02 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:02 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:01 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:01 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:01 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:01 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:31 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.38.0-wmf.17"
* 21:02 jiji@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:00 jiji@cumin2001: START - Cookbook sre.hosts.downtime
* 20:29 dduvall: rolling back wmf.17 from group1 due to a large increase in "Parser state cleared while parsing" across commons and group1 wikipedias ([[phab:T293958|T293958]], [[phab:T299149|T299149]])
* 20:59 mutante: mw1267,mw1268 - scap pull and repool - back to prod - [[phab:T266164|T266164]]
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:57 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1267.eqiad.wmnet
* 20:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:57 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1268.eqiad.wmnet
* 20:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:56 mutante: mw1267,mw1268 - scap pull
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:33 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:17 dduvall@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]] (duration: 01m 06s)
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:16 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:07 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 20:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:32 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 19:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 19:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 19:42 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2051.codfw.wmnet with OS stretch
* 20:31 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 19:40 dzahn@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: sync on main
* 20:06 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:40 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:753634{{!}}Enable ArticlePlaceholder on dagwiki (T298349)]] (duration: 01m 13s)
* 20:04 robh@cumin1001: START - Cookbook sre.hosts.downtime
* 19:37 dzahn@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply on main
* 18:48 cdanis: the above scap began (and mostly finished) several minutes ago but is hanging on a couple hosts down for maintenance
* 19:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:48 cdanis@deploy1001: Synchronized wmf-config/InitialiseSettings.php: lower frwiki featured feeds limit {{Gerrit|1a41ef634}} [[phab:T266865|T266865]] (duration: 05m 14s)
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:48 cdanis: ✔️ cdanis@deploy1001.eqiad.wmnet /srv/mediawiki-staging 🕝☕ scap sync-file wmf-config/InitialiseSettings.php 'lower frwiki featured feeds limit {{Gerrit|1a41ef634}} [[phab:T266865|T266865]]'
* 19:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:27 hashar@deploy1001: Finished deploy [integration/docroot@c35e5e9]: Add ECS to doc.wikimedia.org index (duration: 00m 06s)
* 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:27 hashar@deploy1001: Started deploy [integration/docroot@c35e5e9]: Add ECS to doc.wikimedia.org index
* 19:25 dzahn@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: sync on main
* 17:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:23 dzahn@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply on main
* 17:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:22 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:19 effie: disable puppet on mc1036 and mc2036 - [[phab:T252391|T252391]]
* 19:19 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:747993{{!}}Add event stream config for ios.notification_interaction (T290920)]] (duration: 01m 13s)
* 17:18 effie: enable puppet on all mediawiki and mc* hosts
* 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:19 elukey: kafka-jumbo1006 still running with 1g NIC
* 19:15 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:747991{{!}}Add event stream config for android.customize_toolbar_interaction (T297818)]] (duration: 01m 12s)
* 15:36 effie: stopping puppet on mediawiki and mc* hosts
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:11 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:11 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:09 rzl: downtiming mc2036 for buster reimage
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:42 elukey: stop kafka-jumbo1006 to swap NICs (1g -> 10g, d1 -> d4 rack)
* 19:07 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:753793{{!}}Enable skin migration mode on the beta cluster]] (duration: 01m 14s)
* 14:14 cmjohnson1: moving mw1267 and mw1268 to rack A8 eqiad [[phab:T266164|T266164]]
* 18:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:29 XioNoX: set normal VRRP balancing on cr2-eqiad
* 18:42 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 10:08 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 10:08 klausman@cumin1001: START - Cookbook sre.hosts.downtime
* 17:49 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS stretch
* 10:02 ladsgroup@deploy1001: Synchronized static/images/project-logos: Revert: Changing logo of Wikidata for the birthday (duration: 01m 12s)
* 17:45 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1005.eqiad.wmnet with reason: requires resync after planet sync
* 09:13 elukey@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:45 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps1005.eqiad.wmnet with reason: requires resync after planet sync
* 09:07 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 17:37 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 08:58 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 17:34 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 08:54 elukey: decom an-tool1006 (old analytics test vm) - [[phab:T255139|T255139]]
* 17:33 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 08:53 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 17:29 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 17:29 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 17:29 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 17:28 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 17:28 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 17:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 17:22 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS stretch
* 17:11 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 17:07 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS stretch
* 17:01 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2009.codfw.wmnet with OS buster
* 16:34 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 16:27 moritzm: import maps-deduped-tilelist 0.0.5 to buster-wikimedia/main [[phab:T297408|T297408]]
* 16:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cuminunpriv1001.eqiad.wmnet
* 16:00 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM cuminunpriv1001.eqiad.wmnet
* 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 15:50 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2009.codfw.wmnet with OS buster
* 15:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aphlict1001.eqiad.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aphlict1001.eqiad.wmnet
* 15:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM flowspec1001.eqiad.wmnet
* 15:40 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM flowspec1001.eqiad.wmnet
* 15:36 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS stretch
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ldap-replica1004.wikimedia.org
* 15:26 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ldap-replica1004.wikimedia.org
* 15:23 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ldap-replica1003.wikimedia.org
* 15:21 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase2009.codfw.wmnet with OS buster
* 15:20 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ldap-replica1003.wikimedia.org
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM seaborgium.wikimedia.org
* 15:15 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM seaborgium.wikimedia.org
* 15:10 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS stretch
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM urldownloader1002.wikimedia.org
* 15:03 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM urldownloader1002.wikimedia.org
* 14:56 mmandere: cp3053: upgrade varnish to 6.0.9-1wm1 [[phab:T298758|T298758]]
* 14:56 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2009.codfw.wmnet with OS buster
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp1001.wikimedia.org
* 14:47 moritzm: systemctl reset-failed ifup@ens5.service on idp1001 [[phab:T273026|T273026]]
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM idp1001.wikimedia.org
* 14:15 moritzm: switch ml-etcd1003 to DRBD (needed to be able to shuffle instances around for the Ganeti buster update)
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-etcd1003.eqiad.wmnet with reason: switch to drbd storage
* 14:14 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-etcd1003.eqiad.wmnet with reason: switch to drbd storage
* 13:53 mmandere@cumin1001: conftool action : set/pooled=yes; selector: name=cp6009.drmrs.wmnet
* 13:49 moritzm: switch ml-etcd1002 to DRBD (needed to be able to shuffle instances around for the Ganeti buster update)
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-etcd1002.eqiad.wmnet with reason: switch to drbd storage
* 13:48 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-etcd1002.eqiad.wmnet with reason: switch to drbd storage
* 13:45 mmandere@cumin1001: conftool action : set/pooled=yes; selector: name=cp6001.drmrs.wmnet
* 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM urldownloader1001.wikimedia.org
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM urldownloader1001.wikimedia.org
* 13:23 moritzm: switch ml-etcd1001 to DRBD (needed to be able to shuffle instances around for the Ganeti buster update)
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-etcd1001.eqiad.wmnet with reason: switch to drbd storage
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-etcd1001.eqiad.wmnet with reason: switch to drbd storage
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudbackup1001-dev.eqiad.wmnet
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM cloudbackup1001-dev.eqiad.wmnet
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18731 and previous config saved to /var/cache/conftool/dbconfig/20220113-124307-root.json
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s3 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18730 and previous config saved to /var/cache/conftool/dbconfig/20220113-124300-marostegui.json
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s3 codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18729 and previous config saved to /var/cache/conftool/dbconfig/20220113-124140-marostegui.json
* 12:37 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es1021', diff saved to https://phabricator.wikimedia.org/P18728 and previous config saved to /var/cache/conftool/dbconfig/20220113-123744-marostegui.json
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudbackup1002-dev.eqiad.wmnet
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18727 and previous config saved to /var/cache/conftool/dbconfig/20220113-122803-root.json
* 12:27 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM cloudbackup1002-dev.eqiad.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ldap-corp1001.wikimedia.org
* 12:21 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ldap-corp1001.wikimedia.org
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 60%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18726 and previous config saved to /var/cache/conftool/dbconfig/20220113-121300-root.json
* 12:03 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM eventlog1003.eqiad.wmnet
* 11:59 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM eventlog1003.eqiad.wmnet
* 11:57 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18725 and previous config saved to /var/cache/conftool/dbconfig/20220113-115756-root.json
* 11:42 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18724 and previous config saved to /var/cache/conftool/dbconfig/20220113-114252-root.json
* 11:34 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1010.eqiad.wmnet
* 11:27 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18723 and previous config saved to /var/cache/conftool/dbconfig/20220113-112749-root.json
* 11:26 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1010.eqiad.wmnet
* 11:26 _joe_: update scap everywhere [[phab:T298986|T298986]]
* 11:25 oblivian@deploy1002: Finished deploy [restbase/deploy@0848b15]: scap testing (duration: 00m 09s)
* 11:25 oblivian@deploy1002: Started deploy [restbase/deploy@0848b15]: scap testing
* 11:24 oblivian@deploy1002: Finished deploy [restbase/deploy@0848b15]: (no justification provided) (duration: 00m 09s)
* 11:23 oblivian@deploy1002: Started deploy [restbase/deploy@0848b15]: (no justification provided)
* 11:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testreduce1001.eqiad.wmnet
* 11:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2022.codfw.wmnet with OS bullseye
* 11:16 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testreduce1001.eqiad.wmnet
* 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 20%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18722 and previous config saved to /var/cache/conftool/dbconfig/20220113-111245-root.json
* 11:11 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1009.eqiad.wmnet
* 11:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netbox1001.wikimedia.org
* 11:08 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1009.eqiad.wmnet
* 11:03 moritzm: rebooting netbox1001 (running netbox.wikimedia.org)
* 11:03 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM netbox1001.wikimedia.org
* 11:03 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1001.eqiad.wmnet with OS buster
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM netboxdb1001.eqiad.wmnet
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM netboxdb1001.eqiad.wmnet
* 10:58 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1008.eqiad.wmnet
* 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18721 and previous config saved to /var/cache/conftool/dbconfig/20220113-105741-root.json
* 10:56 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1008.eqiad.wmnet
* 10:52 hashar: Restarting Jenkins CI for plugins update [[phab:T298691|T298691]]
* 10:47 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1007.eqiad.wmnet
* 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM search-loader1001.eqiad.wmnet
* 10:45 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1007.eqiad.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM search-loader1001.eqiad.wmnet
* 10:42 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es2022.codfw.wmnet with OS bullseye
* 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18720 and previous config saved to /var/cache/conftool/dbconfig/20220113-104238-root.json
* 10:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM irc1001.wikimedia.org
* 10:29 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main1001.eqiad.wmnet with OS buster
* 10:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM irc1001.wikimedia.org
* 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18719 and previous config saved to /var/cache/conftool/dbconfig/20220113-102734-root.json
* 10:27 moritzm: systemctl reset-failed ifup@ens5.service on lists1001 [[phab:T273026|T273026]]
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM grafana1002.eqiad.wmnet
* 10:10 moritzm: rebooting grafana1002 (running grafana.wikimedia.org)
* 10:10 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM grafana1002.eqiad.wmnet
* 10:09 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye
* 10:02 mmandere: cp3052: upgrade varnish to 6.0.9-1wm1 [[phab:T298758|T298758]]
* 10:02 joal@deploy1002: Finished deploy [analytics/refinery@94ec386]: Hotfix analytics deploy [analytics/refinery@94ec386] (duration: 21m 47s)
* 10:02 elukey: run kafka preferred-replica-election on kafka-main1001 to force a rebalance of partition leaders (after kafka-main1002's reimage)
* 10:00 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1006.eqiad.wmnet
* 09:59 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1002.eqiad.wmnet with OS buster
* 09:56 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1006.eqiad.wmnet
* 09:49 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye
* 09:46 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye
* 09:42 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye
* 09:40 joal@deploy1002: Started deploy [analytics/refinery@94ec386]: Hotfix analytics deploy [analytics/refinery@94ec386]
* 09:40 joal@deploy1002: Finished deploy [analytics/refinery@94ec386] (thin): Hotfix analytics deploy THIN [analytics/refinery@94ec386] (duration: 00m 07s)
* 09:40 joal@deploy1002: Started deploy [analytics/refinery@94ec386] (thin): Hotfix analytics deploy THIN [analytics/refinery@94ec386]
* 09:39 joal@deploy1002: Finished deploy [analytics/refinery@94ec386] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@94ec386] (duration: 06m 59s)
* 09:35 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye
* 09:32 joal@deploy1002: Started deploy [analytics/refinery@94ec386] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@94ec386]
* 09:30 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye
* 09:30 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye
* 09:26 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main1002.eqiad.wmnet with OS buster
* 09:25 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye
* 09:24 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye
* 09:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM xhgui1001.eqiad.wmnet
* 09:14 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM xhgui1001.eqiad.wmnet
* 09:08 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM lists1001.wikimedia.org
* 09:02 moritzm: rebooting lists1001 (running lists.wikimedia.org) to pick up new KVM setting
* 09:00 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM lists1001.wikimedia.org
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1022, give weight to es1021 [[phab:T295965|T295965]] ', diff saved to https://phabricator.wikimedia.org/P18718 and previous config saved to /var/cache/conftool/dbconfig/20220113-085906-marostegui.json
* 08:42 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1003.eqiad.wmnet with OS buster
* 08:39 elukey: ipmi mc reset cold for kafka-main1002, mgmt interface not reachable via ssh
* 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges group from s7 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18717 and previous config saved to /var/cache/conftool/dbconfig/20220113-083923-marostegui.json
* 08:28 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:753505{{!}}Take LogicException into consideration (T299111)]] (duration: 01m 28s)
* 08:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:21 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.17/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:753504{{!}}Take LogicException into consideration (T299111)]] (duration: 01m 28s)
* 08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:08 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main1003.eqiad.wmnet with OS buster
* 08:06 marostegui: Change innodb_checksum_algorithm=full_crc32 on eqiad sanitarium hosts (db1154, db1155) [[phab:T287244|T287244]]
* 08:02 elukey: ipmi mc reset cold for kafka-main1003, mgmt interface not reachable via ssh
* 07:57 elukey: stop kafka* on kafka-main1003 as prep-step for reimage to buster
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked group from s7 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18715 and previous config saved to /var/cache/conftool/dbconfig/20220113-075012-marostegui.json
* 07:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1015.eqiad.wmnet with OS bullseye
* 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1015.eqiad.wmnet with OS bullseye
* 06:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:41 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.16/includes/export/WikiExporter.php: Backport: [[gerrit:753501{{!}}export: Remove ignoring rev_page_id index (T163532)]] (duration: 01m 28s)
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: repooling after maintenance and reimage', diff saved to https://phabricator.wikimedia.org/P18714 and previous config saved to /var/cache/conftool/dbconfig/20220113-064113-root.json
* 06:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:38 marostegui: Failover m3 proxy from dbproxy1016 to dbproxy1020 [[phab:T298586|T298586]]
* 06:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:26 marostegui: Remove rev_page_id from frwiki,jawiki,ruwiki and labswiki from db1096 (s6) [[phab:T285149|T285149]]
* 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: repooling after maintenance and reimage', diff saved to https://phabricator.wikimedia.org/P18713 and previous config saved to /var/cache/conftool/dbconfig/20220113-062609-root.json
* 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: repooling after maintenance and reimage', diff saved to https://phabricator.wikimedia.org/P18712 and previous config saved to /var/cache/conftool/dbconfig/20220113-061105-root.json
* 06:05 tstarling@deploy1002: Synchronized php-1.38.0-wmf.17/includes/libs/rdbms/database/Database.php: (no justification provided) (duration: 01m 27s)
* 05:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 05:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 05:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: repooling after maintenance and reimage', diff saved to https://phabricator.wikimedia.org/P18711 and previous config saved to /var/cache/conftool/dbconfig/20220113-055602-root.json
* 05:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 05:53 tstarling@deploy1002: Synchronized php-1.38.0-wmf.17/tests/phpunit/unit/includes/libs/rdbms/database/DatabaseSQLTest.php: (no justification provided) (duration: 01m 32s)
* 05:00 TimStarling: doing [[phab:T299095|T299095]] restorations on s3 wikis
* 04:30 TimStarling: on mwmaint1002: inserting 11565 rows into itwiki.pagelinks for [[phab:T299095|T299095]]
* 03:33 TimStarling: on mwmaint1002: inserting 1714288 rows into wikidatawiki.pagelinks for [[phab:T299095|T299095]]
* 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:30 TimStarling: on mwmaint1002: inserting 4221344 rows into commonswiki.pagelinks to clean up from [[phab:T299095|T299095]]
* 02:29 tstarling@deploy1002: Synchronized php-1.38.0-wmf.16/maintenance/sql.php: batch size (duration: 01m 28s)
* 00:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:31 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:752751{{!}}Enable CirrusSearch on it/en Wikivoyage]] (duration: 01m 28s)
* 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:24 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:752760{{!}}Skip vector-2022 skin in config, not Vector skin (T298923)]] (duration: 01m 29s)
* 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:11 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:753584{{!}}Enable Disambiguator notifications on all wikis (T293319)]] (duration: 01m 28s)
* 00:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn


== 2020-10-29 ==
== 2022-01-12 ==
* 23:59 eileen: process-control config revision is {{Gerrit|6891d35bce}}
* 23:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:39 Urbanecm: Evening B&C window done
* 23:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:38 Urbanecm: urbanecm@mwmaint1002:~$ mwscript namespaceDupes.php --wiki=trwikiquote --add-prefix=BROKEN --fix # [[phab:T266605|T266605]] # P13112
* 23:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:37 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ddb7e08e9c1d07f704c9f7585d8b6089f1895b5c}}: Add namespace aliases to Turkish Wikiquote ([[phab:T266605|T266605]]) (duration: 00m 57s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:36 eileen: process-control config revision is {{Gerrit|1114512f90}}
* 23:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 23:29 Urbanecm: urbanecm@
* 23:29 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.17
* 23:07 jhathaway: rebooting mx1001 to get old kernel
* 22:48 cwhite: end eqiad opensearch upgrade [[phab:T288621|T288621]]
* 21:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18709 and previous config saved to /var/cache/conftool/dbconfig/20220112-214258-marostegui.json
* 21:28 mbsantos: mbsantos@maps1009.eqiad.wmnet: start imposm-initial-import  - full planet re-import ([[phab:T299049|T299049]])
* 21:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P18708 and previous config saved to /var/cache/conftool/dbconfig/20220112-212753-marostegui.json
* 21:19 ryankemper: [WDQS] [[phab:T299098|T299098]] depooled `wdqs2003` so dc-ops can take a look at the PS2 failure
* 21:18 joal@deploy1002: Finished deploy [analytics/refinery@988b7d2] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@988b7d2] (duration: 06m 57s)


== 2020-10-28 ==
== 2022-01-11 ==
* 23:54 ryankemper: Canary `wdqs1003` tests pass, proceeding with wdqs deploy to rest of fleet
* 23:56 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 23:52 ryankemper@deploy1001: Started deploy [wdqs/wdqs@8c97b17]: 0.3.53
* 23:48 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 23:52 ryankemper@deploy1001: deploy aborted:  0.3.53 (duration: 00m 00s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 23:52 ryankemper@deploy1001: Started deploy [wdqs/wdqs@8c97b17]: 0.3.53
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:54 mutante: scandium - scap pull after reinstalling OS
* 23:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 22:14 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 22:12 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:41 ryankemper: Disabled elasticsearch "saneitizer" systemd timer in eqiad due to checker jobs falling behind: `sudo systemctl disable mediawiki_job_cirrus_sanitize_jobs.timer` on `mwmaint1002`
* 23:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:22 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 23:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:05 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 21:05 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 23:05 dduvall@deploy1002: Synchronized php-1.38.0-wmf.17/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Backport: [[gerrit:753071{{!}}Watchlist API update: Call correct method (T298999)]] (duration: 02m 40s)
* 20:50 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 23:04 dduvall: syncing backport to fix VE regression that followed testwiki/group0 deployment (cc [[phab:T293958|T293958]])
* 20:22 ladsgroup@deploy1001: Synchronized static/images/project-logos: Changing logo of Wikidata for the birthday (duration: 00m 58s)
* 21:29 mutante: mw1418 - apt-get remove --purge fonts*; apt-get remove --purge xfonts*; running puppet - nothing gets reinstalled and with --purge it means 'dpkg -l {{!}} grep fonts' is actually empty, not full of "rc" still - [[phab:T294378|T294378]]
* 19:56 jgleeson: updated Smashpig from {{Gerrit|2246685626}} to {{Gerrit|09f29c1da5}}
* 21:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18615 and previous config saved to /var/cache/conftool/dbconfig/20220111-211134-marostegui.json
* 19:53 herron@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97)
* 20:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P18614 and previous config saved to /var/cache/conftool/dbconfig/20220111-205629-marostegui.json
* 19:53 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 20:56 mutante: mw1418 (lowest numbered canary appserver that we use for httpbb hourly tests on cumin1001) - apt-get autoremove - removed font* and python3* packages - reason: [[phab:T294378|T294378]]
* 19:50 herron@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:36 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:36 herron@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:36 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:30 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:42 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1009.eqiad.wmnet
* 19:30 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 20:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P18613 and previous config saved to /var/cache/conftool/dbconfig/20220111-204124-marostegui.json
* 19:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:38 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1009.eqiad.wmnet
* 19:20 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:38 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 18:56 tgr_: Morning deploys done
* 20:36 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1008.eqiad.wmnet
* 18:55 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:636983{{!}}Temporary enable 'editpage' warn logging (T251023)]] (duration: 00m 57s)
* 20:32 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1008.eqiad.wmnet
* 18:51 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:31 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1007.eqiad.wmnet
* 18:51 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:31 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1032.eqiad.wmnet
* 18:47 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1007.eqiad.wmnet
* 18:46 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:636791{{!}}Revert "cirrus: Hardcode more_like to codfw cirrus cluster"]] (duration: 00m 56s)
* 20:27 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1032.eqiad.wmnet
* 18:45 tgr@deploy1001: Synchronized wmf-config/PoolCounterSettings.php: Config: [[gerrit:636956{{!}}Revert "Revert "Increase cirrus morelike pool counter by 20%"" ()]] (duration: 00m 57s)
* 20:26 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1031.eqiad.wmnet
* 18:43 volans@cumin1001: START - Cookbook sre.dns.netbox
* 20:26 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1030.eqiad.wmnet
* 18:40 tgr@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/GrowthExperiments/includes/HomepageModules/SuggestedEdits.php: Backport: [[gerrit:636787{{!}}Suggested edits: Include page ID with task preview data (T266600)]] (duration: 00m 59s)
* 20:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18612 and previous config saved to /var/cache/conftool/dbconfig/20220111-202620-marostegui.json
* 18:19 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:619880{{!}}Removing obsolete license definition]] (duration: 01m 00s)
* 20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18611 and previous config saved to /var/cache/conftool/dbconfig/20220111-202513-marostegui.json
* 18:11 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 18:07 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 18:06 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18610 and previous config saved to /var/cache/conftool/dbconfig/20220111-202505-marostegui.json
* 18:02 elukey@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 20:23 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1031.eqiad.wmnet
* 17:46 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 20:23 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1030.eqiad.wmnet
* 17:46 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:17 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1024.eqiad.wmnet
* 17:30 hnowlan: reimporting OSM data for eqiad
* 20:17 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1025.eqiad.wmnet
* 17:24 hnowlan: removing OSM database on maps1004
* 20:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P18609 and previous config saved to /var/cache/conftool/dbconfig/20220111-201000-marostegui.json
* 16:35 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:09 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1025.eqiad.wmnet
* 16:34 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 20:08 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1024.eqiad.wmnet
* 16:34 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:01 dduvall@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]] (duration: 39m 38s)
* 16:34 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 19:59 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM logstash1023.eqiad.wmnet
* 16:34 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P18608 and previous config saved to /var/cache/conftool/dbconfig/20220111-195456-marostegui.json
* 16:34 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 19:53 cwhite@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM logstash1023.eqiad.wmnet
* 16:22 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,cluster=maps,service=kartotherian-ssl,name=maps1004.eqiad.wmnet
* 19:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18607 and previous config saved to /var/cache/conftool/dbconfig/20220111-193951-marostegui.json
* 16:22 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,cluster=maps,service=kartotherian,name=maps1004.eqiad.wmnet
* 19:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18606 and previous config saved to /var/cache/conftool/dbconfig/20220111-193844-marostegui.json
* 16:18 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,cluster=kartotherian,service=kartotherian,name=maps1004.eqiad.wmnet
* 19:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 16:16 hnowlan: Disabling tilerator in eqiad
* 19:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 16:15 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18605 and previous config saved to /var/cache/conftool/dbconfig/20220111-193836-marostegui.json
* 16:15 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 19:30 sukhe: upload pdns-recursor_4.6.0-1wm1 to apt.wm.o (buster) - [[phab:T252132|T252132]]
* 16:06 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 19:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:05 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 19:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:03 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:51 Amir1: restarting uwsgi on ores in eqiad
* 19:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P18604 and previous config saved to /var/cache/conftool/dbconfig/20220111-192331-marostegui.json
* 15:49 elukey@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:33 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 19:21 dduvall@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.17  refs [[phab:T293958|T293958]]
* 15:33 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 19:17 sukhe@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum1002.eqiad.wmnet
* 15:24 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 19:13 sukhe@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM durum1002.eqiad.wmnet
* 15:24 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 19:13 sukhe@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM durum1001.eqiad.wmnet
* 15:23 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 19:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P18603 and previous config saved to /var/cache/conftool/dbconfig/20220111-190827-marostegui.json
* 15:10 godog: roll restart logstash5 in codfw
* 19:05 sukhe@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM durum1001.eqiad.wmnet
* 14:50 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:05 sukhe@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh1002.wikimedia.org
* 14:05 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'eventrouter' .
* 19:04 dduvall@deploy1002: Pruned MediaWiki: 1.38.0-wmf.9 (duration: 15m 51s)
* 13:54 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'kube-system' for release 'eventrouter' .
* 19:01 sukhe@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM doh1002.wikimedia.org
* 12:39 moritzm: installing libdatetime-timezone-perl  updates
* 19:00 sukhe@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM doh1001.wikimedia.org
* 11:46 XioNoX: configure urpf strict log-only on cr3-ulsfo:et-0/0/1.501 - [[phab:T266561|T266561]]
* 18:58 sukhe@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM doh1001.wikimedia.org
* 10:39 ema: due to [[phab:T266651|T266651]], cancel the entry above: A:cp upgrade libvmod-netmapper to 1.9-1 [[phab:T266567|T266567]] [[phab:T264398|T264398]]
* 18:57 ebernhardson: clear wcqs.jnl and aliases.map for all wcqs instances [[phab:T296470|T296470]]
* 10:38 elukey: clean up 10.64.5.7 and 2620:0:861:104:10:64:5:7 from Netbox (records mistakenly allocated via the makevm cookbook) - [[phab:T266648|T266648]]
* 18:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18602 and previous config saved to /var/cache/conftool/dbconfig/20220111-185322-marostegui.json
* 10:35 elukey@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97)
* 18:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:25 ema: A:cp (except cp3052, running varnish 5) upgrade libvmod-netmapper to 1.9-1 [[phab:T266567|T266567]] [[phab:T264398|T264398]]
* 18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18601 and previous config saved to /var/cache/conftool/dbconfig/20220111-185215-marostegui.json
* 10:20 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 18:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 09:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 09:52 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:50 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18600 and previous config saved to /var/cache/conftool/dbconfig/20220111-185208-marostegui.json
* 09:49 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 18:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:26 jayme: imported kubeyaml 0.0.3~20201027+git5f5556c-1 to buster-wikimedia
* 18:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:41 _joe_: also ran apt-get autoremove on mwdebug1002
* 09:02 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 18:41 _joe_: installed scap 4.1.1 on mwdebug1002 [[phab:T298986|T298986]], ran scap pull successfully
* 08:37 jynus: updated dump grants on db2093
* 18:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18599 and previous config saved to /var/cache/conftool/dbconfig/20220111-183703-marostegui.json
* 07:53 volans: upgraded python3-wmflib to 0.0.3 on the cumin hosts - [[phab:T257905|T257905]]
* 18:34 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-test-coord1002.eqiad.wmnet with OS buster
* 07:40 godog: update thanos-fe1002 to thanos 0.16.0 - [[phab:T261281|T261281]]
* 18:29 _joe_: uploaded scap 4.1.1-1 to apt [[phab:T298986|T298986]]
* 07:22 godog: swift codfw-prod: bump object weight for ms-be2057 - [[phab:T261633|T261633]]
* 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18598 and previous config saved to /var/cache/conftool/dbconfig/20220111-182158-marostegui.json
* 04:43 ryankemper: [[phab:T266492|T266492]] Finished rolling restart of codfw cirrus cluster
* 18:08 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host an-test-coord1002.eqiad.wmnet with OS buster
* 04:43 ryankemper@cumin2001: END (PASS) - Cookbook sre.elasticsearch.rolling-restart (exit_code=0)
* 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18597 and previous config saved to /var/cache/conftool/dbconfig/20220111-180653-marostegui.json
* 02:58 ryankemper: [[phab:T266492|T266492]] Beginning rolling restart of codfw cirrus cluster, 3 nodes at a time, on `ryankemper@cumin2001` tmux session `elasticsearch_restart_codfw`
* 18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18596 and previous config saved to /var/cache/conftool/dbconfig/20220111-180547-marostegui.json
* 02:57 ryankemper@cumin2001: START - Cookbook sre.elasticsearch.rolling-restart
* 18:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 02:12 eileen: tools revision changed from {{Gerrit|a2a91d6c6a}} to {{Gerrit|087a596d3a}}
* 18:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 00:40 eileen: civicrm revision changed from {{Gerrit|4fdfb8408b}} to {{Gerrit|e1d65b0f3a}}, config revision is {{Gerrit|f16003ab62}}
* 18:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 18:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
* 18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18595 and previous config saved to /var/cache/conftool/dbconfig/20220111-180534-marostegui.json
* 17:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P18594 and previous config saved to /var/cache/conftool/dbconfig/20220111-175029-marostegui.json
* 17:44 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2009.codfw.wmnet
* 17:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P18593 and previous config saved to /var/cache/conftool/dbconfig/20220111-173524-marostegui.json
* 17:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18592 and previous config saved to /var/cache/conftool/dbconfig/20220111-172019-marostegui.json
* 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18591 and previous config saved to /var/cache/conftool/dbconfig/20220111-171912-marostegui.json
* 17:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 17:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18590 and previous config saved to /var/cache/conftool/dbconfig/20220111-171905-marostegui.json
* 17:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources (duration: 02m 04s)
* 17:12 vgutierrez@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir1002.eqiad.wmnet
* 17:11 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources
* 17:10 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources (duration: 03m 33s)
* 17:08 vgutierrez@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ncredir1002.eqiad.wmnet
* 17:07 vgutierrez@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ncredir1001.eqiad.wmnet
* 17:07 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@65895c0]: Remove cassandra from kartotherian sources
* 17:06 bking@cumin1001: START - Cookbook sre.wdqs.data-reload
* 17:06 bking@cumin1001: START - Cookbook sre.wdqs.data-reload
* 17:04 bking@cumin1001: START - Cookbook sre.wdqs.data-reload
* 17:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18589 and previous config saved to /var/cache/conftool/dbconfig/20220111-170400-marostegui.json
* 17:03 bking@cumin1001: START - Cookbook sre.wdqs.data-reload
* 17:03 vgutierrez@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ncredir1001.eqiad.wmnet
* 17:03 bking@cumin1001: START - Cookbook sre.wdqs.data-reload
* 17:00 bking@cumin1001: START - Cookbook sre.wdqs.data-reload
* 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18588 and previous config saved to /var/cache/conftool/dbconfig/20220111-164856-marostegui.json
* 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18587 and previous config saved to /var/cache/conftool/dbconfig/20220111-163351-marostegui.json
* 16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18586 and previous config saved to /var/cache/conftool/dbconfig/20220111-163244-marostegui.json
* 16:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 16:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18585 and previous config saved to /var/cache/conftool/dbconfig/20220111-163237-marostegui.json
* 16:29 arturo: aborrero@apt1001:~ $ sudo -i reprepro clearvanished
* 16:23 arturo: aborrero@apt1001:~ $ sudo -i reprepro --noskipold --component thirdparty/kubeadm-k8s-1-21 update buster-wikimedia
* 16:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18584 and previous config saved to /var/cache/conftool/dbconfig/20220111-161732-marostegui.json
* 16:03 cwhite: begin rolling restart of opensearch in codfw - jvm upgrade
* 16:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18583 and previous config saved to /var/cache/conftool/dbconfig/20220111-160227-marostegui.json
* 15:59 vgutierrez: re-enable puppet on acme-chief clients after acmechief1001 reboot - [[phab:T294120|T294120]]
* 15:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief1001.eqiad.wmnet
* 15:56 vgutierrez@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM acmechief1001.eqiad.wmnet
* 15:56 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2009.codfw.wmnet with reason: Decommissioning - hnowlan
* 15:56 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2009.codfw.wmnet with reason: Decommissioning - hnowlan
* 15:55 vgutierrez: disable puppet on acme-chief clients for acmechief1001 reboot - [[phab:T294120|T294120]]
* 15:52 vgutierrez@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief-test1001.eqiad.wmnet
* 15:51 ebernhardson: restart elasticsearch_6@production-search-psi-eqiad on elastic1049 to resolve issue with full heap
* 15:47 vgutierrez@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM acmechief-test1001.eqiad.wmnet
* 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18582 and previous config saved to /var/cache/conftool/dbconfig/20220111-154722-marostegui.json
* 15:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18580 and previous config saved to /var/cache/conftool/dbconfig/20220111-154615-marostegui.json
* 15:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18579 and previous config saved to /var/cache/conftool/dbconfig/20220111-154608-marostegui.json
* 15:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18578 and previous config saved to /var/cache/conftool/dbconfig/20220111-153103-marostegui.json
* 15:30 hnowlan: Decommissioning cassandra instance restbase2009-a via nodetool
* 15:22 arnoldokoth: systemctl reset-failed ifup@ens5.service on otrs1001 [[phab:T273026|T273026]]
* 15:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18577 and previous config saved to /var/cache/conftool/dbconfig/20220111-151558-marostegui.json
* 15:10 aokoth@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM otrs1001.eqiad.wmnet
* 15:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM rpki1001.eqiad.wmnet
* 15:04 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM rpki1001.eqiad.wmnet
* 15:02 aokoth@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM otrs1001.eqiad.wmnet
* 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18576 and previous config saved to /var/cache/conftool/dbconfig/20220111-150054-marostegui.json
* 15:00 aokoth@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1002.eqiad.wmnet
* 14:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18575 and previous config saved to /var/cache/conftool/dbconfig/20220111-145947-marostegui.json
* 14:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 14:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 14:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18574 and previous config saved to /var/cache/conftool/dbconfig/20220111-145939-marostegui.json
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM zookeeper-test1002.eqiad.wmnet
* 14:56 aokoth@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1002.eqiad.wmnet
* 14:48 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM zookeeper-test1002.eqiad.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ping1002.eqiad.wmnet
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ping1002.eqiad.wmnet
* 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18573 and previous config saved to /var/cache/conftool/dbconfig/20220111-144435-marostegui.json
* 14:38 XioNoX: disable ping-offload in eqiad
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:35 marostegui: Upgrade pc1014 mysql
* 14:33 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751949{{!}}Clean up nova-network remains]] (2/2) (duration: 02m 40s)
* 14:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:31 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:751949{{!}}Clean up nova-network remains]] (1/2) (duration: 02m 49s)
* 14:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18572 and previous config saved to /var/cache/conftool/dbconfig/20220111-142930-marostegui.json
* 14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:25 taavi@deploy1002: Synchronized wmf-config/reverse-proxy.php: Config: [[gerrit:751952{{!}}reverse-proxy: add drmrs ranges (T282787)]] (duration: 01m 36s)
* 14:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1021.eqiad.wmnet with OS bullseye
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18571 and previous config saved to /var/cache/conftool/dbconfig/20220111-141425-marostegui.json
* 14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18570 and previous config saved to /var/cache/conftool/dbconfig/20220111-141318-marostegui.json
* 14:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 14:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 14:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
* 14:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
* 14:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 14:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18569 and previous config saved to /var/cache/conftool/dbconfig/20220111-141249-marostegui.json
* 13:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18568 and previous config saved to /var/cache/conftool/dbconfig/20220111-135744-marostegui.json
* 13:50 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1021.eqiad.wmnet with OS bullseye
* 13:43 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
* 13:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18567 and previous config saved to /var/cache/conftool/dbconfig/20220111-134239-marostegui.json
* 13:36 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
* 13:36 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
* 13:33 moritzm: installing 4.9.290 kernels on stretch systems (no reboots yet)
* 13:29 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
* 13:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18565 and previous config saved to /var/cache/conftool/dbconfig/20220111-132734-marostegui.json
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18564 and previous config saved to /var/cache/conftool/dbconfig/20220111-132627-marostegui.json
* 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 13:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM people1003.eqiad.wmnet
* 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:07 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM people1003.eqiad.wmnet
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM planet1002.eqiad.wmnet
* 12:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM planet1002.eqiad.wmnet
* 12:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18563 and previous config saved to /var/cache/conftool/dbconfig/20220111-122143-marostegui.json
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:15 cparle@deploy1002: Synchronized wmf-config: Config: [[gerrit:752599{{!}}Enable support for references (T230315)]] (duration: 01m 00s)
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubetcd2004.codfw.wmnet with reason: switch to plain disk storage
* 12:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18562 and previous config saved to /var/cache/conftool/dbconfig/20220111-121025-root.json
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18561 and previous config saved to /var/cache/conftool/dbconfig/20220111-120638-marostegui.json
* 12:00 moritzm: reverting kubetcd2004.codfw.wmnet back to "plain" storage
* 11:56 moritzm: rebalance ganeti row A (all nodes reimaged to Buster)
* 11:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18560 and previous config saved to /var/cache/conftool/dbconfig/20220111-115522-root.json
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18559 and previous config saved to /var/cache/conftool/dbconfig/20220111-115133-marostegui.json
* 11:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet
* 11:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18558 and previous config saved to /var/cache/conftool/dbconfig/20220111-114018-root.json
* 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18557 and previous config saved to /var/cache/conftool/dbconfig/20220111-113628-marostegui.json
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
* 11:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18556 and previous config saved to /var/cache/conftool/dbconfig/20220111-113216-marostegui.json
* 11:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 11:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 11:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18555 and previous config saved to /var/cache/conftool/dbconfig/20220111-113208-marostegui.json
* 11:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2023.codfw.wmnet
* 11:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18554 and previous config saved to /var/cache/conftool/dbconfig/20220111-112514-root.json
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2023.codfw.wmnet
* 11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P18553 and previous config saved to /var/cache/conftool/dbconfig/20220111-111704-marostegui.json
* 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P18551 and previous config saved to /var/cache/conftool/dbconfig/20220111-110159-marostegui.json
* 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18550 and previous config saved to /var/cache/conftool/dbconfig/20220111-104654-marostegui.json
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18549 and previous config saved to /var/cache/conftool/dbconfig/20220111-103941-marostegui.json
* 10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18548 and previous config saved to /var/cache/conftool/dbconfig/20220111-103927-marostegui.json
* 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18547 and previous config saved to /var/cache/conftool/dbconfig/20220111-102421-marostegui.json
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18546 and previous config saved to /var/cache/conftool/dbconfig/20220111-100917-marostegui.json
* 09:58 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2019.codfw.wmnet with OS buster
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18545 and previous config saved to /var/cache/conftool/dbconfig/20220111-095408-marostegui.json
* 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18544 and previous config saved to /var/cache/conftool/dbconfig/20220111-095254-marostegui.json
* 09:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 09:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18543 and previous config saved to /var/cache/conftool/dbconfig/20220111-095246-marostegui.json
* 09:51 jayme@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 09:40 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubestagemaster1001.eqiad.wmnet
* 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18542 and previous config saved to /var/cache/conftool/dbconfig/20220111-093741-marostegui.json
* 09:35 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubemaster1001.eqiad.wmnet
* 09:33 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubestagemaster1001.eqiad.wmnet
* 09:29 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubemaster1001.eqiad.wmnet
* 09:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18541 and previous config saved to /var/cache/conftool/dbconfig/20220111-092706-ladsgroup.json
* 09:25 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2019.codfw.wmnet with OS buster
* 09:23 ema: cp4021 (upload), cp4027 (text): upgrade varnish to 6.0.9-1wm1 [[phab:T298758|T298758]]
* 09:23 hashar: Upgrading Jenkins and Apache on releases1002 & releases2002
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18540 and previous config saved to /var/cache/conftool/dbconfig/20220111-092236-marostegui.json
* 09:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2078.codfw.wmnet with OS bullseye
* 09:15 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubemaster1002.eqiad.wmnet
* 09:13 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubemaster1002.eqiad.wmnet
* 09:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P18539 and previous config saved to /var/cache/conftool/dbconfig/20220111-091201-ladsgroup.json
* 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2023.codfw.wmnet with OS buster
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18538 and previous config saved to /var/cache/conftool/dbconfig/20220111-090732-marostegui.json
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18537 and previous config saved to /var/cache/conftool/dbconfig/20220111-090119-marostegui.json
* 09:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 09:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18536 and previous config saved to /var/cache/conftool/dbconfig/20220111-090111-marostegui.json
* 08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P18535 and previous config saved to /var/cache/conftool/dbconfig/20220111-085656-ladsgroup.json
* 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2078.codfw.wmnet with OS bullseye
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P18534 and previous config saved to /var/cache/conftool/dbconfig/20220111-084606-marostegui.json
* 08:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18533 and previous config saved to /var/cache/conftool/dbconfig/20220111-084151-ladsgroup.json
* 08:40 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2023.codfw.wmnet with OS buster
* 08:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2124.codfw.wmnet
* 08:33 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2124.codfw.wmnet
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18532 and previous config saved to /var/cache/conftool/dbconfig/20220111-083322-ladsgroup.json
* 08:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18531 and previous config saved to /var/cache/conftool/dbconfig/20220111-083314-ladsgroup.json
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P18530 and previous config saved to /var/cache/conftool/dbconfig/20220111-083102-marostegui.json
* 08:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1020.eqiad.wmnet with OS bullseye
* 08:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P18529 and previous config saved to /var/cache/conftool/dbconfig/20220111-081809-ladsgroup.json
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18528 and previous config saved to /var/cache/conftool/dbconfig/20220111-081557-marostegui.json
* 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18527 and previous config saved to /var/cache/conftool/dbconfig/20220111-081442-marostegui.json
* 08:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 08:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 08:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
* 08:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
* 08:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 08:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
* 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18526 and previous config saved to /var/cache/conftool/dbconfig/20220111-081400-marostegui.json
* 08:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P18525 and previous config saved to /var/cache/conftool/dbconfig/20220111-080305-ladsgroup.json
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P18524 and previous config saved to /var/cache/conftool/dbconfig/20220111-075856-marostegui.json
* 07:55 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1020.eqiad.wmnet with OS bullseye
* 07:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18523 and previous config saved to /var/cache/conftool/dbconfig/20220111-074800-ladsgroup.json
* 07:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2117.codfw.wmnet
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P18522 and previous config saved to /var/cache/conftool/dbconfig/20220111-074351-marostegui.json
* 07:42 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2117.codfw.wmnet
* 07:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18521 and previous config saved to /var/cache/conftool/dbconfig/20220111-074202-ladsgroup.json
* 07:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 07:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 07:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18520 and previous config saved to /var/cache/conftool/dbconfig/20220111-074154-ladsgroup.json
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18519 and previous config saved to /var/cache/conftool/dbconfig/20220111-072847-marostegui.json
* 07:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P18518 and previous config saved to /var/cache/conftool/dbconfig/20220111-072649-ladsgroup.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18517 and previous config saved to /var/cache/conftool/dbconfig/20220111-071729-marostegui.json
* 07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18516 and previous config saved to /var/cache/conftool/dbconfig/20220111-071721-marostegui.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18515 and previous config saved to /var/cache/conftool/dbconfig/20220111-071254-root.json
* 07:12 taavi: extensions/CentralAuth/maintenance/migrateHiddenLevel.php finished - [[phab:T289068|T289068]]
* 07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P18514 and previous config saved to /var/cache/conftool/dbconfig/20220111-071144-ladsgroup.json
* 07:07 marostegui: Failover m2 proxy from dbproxy1015 to dbproxy1013 [[phab:T298586|T298586]]
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P18513 and previous config saved to /var/cache/conftool/dbconfig/20220111-070216-marostegui.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18512 and previous config saved to /var/cache/conftool/dbconfig/20220111-065750-root.json
* 06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18511 and previous config saved to /var/cache/conftool/dbconfig/20220111-065640-ladsgroup.json
* 06:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2114.codfw.wmnet
* 06:51 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2114.codfw.wmnet
* 06:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 ([[phab:T296143|T296143]])', diff saved to https://phabricator.wikimedia.org/P18510 and previous config saved to /var/cache/conftool/dbconfig/20220111-065118-ladsgroup.json
* 06:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
* 06:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
* 06:50 Amir1: upgrading mysql on ['db2114', 'db2117', 'db2124']
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P18509 and previous config saved to /var/cache/conftool/dbconfig/20220111-064712-marostegui.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18508 and previous config saved to /var/cache/conftool/dbconfig/20220111-064247-root.json
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18507 and previous config saved to /var/cache/conftool/dbconfig/20220111-063207-marostegui.json
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18506 and previous config saved to /var/cache/conftool/dbconfig/20220111-063052-marostegui.json
* 06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1012.eqiad.wmnet with OS bullseye
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P18505 and previous config saved to /var/cache/conftool/dbconfig/20220111-062743-root.json
* 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es2032 after Bullseye reimage [[phab:T295965|T295965]]', diff saved to https://phabricator.wikimedia.org/P18504 and previous config saved to /var/cache/conftool/dbconfig/20220111-062620-marostegui.json
* 06:21 taavi: starting extensions/CentralAuth/maintenance/migrateHiddenLevel.php on a mwmaint1002 screen session - [[phab:T289068|T289068]]
* 06:00 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1012.eqiad.wmnet with OS bullseye
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18503 and previous config saved to /var/cache/conftool/dbconfig/20220111-054417-marostegui.json
* 05:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 02:41 eileen: revision {{Gerrit|d90542c2}} -> {{Gerrit|2956a622}} (latest)
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:42 eileen: revision {{Gerrit|277989d7}} -> {{Gerrit|d90542c2}} (latest) civicrm
* 00:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:24 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.16/skins/Vector/resources/skins.vector.js/dropdownMenus.js: {{Gerrit|79b33f2}}: Fix TypeError: document.querySelectorAll(...).forEach is not a function ([[phab:T298910|T298910]]) (duration: 00m 59s)
* 00:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn


== 2020-10-27 ==
== 2022-01-10 ==
* 22:20 mutante: systemctl reset-failed on various servers to see which are coming back later from failed auto_restart and which don't
* 22:36 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 21:40 mutante: mwmaint2001 - systemctl reset-failed - mediawiki_job_parser_cache_purging.service
* 22:34 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 20:56 mutante: ms-be1057 is network down but running, NO-CARRIER on NIC, cable disconnected?
* 20:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18502 and previous config saved to /var/cache/conftool/dbconfig/20220110-202728-marostegui.json
* 20:43 mutante: releases2002 - systemctl reset-failed .. after removing wmf_auto_restart_rsync
* 20:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P18501 and previous config saved to /var/cache/conftool/dbconfig/20220110-201224-marostegui.json
* 20:13 mutante: gerrit1001/gerrit2001: manually deleting list_mediawiki_extensions cron job ([[phab:T266024|T266024]])
* 19:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P18500 and previous config saved to /var/cache/conftool/dbconfig/20220110-195719-marostegui.json
* 19:40 eileen: civicrm revision changed from {{Gerrit|bb7c08bf6d}} to {{Gerrit|4fdfb8408b}}, config revision is {{Gerrit|f16003ab62}}
* 19:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18499 and previous config saved to /var/cache/conftool/dbconfig/20220110-194214-marostegui.json
* 18:35 oblivian@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 19:32 ejegg: updated fundraising civicrm from {{Gerrit|3d334f30}} to {{Gerrit|277989d7}}
* 17:55 otto@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 19:29 urbanecm: UTC evening B&C finished
* 17:55 otto@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 19:27 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|8f5ca9af5ef04d1d19759cdf201fc0c7e4ee6fbc}}: Enable TheWikipediaLibrary on most wikis ([[phab:T288070|T288070]]) (duration: 01m 00s)
* 17:46 otto@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 19:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:46 otto@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:44 otto@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:22 mutante: gerrit1001/2001 - sudo rm /var/www/mediawiki-extensions.txt
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:18 ejegg: updated payments-wiki from {{Gerrit|4c1503ad91}} to {{Gerrit|adc3369cb3}}
* 18:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18497 and previous config saved to /var/cache/conftool/dbconfig/20220110-184154-marostegui.json
* 16:34 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 18:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 16:34 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 18:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 16:05 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 18:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18496 and previous config saved to /var/cache/conftool/dbconfig/20220110-184147-marostegui.json
* 16:05 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 18:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P18495 and previous config saved to /var/cache/conftool/dbconfig/20220110-182642-marostegui.json
* 16:05 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 18:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P18494 and previous config saved to /var/cache/conftool/dbconfig/20220110-181137-marostegui.json
* 16:05 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 17:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18493 and previous config saved to /var/cache/conftool/dbconfig/20220110-175633-marostegui.json
* 16:05 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18492 and previous config saved to /var/cache/conftool/dbconfig/20220110-175503-marostegui.json
* 15:59 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 17:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 15:42 mepps: updated payments-wiki-staging from {{Gerrit|5fdd29bc16}} to {{Gerrit|4c1503ad91}}
* 17:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 15:25 ema: cp4032: downgrade varnish to 6.0.4 [[phab:T264398|T264398]]
* 17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18491 and previous config saved to /var/cache/conftool/dbconfig/20220110-175455-marostegui.json
* 15:13 ema: cp4032: varnish-frontend-restart with libvmod-netmapper 1.9-1 [[phab:T266567|T266567]]
* 17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P18489 and previous config saved to /var/cache/conftool/dbconfig/20220110-173950-marostegui.json
* 14:55 ema: upload libvmod-netmapper 1.9-1 to buster-wikimedia component/varnish6 [[phab:T266567|T266567]]
* 17:34 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1016.eqiad.wmnet
* 14:49 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0)
* 17:32 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1016.eqiad.wmnet
* 14:48 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 17:30 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1015.eqiad.wmnet
* 14:40 _joe_: restarting envoyproxy on the jobrunners in codfw
* 17:28 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1015.eqiad.wmnet
* 14:36 akosiaris: rolling restart of all pods in codfw changeprop-jobqueue
* 17:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P18488 and previous config saved to /var/cache/conftool/dbconfig/20220110-172446-marostegui.json
* 14:27 _joe_: restart php-fpm on jobrunners in codfw
* 17:23 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1006.eqiad.wmnet
* 14:17 cdanis: ran puppet on alert1001
* 17:21 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1006.eqiad.wmnet
* 14:16 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0)
* 17:16 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kubernetes1005.eqiad.wmnet
* 14:15 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-update-tendril
* 17:14 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kubernetes1005.eqiad.wmnet
* 14:15 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-run-puppet-on-db-masters (exit_code=0)
* 17:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18487 and previous config saved to /var/cache/conftool/dbconfig/20220110-170941-marostegui.json
* 14:11 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-run-puppet-on-db-masters
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18486 and previous config saved to /var/cache/conftool/dbconfig/20220110-170811-marostegui.json
* 14:11 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
* 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 14:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
* 14:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18485 and previous config saved to /var/cache/conftool/dbconfig/20220110-170804-marostegui.json
* 14:09 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 16:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P18484 and previous config saved to /var/cache/conftool/dbconfig/20220110-165259-marostegui.json
* 14:09 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
* 16:52 ema: varnish 6.0.9-1wm1 uploaded to buster-wikimedia - component/varnish6 [[phab:T298758|T298758]]
* 14:09 rzl@cumin1001: MediaWiki read-only period ends at: 2020-10-27 14:09:02.873019
* 16:47 moritzm: installing 5.10.84 kernels on bullseye hosts (no reboots involved, just installing the new kernels in parallel)
* 14:09 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P18483 and previous config saved to /var/cache/conftool/dbconfig/20220110-163754-marostegui.json
* 14:06 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18482 and previous config saved to /var/cache/conftool/dbconfig/20220110-162249-marostegui.json
* 14:06 root@cumin1001: START - Cookbook sre.hosts.downtime
* 16:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2023.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage
* 14:05 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=99)
* 16:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2023.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage
* 14:04 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 16:21 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM registry1004.eqiad.wmnet
* 14:04 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=99)
* 16:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18481 and previous config saved to /var/cache/conftool/dbconfig/20220110-162122-marostegui.json
* 14:03 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 16:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 14:03 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
* 16:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 14:03 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
* 16:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18480 and previous config saved to /var/cache/conftool/dbconfig/20220110-162114-marostegui.json
* 14:03 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0)
* 16:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2019.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage
* 14:03 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions
* 16:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2019.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage
* 14:03 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0)
* 16:19 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM registry1004.eqiad.wmnet
* 14:02 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki
* 16:18 root@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0)
* 16:13 root@cumin1001: START - Cookbook sre.dns.netbox
* 14:02 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
* 16:09 damilare: process-control config {{Gerrit|ecf09aa0}} -> {{Gerrit|66e69bda}}
* 14:02 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0)
* 16:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P18479 and previous config saved to /var/cache/conftool/dbconfig/20220110-160608-marostegui.json
* 14:01 rzl@cumin1001: MediaWiki read-only period starts at: 2020-10-27 14:01:54.999830
* 16:00 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM chartmuseum1001.eqiad.wmnet
* 14:01 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
* 16:00 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM registry1003.eqiad.wmnet
* 13:56 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
* 15:57 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM registry1003.eqiad.wmnet
* 13:56 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
* 15:56 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM chartmuseum1001.eqiad.wmnet
* 13:55 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P18478 and previous config saved to /var/cache/conftool/dbconfig/20220110-155103-marostegui.json
* 13:55 root@cumin1001: START - Cookbook sre.hosts.downtime
* 15:49 jayme@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 13:54 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 15:49 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dragonfly-supernode1001.eqiad.wmnet
* 13:53 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 15:45 jayme@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM dragonfly-supernode1001.eqiad.wmnet
* 13:50 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18476 and previous config saved to /var/cache/conftool/dbconfig/20220110-153559-marostegui.json
* 13:49 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18475 and previous config saved to /var/cache/conftool/dbconfig/20220110-153429-marostegui.json
* 13:47 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 15:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 13:46 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 15:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 13:38 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18474 and previous config saved to /var/cache/conftool/dbconfig/20220110-153421-marostegui.json
* 13:37 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 15:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P18472 and previous config saved to /var/cache/conftool/dbconfig/20220110-151917-marostegui.json
* 13:36 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P18471 and previous config saved to /var/cache/conftool/dbconfig/20220110-150412-marostegui.json
* 13:35 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
* 14:55 jbond@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetdb1002.eqiad.wmnet
* 13:35 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
* 14:51 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 13:35 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 14:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:15 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:10 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:07 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:49 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:752277{{!}}Give priority to PreparedUpdate (T288639)]] (duration: 01m 00s)
* 13:04 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18470 and previous config saved to /var/cache/conftool/dbconfig/20220110-144907-marostegui.json
* 13:01 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:55 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18469 and previous config saved to /var/cache/conftool/dbconfig/20220110-144737-marostegui.json
* 12:55 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 12:51 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:35 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 11:27 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 11:25 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 11:21 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 11:19 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 11:14 ema: A:cp remove libvarnishapi1, replaced by libvarnishapi2 a while ago [[phab:T261487|T261487]]
* 14:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 11:13 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:12 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:06 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 11:02 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 10:54 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:36 jbond@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM puppetdb1002.eqiad.wmnet
* 10:52 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:32 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 10:46 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1001.wikimedia.org
* 10:44 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:27 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1001.wikimedia.org
* 10:40 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM moscovium.eqiad.wmnet
* 10:31 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 14:19 jelto: upload wmf-sre-laptop 0.5.3 deb package
* 10:27 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 14:19 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM moscovium.eqiad.wmnet
* 10:21 XioNoX: update policies from-zone production to-zone junos-host on mr1-eqiad - [[phab:T265589|T265589]]
* 14:07 jbond: disable puppet fleet wide for puppetdb restart
* 10:20 XioNoX: update policies from-zone production to-zone junos-host on mr1-eqsin - [[phab:T265589|T265589]]
* 13:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:19 XioNoX: update policies from-zone production to-zone junos-host on mr1-ulsfo - [[phab:T265589|T265589]]
* 13:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:15 XioNoX: update policies from-zone production to-zone junos-host on mr1-esams - [[phab:T265589|T265589]]
* 13:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:06 XioNoX: update policies from-zone production to-zone junos-host on mr1-codfw - [[phab:T265589|T265589]]
* 13:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 08:58 elukey@cumin1001: END (ERROR) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=97)
* 13:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
* 08:55 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 13:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
* 08:39 elukey@cumin1001: END (ERROR) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=97)
* 13:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 08:32 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 13:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
* 08:30 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0)
* 13:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 08:27 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 13:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 08:15 godog: update thanos-fe2002 to thanos 0.16.0 - [[phab:T261281|T261281]]
* 13:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 07:35 godog: swift codfw-prod: bump object weight for ms-be2057 - [[phab:T261633|T261633]]
* 13:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 06:58 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'kube-system' for release 'eventrouter' .
* 13:54 btullis: upgrading oozie packages in reprepro in order to pick up new log4j version
* 06:50 jayme: published docker-registry.discovery.wmnet/eventrouter:0.3.0-4
* 13:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2032.codfw.wmnet with OS bullseye
* 06:42 ryankemper: [[phab:T263970|T263970]] Set number of replicas to 2 (from previous value of 1) for all codfw indices matching `apifeatureusage*`, new shards have been assigned without issue
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18468 and previous config saved to /var/cache/conftool/dbconfig/20220110-131523-marostegui.json
* 13:02 moritzm: installing ghostscript security updates
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P18467 and previous config saved to /var/cache/conftool/dbconfig/20220110-130018-marostegui.json
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P18466 and previous config saved to /var/cache/conftool/dbconfig/20220110-124513-marostegui.json
* 12:44 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es2032.codfw.wmnet with OS bullseye
* 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2032 for Bullseye reimage [[phab:T295965|T295965]]', diff saved to https://phabricator.wikimedia.org/P18465 and previous config saved to /var/cache/conftool/dbconfig/20220110-124222-marostegui.json
* 12:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:36 taavi: UTC morning deploys done
* 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:34 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:752634{{!}}hewikisource: remove "קטע" namespace and its talk page (T298430)]] (duration: 00m 58s)
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18464 and previous config saved to /var/cache/conftool/dbconfig/20220110-123009-marostegui.json
* 12:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18463 and previous config saved to /var/cache/conftool/dbconfig/20220110-122847-marostegui.json
* 12:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 12:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18462 and previous config saved to /var/cache/conftool/dbconfig/20220110-122840-marostegui.json
* 12:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:24 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:752187{{!}}Growth: Add GEMentorDashboardDeploymentMode (T298792)]] (duration: 00m 59s)
* 12:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:18 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751545{{!}}uzwiki: Amend Babel configuration (T131924)]] (duration: 00m 59s)
* 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P18460 and previous config saved to /var/cache/conftool/dbconfig/20220110-121335-marostegui.json
* 12:10 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:747868{{!}}Add MediaSearch profiles (T297863)]] (duration: 00m 59s)
* 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P18459 and previous config saved to /var/cache/conftool/dbconfig/20220110-115830-marostegui.json
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18458 and previous config saved to /var/cache/conftool/dbconfig/20220110-114326-marostegui.json
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18457 and previous config saved to /var/cache/conftool/dbconfig/20220110-114305-marostegui.json
* 11:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 11:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 11:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance
* 11:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance
* 11:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 11:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 11:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 11:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 11:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 11:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 9 hosts with reason: Maintenance
* 11:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on 9 hosts with reason: Maintenance
* 11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18456 and previous config saved to /var/cache/conftool/dbconfig/20220110-114043-marostegui.json
* 11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18455 and previous config saved to /var/cache/conftool/dbconfig/20220110-112538-marostegui.json
* 11:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18454 and previous config saved to /var/cache/conftool/dbconfig/20220110-111034-marostegui.json
* 10:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18453 and previous config saved to /var/cache/conftool/dbconfig/20220110-105529-marostegui.json
* 10:53 moritzm: installing openjdk-11 security updates
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s7 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18452 and previous config saved to /var/cache/conftool/dbconfig/20220110-104445-marostegui.json
* 10:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18451 and previous config saved to /var/cache/conftool/dbconfig/20220110-104004-marostegui.json
* 10:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 10:38 elukey: stop/start kafka daemons on kafka-main1* nodes to move the kafka user to fixed uid/gid - [[phab:T296641|T296641]]
* 10:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 10:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
* 10:16 Amir1: removing echo objectcache entries on all wikis ([[phab:T272512|T272512]])
* 09:56 moritzm: migrating primary/secondary instances off ganeti2019
* 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18449 and previous config saved to /var/cache/conftool/dbconfig/20220110-093534-marostegui.json
* 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s7 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18448 and previous config saved to /var/cache/conftool/dbconfig/20220110-092605-marostegui.json
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P18447 and previous config saved to /var/cache/conftool/dbconfig/20220110-092029-marostegui.json
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P18446 and previous config saved to /var/cache/conftool/dbconfig/20220110-090525-marostegui.json
* 08:54 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all groups from s7 codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18445 and previous config saved to /var/cache/conftool/dbconfig/20220110-085402-marostegui.json
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18444 and previous config saved to /var/cache/conftool/dbconfig/20220110-085020-marostegui.json
* 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18443 and previous config saved to /var/cache/conftool/dbconfig/20220110-084912-marostegui.json
* 08:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18442 and previous config saved to /var/cache/conftool/dbconfig/20220110-084858-marostegui.json
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P18441 and previous config saved to /var/cache/conftool/dbconfig/20220110-083354-marostegui.json
* 08:25 moritzm: migrating primary/secondary instances off ganeti2023
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P18440 and previous config saved to /var/cache/conftool/dbconfig/20220110-081849-marostegui.json
* 08:13 marostegui: Drop table wikishared.wikimedia_editor_tasks_targets_passed [[phab:T264225|T264225]]
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18439 and previous config saved to /var/cache/conftool/dbconfig/20220110-080344-marostegui.json
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18438 and previous config saved to /var/cache/conftool/dbconfig/20220110-080236-marostegui.json
* 08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: Maintenance
* 08:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: Maintenance
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18437 and previous config saved to /var/cache/conftool/dbconfig/20220110-080225-marostegui.json
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P18436 and previous config saved to /var/cache/conftool/dbconfig/20220110-074720-marostegui.json
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P18435 and previous config saved to /var/cache/conftool/dbconfig/20220110-073216-marostegui.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18434 and previous config saved to /var/cache/conftool/dbconfig/20220110-071711-marostegui.json
* 07:16 marostegui: Failover m1 proxy from dbproxy1012 to dbproxy1014 [[phab:T298586|T298586]]
* 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18433 and previous config saved to /var/cache/conftool/dbconfig/20220110-071603-marostegui.json
* 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18432 and previous config saved to /var/cache/conftool/dbconfig/20220110-071556-marostegui.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P18431 and previous config saved to /var/cache/conftool/dbconfig/20220110-070051-marostegui.json
* 06:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1014.eqiad.wmnet with OS bullseye
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P18430 and previous config saved to /var/cache/conftool/dbconfig/20220110-064546-marostegui.json
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18429 and previous config saved to /var/cache/conftool/dbconfig/20220110-063042-marostegui.json
* 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1014.eqiad.wmnet with OS bullseye
* 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18428 and previous config saved to /var/cache/conftool/dbconfig/20220110-062934-marostegui.json
* 06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18427 and previous config saved to /var/cache/conftool/dbconfig/20220110-062925-marostegui.json
* 06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1013.eqiad.wmnet with OS bullseye
* 06:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 06:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 06:16 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:752270{{!}}Use PreparedUpdate to avoid double parse (T288639)]] (duration: 01m 00s)
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P18426 and previous config saved to /var/cache/conftool/dbconfig/20220110-061420-marostegui.json
* 05:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P18425 and previous config saved to /var/cache/conftool/dbconfig/20220110-055915-marostegui.json
* 05:58 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1013.eqiad.wmnet with OS bullseye
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18424 and previous config saved to /var/cache/conftool/dbconfig/20220110-054410-marostegui.json
* 05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18423 and previous config saved to /var/cache/conftool/dbconfig/20220110-054100-marostegui.json
* 05:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
* 05:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
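
Many of the entries above come in START/END pairs for the same cookbook run. As a rough illustration only, the following sketch pairs such entries and reports how long each run took. It is not part of any Wikimedia tooling: the function names, the regular expressions, and the assumption that every entry has the shape "* HH:MM actor: message" are all inferred from the lines in this log and are hypothetical. Times are only minute-precise and are assumed to fall on the same day.

```python
import re
from datetime import datetime

# Assumed entry shape, inferred from the log lines above: "* HH:MM actor: message"
ENTRY_RE = re.compile(r"^\* (?P<time>\d{2}:\d{2}) (?P<actor>[^:\s]+): (?P<message>.*)$")
# Assumed cookbook message shape: "START - Cookbook NAME ..." or "END (STATUS) - Cookbook NAME ..."
COOKBOOK_RE = re.compile(
    r"^(?P<phase>START|END \((?P<status>\w+)\)) - Cookbook (?P<name>\S+)(?P<rest>.*)$"
)
# END lines carry an extra "(exit_code=N)" that START lines lack; strip it before pairing.
EXIT_RE = re.compile(r"\s*\(exit_code=\d+\)")


def parse_entry(line):
    """Return (time, actor, message) for a log bullet, or None if it does not match."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    return m.group("time"), m.group("actor"), m.group("message")


def cookbook_durations(lines):
    """Pair START/END cookbook entries. Lines are newest-first, as in the log."""
    open_runs = {}
    results = []
    for line in reversed(lines):  # walk oldest-first so START precedes END
        entry = parse_entry(line)
        if entry is None:
            continue
        time_str, actor, message = entry
        m = COOKBOOK_RE.match(message)
        if m is None:
            continue
        target = EXIT_RE.sub("", m.group("rest")).strip()
        key = (actor, m.group("name"), target)
        stamp = datetime.strptime(time_str, "%H:%M")
        if m.group("phase") == "START":
            open_runs[key] = stamp
        elif key in open_runs:
            minutes = int((stamp - open_runs.pop(key)).total_seconds() // 60)
            results.append((m.group("name"), target, m.group("status"), minutes))
    return results


# Two entries copied from the log above.
SAMPLE = [
    "* 06:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1014.eqiad.wmnet with OS bullseye",
    "* 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1014.eqiad.wmnet with OS bullseye",
]

if __name__ == "__main__":
    for name, target, status, minutes in cookbook_durations(SAMPLE):
        print(f"{name} {target}: {status} after about {minutes} min")
```

Runs are keyed on actor, cookbook name, and the trailing target text, so two overlapping runs of the same cookbook against different hosts stay separate. An END entry without a matching START in the input is skipped.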


== 2020-10-26 ==
== 2022-01-08 ==
* 23:12 catrope@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/GrowthExperiments/: Fix JS error when no topics set ([[phab:T266501|T266501]]) (duration: 01m 00s)
* 10:51 elukey: restart hive daemons on an-coord1002 (after my last upgrade/rollback of packages the prometheus agent settings were not picked up, so no metrics)
* 22:30 mutante: netflow5001 - systemctl reset-failed
* 21:44 rzl: live test of sre.switchdc.mediawiki complete, the foregoing logging noise had no actual production impact
* 21:43 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0)
* 21:43 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-update-tendril
* 21:43 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
* 21:41 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 21:41 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0)
* 21:41 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 21:40 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-run-puppet-on-db-masters (exit_code=0)
* 21:37 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-run-puppet-on-db-masters
* 21:37 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
* 21:37 rzl@cumin1001: [DRY-RUN] MediaWiki read-only period ends at: 2020-10-26 21:37:17.809596
* 21:37 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 21:37 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
* 21:37 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
* 21:37 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0)
* 21:36 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions
* 21:36 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0)
* 21:36 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki
* 21:36 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0)
* 21:35 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
* 21:35 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0)
* 21:35 rzl@cumin1001: [DRY-RUN] MediaWiki read-only period starts at: 2020-10-26 21:35:20.837214
* 21:35 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
* 21:34 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
* 21:34 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
* 21:34 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 21:33 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 21:32 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
* 21:32 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 21:31 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
* 21:31 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
* 21:31 rzl: starting a live test of sre.switchdc.mediawiki, which will create some logging noise but no actual production impact
* 20:54 mutante: scandium rm /usr/local/bin/update_parsoid.sh (gerrit:636494)
* 20:15 ladsgroup@deploy1001: Finished deploy [ores/deploy@6912889]: Deploy new version of articlequality for wikidata ([[phab:T261326|T261326]]) (duration: 06m 53s)
* 20:08 ladsgroup@deploy1001: Started deploy [ores/deploy@6912889]: Deploy new version of articlequality for wikidata ([[phab:T261326|T261326]])
* 19:31 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 19:29 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 19:26 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 18:59 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Remove variant setting override (no-op) ([[phab:T265556|T265556]]) (duration: 00m 57s)
* 18:55 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure $wgBabelCategoryNames on ndswiki ([[phab:T264990|T264990]]) (duration: 00m 58s)
* 18:51 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add www.legislation.gov.uk to $wgCopyUploadsDomains on commonswiki ([[phab:T265690|T265690]]) (duration: 00m 58s)
* 18:47 catrope@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/GrowthExperiments/: Make variant D the default, remove variant A ([[phab:T265372|T265372]], [[phab:T265556|T265556]]) (duration: 00m 58s)
* 18:46 catrope@deploy1001: Synchronized php-1.36.0-wmf.14/vendor/wikimedia/parsoid/: Bump wikimedia/parsoid to v0.13.0-a13, enabling 6-element DSRs ([[phab:T266285|T266285]]) (duration: 00m 58s)
* 18:43 catrope@deploy1001: Synchronized php-1.36.0-wmf.14/skins/Vector/: Fix logic in collapsibleTabs code ([[phab:T71729|T71729]]) (duration: 00m 58s)
* 18:21 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove wtp2001-wtp2020 from LinterSubmitterWhitelist ([[phab:T265558|T265558]]) (duration: 00m 59s)
* 18:10 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Make variant D the default on all wikis ([[phab:T265556|T265556]]) (duration: 00m 58s)
* 17:58 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'kube-system' for release 'eventrouter' .
* 17:48 mutante: an-worker109* - systemctl reset-failed to clear Icinga alerts related to wmf_auto_restart changes
* 17:45 mutante: releases2002, netmon2001, various other hosts - systemctl reset-failed to clear Icinga alerts related to wmf_auto_restart changes
* 17:39 krinkle@deploy1001: Synchronized php-1.36.0-wmf.13/resources/src/mediawiki.util/: [[phab:T265809|T265809]], {{Gerrit|I1011f63ae61f5a6}} (duration: 01m 00s)
* 16:41 XioNoX: bounce security log on pfw3-eqiad - [[phab:T263833|T263833]]
* 16:29 XioNoX: set security-log traceoptions on pfw3-eqiad - [[phab:T263833|T263833]]
* 16:14 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:08 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:07 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:00 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:51 rzl@cumin1001: conftool action : set/ttl=300; selector: dnsdisc=apertium{{!}}api-gateway{{!}}citoid{{!}}cxserver{{!}}echostore{{!}}eventgate-analytics{{!}}eventgate-analytics-external{{!}}eventgate-logging-external{{!}}eventgate-main{{!}}eventstreams{{!}}graphoid{{!}}kartotherian{{!}}mathoid{{!}}mobileapps{{!}}ores{{!}}parsoid{{!}}proton{{!}}push-notifications{{!}}recommendation-api{{!}}restbase{{!}}restbase-async{{!}}schema{{!}}search{{!}}sessionstore{{!}}termbox{{!}}wdqs{{!}}wdqs-internal{{!}}wikifeeds{{!}}zotero,name=eqiad
* 15:35 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=zotero,name=eqiad
* 15:32 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=wikifeeds,name=eqiad
* 15:29 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal,name=eqiad
* 15:26 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=wdqs,name=eqiad
* 15:23 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=termbox,name=eqiad
* 15:20 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 15:17 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 15:14 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=schema,name=eqiad
* 15:11 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=eqiad
* 15:08 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=restbase,name=eqiad
* 15:05 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
* 15:02 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=push-notifications,name=eqiad
* 14:59 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=proton,name=eqiad
* 14:56 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=parsoid,name=eqiad
* 14:53 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=ores,name=eqiad
* 14:50 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=mobileapps,name=eqiad
* 14:47 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=mathoid,name=eqiad
* 14:46 ppchelko@deploy1001: Finished deploy [restbase/deploy@a1a1bd7]: Add api-portal and snmwiki (duration: 16m 43s)
* 14:44 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=eqiad
* 14:41 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=graphoid,name=eqiad
* 14:38 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=eventstreams,name=eqiad
* 14:35 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=eventgate-main,name=eqiad
* 14:32 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=eventgate-logging-external,name=eqiad
* 14:30 ppchelko@deploy1001: Started deploy [restbase/deploy@a1a1bd7]: Add api-portal and snmwiki
* 14:29 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=eventgate-analytics-external,name=eqiad
* 14:26 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=eventgate-analytics,name=eqiad
* 14:23 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=echostore,name=eqiad
* 14:20 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=cxserver,name=eqiad
* 14:17 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=citoid,name=eqiad
* 14:14 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=api-gateway,name=eqiad
* 14:11 rzl@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=apertium,name=eqiad
* 14:06 rzl@cumin1001: conftool action : set/ttl=10; selector: dnsdisc=apertium{{!}}api-gateway{{!}}citoid{{!}}cxserver{{!}}echostore{{!}}eventgate-analytics{{!}}eventgate-analytics-external{{!}}eventgate-logging-external{{!}}eventgate-main{{!}}eventstreams{{!}}graphoid{{!}}kartotherian{{!}}mathoid{{!}}mobileapps{{!}}ores{{!}}parsoid{{!}}proton{{!}}push-notifications{{!}}recommendation-api{{!}}restbase{{!}}restbase-async{{!}}schema{{!}}search{{!}}sessionstore{{!}}termbox{{!}}wdqs{{!}}wdqs-internal{{!}}wikifeeds{{!}}zotero,name=eqiad
* 13:48 moritzm: imported cas 6.2.4-1 to apt.wikimedia.org [[phab:T265857|T265857]]
* 13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 13:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 11:52 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bff6b37a55fe8f260fe00cbb942c53101167fb07}}: Add foto.digitalarkivet.no to wgCopyUploadsDomains whitelist of Wikimedia Commons ([[phab:T266390|T266390]]) (duration: 01m 14s)
* 11:27 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99)
* 11:27 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 11:26 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99)
* 11:26 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 11:11 vgutierrez: upgrade trafficserver to 8.0.8-1wm3 on cp4032 - [[phab:T265911|T265911]]
* 11:02 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99)
* 11:02 elukey@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers
* 10:51 vgutierrez: manually reloading nginx on cloudelastic[1005-1006]
* 10:29 vgutierrez: upload trafficserver 8.0.8-1wm3 to apt.wm.org (buster) - [[phab:T265911|T265911]]
* 10:18 godog: roll restart pybal to apply latest configuration
* 09:51 jayme: published docker-registry.discovery.wmnet/eventrouter:0.3.0-3
* 09:31 moritzm: restarting PHP FPM on mw canaries to pick up freetype update
* 09:04 godog: swift codfw-prod: bump object weight for ms-be2057 - [[phab:T261633|T261633]]
* 08:58 moritzm: installing freetype security updates for stretch
* 08:57 XioNoX: remove down sessions to AS38758
* 08:51 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:51 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 08:43 XioNoX: remove down sessions to AS8560
* 08:41 XioNoX: remove down sessions to AS31334
* 08:28 XioNoX: remove down sessions to AS6327
* 08:27 XioNoX: remove down sessions to AS8674
* 08:25 XioNoX: remove down sessions to AS24429
* 08:21 XioNoX: remove down sessions to AS16509
* 06:59 _joe_: rolling restart of php7.2-fpm on the codfw jobrunners, to reduce the number of dangling transcodes after restarting cp-jobqueue for a deploy
* 06:59 oblivian@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 06:16 oblivian@cumin2001: conftool action : set/pooled=no; selector: cluster=jobrunner,dc=codfw,name=mw224.*
* 06:15 oblivian@cumin2001: conftool action : set/pooled=no; selector: cluster=videoscaler,dc=codfw,name=mw228.*
* 06:10 marostegui: Warm up tables [[phab:T261914|T261914]]


== 2020-10-25 ==
== 2022-01-07 ==
* 15:53 dwisehaupt: kernel upgrade and reboot for frdb1003
* 22:07 eileen: config revision changed from {{Gerrit|3df415c1}} to {{Gerrit|ecf09aa0}} - disable eoy email jobs
* 15:50 dwisehaupt: kernel upgrade and reboot for fran1001
* 20:08 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/<nowiki>{</nowiki>zhwikinews,zhwikinews-1.5x,zhwikinews-2x,zhwikinews-hans,zhwikinews-hans-1.5x,zhwikinews-hans-2x<nowiki>}</nowiki>.png via purgeList.php
* 19:49 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host apifeatureusage2001.codfw.wmnet
* 19:41 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host apifeatureusage1001.eqiad.wmnet
* 19:36 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS bullseye
* 19:21 herron@cumin1001: START - Cookbook sre.ganeti.makevm for new host apifeatureusage2001.codfw.wmnet
* 19:18 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS bullseye
* 19:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2051.codfw.wmnet with OS bullseye
* 19:11 herron@cumin1001: START - Cookbook sre.ganeti.makevm for new host apifeatureusage1001.eqiad.wmnet
* 18:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2051.codfw.wmnet with OS bullseye
* 15:18 taavi: reset email address for Ollie Shotton developer account per [[phab:T298779|T298779]]
* 15:08 ottomata: creating mediainfo-streaming-updater.mutation topics on kafka main-eqiad and main-codfw and setting retention to 30 days - [[phab:T296470|T296470]]
* 14:05 ema: upgrade varnish on deployment-cache-text06 to 6.0.9 [[phab:T298758|T298758]]
* 12:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:14 taavi@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/ProofreadPage/modules/page: Backport: [[gerrit:751843{{!}}Makes sure $imgContHorizontal is always initialized (T298694)]] (duration: 00m 59s)
* 11:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:56 taavi@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/Flow: Backport: [[gerrit:752014{{!}}Revert "Use strict equality when safe to do so" (T298760)]] (duration: 01m 00s)
* 11:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:40 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 10:33 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18413 and previous config saved to /var/cache/conftool/dbconfig/20220107-072742-marostegui.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P18412 and previous config saved to /var/cache/conftool/dbconfig/20220107-071237-marostegui.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P18411 and previous config saved to /var/cache/conftool/dbconfig/20220107-065733-marostegui.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18410 and previous config saved to /var/cache/conftool/dbconfig/20220107-064228-marostegui.json
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18409 and previous config saved to /var/cache/conftool/dbconfig/20220107-064119-marostegui.json
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 06:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2089.codfw.wmnet with reason: Maintenance
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2089.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2076,2095].codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db[2076,2095].codfw.wmnet with reason: Maintenance
* 05:47 marostegui: rename wikishared.wikimedia_editor_tasks_targets_passed on db1120 [[phab:T264225|T264225]]
* 00:23 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:752036{{!}}viwiktionary: add namespaces "Appendix" and "Appendix talk" (T298289)]] (duration: 00m 59s)
* 00:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn


== 2020-10-23 ==
== 2022-01-06 ==
* 22:56 mutante: added Nuria to "nda" LDAP group - leaving her in "wmf" until the actual last day - shell account remains so no puppet change needed in ldap_only_admins ([[phab:T266086|T266086]])
* 23:52 jhathaway: bouncing blazegraph on wdqs1004
* 15:42 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:23 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-test-coord1002.eqiad.wmnet with OS buster
* 15:37 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 22:55 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host an-test-coord1002.eqiad.wmnet with OS buster
* 13:04 ema: rolling thumbor-instances restart to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/636012/ [[phab:T266155|T266155]]
* 22:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:47 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'kube-system' for release 'eventrouter' .
* 22:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:57 kormat: uploaded orchestrator v3.2.3 to apt.wikimedia.org buster-wikimedia - [[phab:T266023|T266023]] (forgot to log this earlier)
* 22:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:56 volans: uploaded python3-wmflib_0.0.3 to apt.wikimedia.org buster-wikimedia - [[phab:T257905|T257905]]
* 22:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:09 jayme: published docker-registry.discovery.wmnet/eventrouter:0.3.0-2
* 22:25 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.16  refs [[phab:T293958|T293958]]
* 09:51 moritzm: masking slapd on the old Stretch replicas to uncover potential direct access outside of the LVSes  [[phab:T264388|T264388]]
* 22:14 twentyafterfour@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/Scribunto/: sync Scribunto to deploy https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Scribunto/+/752006/ (duration: 01m 08s)
* 09:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:47 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 22:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:47 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:32 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 20:23 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@3297991]: update rdf-spark-tools jar to 0.3.98 (duration: 02m 15s)
* 09:31 jayme: published docker-registry.discovery.wmnet/eventrouter:0.3.0-1
* 20:21 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@3297991]: update rdf-spark-tools jar to 0.3.98
* 09:26 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 20:19 inflatador: banned elastic2051 from both chi and omega search clusters - [[phab:T298674|T298674]]
* 09:23 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:09 volans: upgrading spicerack to 0.0.44 on cumin hosts - [[phab:T257905|T257905]]
* 20:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:01 twentyafterfour@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Sync https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/751841 (duration: 01m 08s)
* 20:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:59 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@63c162d]: generate entity revision maps for commons / wcqs (duration: 02m 07s)
* 19:57 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@63c162d]: generate entity revision maps for commons / wcqs
* 19:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:07 taavi: UTC evening deploys done
* 19:05 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751538{{!}}Add data.nhm.ac.uk to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T298451)]] (duration: 01m 09s)
* 19:02 razzi: systemctl restart haproxy on dbproxy1018 to repool clouddb1018 for [[phab:T298505|T298505]]
* 18:59 mutante: puppetmaster1001 - creating missing Icinga contact for jgleeson in private puppet repo [[phab:T298649|T298649]]
* 18:51 mutante: contint1001 - after contint2001 also re-enabled puppet and deployed 751816 zuul-merger refactor - service git-daemon refreshed and running
* 18:50 razzi: run sudo maintain-views --databases centralauth --replace-all on clouddb1018 for [[phab:T298505|T298505]]
* 18:47 mutante: contint* - deploying zuul-merger puppet refactor change, first codfw-only
* 18:00 btullis@deploy1002: Finished deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production (duration: 00m 09s)
* 18:00 btullis@deploy1002: Started deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production
* 17:45 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@6f5caf9]: allow for null columns in export to relforge (duration: 02m 11s)
* 17:42 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@6f5caf9]: allow for null columns in export to relforge
* 16:42 otto@deploy1002: Finished deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production (duration: 00m 34s)
* 16:41 otto@deploy1002: Started deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production
* 16:37 inflatador: restarting elastic2052 for configuration change - [[phab:T298674|T298674]]
* 16:33 taavi: reset wikitech email for User:Iniquity per [[phab:T298683|T298683]]
* 16:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:21 taavi@deploy1002: Synchronized wmf-config/wikitech.php: wikitech: Re-enable Phabricator and Gerrit users after unblock (duration: 01m 09s)
* 16:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:19 btullis@deploy1002: Finished deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production (duration: 00m 41s)
* 16:18 btullis@deploy1002: Started deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production
* 16:18 btullis@deploy1002: Finished deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production (duration: 07m 16s)
* 16:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:10 btullis@deploy1002: Started deploy [cassandra/logstash-logback-encoder@fb10de1] (aqs): Deploying logstash-logback-encoder to production
* 16:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 16:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 16:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:09 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1004.eqiad.wmnet
* 15:00 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host stat1004.eqiad.wmnet
* 13:51 jbond: deploy cfssl_1.6.1-0+deb9u1_amd64 to stretch systems
* 09:57 hashar: Restarting zuul-merger on contint2001 and contint1001 {{!}} https://gerrit.wikimedia.org/r/c/operations/puppet/+/738370/ {{!}} [[phab:T187897|T187897]]
* 07:06 Amir1: revoke DROP from wikiadmin globally
* 02:34 eileen: civicrm revision changed from {{Gerrit|67264062}} to {{Gerrit|3d334f30}}
* 00:32 dancy@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:751530{{!}}Change the Traditional Chinese and Simplified Chinese logo for zhwikinews (T298550)]] (duration: 01m 07s)
* 00:30 dancy@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:751530{{!}}Change the Traditional Chinese and Simplified Chinese logo for zhwikinews (T298550)]] (duration: 01m 07s)
* 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn


== 2020-10-22 ==
== 2022-01-05 ==
* 22:42 mutante: ganeti1001 - adding 2 more vcpus to VM testreduce1001 - [[phab:T257940|T257940]]
* 23:50 razzi: sudo systemctl reload haproxy on dbproxy1019 to repool clouddb1014 for [[phab:T298505|T298505]]
* 22:03 mutante: deploy1002 - armed keyholder, all deployment keys loaded [[phab:T265963|T265963]]
* 23:26 razzi: run sudo maintain-views --databases centralauth --debug --replace-all on clouddb1014 for [[phab:T298505|T298505]]
* 21:56 mutante: deploy1002 - scap pull  and added to mediawiki-installation "dsh" group - will be part of scap trains but just like any appserver ([[phab:T265963|T265963]])
* 22:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:36 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:13 mutante: deploy1002 currently cloning ALL the deployment repos - new setup
* 22:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:57 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:57 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:56 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:56 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:54 mutante: applying deployment_server role to new server deploy1002 - might show up in monitoring but is not prod yet, deploy1001 still is
* 21:25 eileen: civicrm revision {{Gerrit|32d7370a}} -> {{Gerrit|67264062}}
* 18:34 mutante: adding mcrouter cert for deploy1002.eqiad.wmnet [[phab:T265963|T265963]]
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:12 dpifke@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Expand  to group1 ([[phab:T123582|T123582]]) (duration: 00m 56s)
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:12 volans: cumin 'A:dns-rec' 'rec_control wipe-cache wikimedia.org$' - [[phab:T258729|T258729]]
* 20:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 18:07 chaomodus: Updating eqiad public network DNS to automation
* 20:39 razzi: reload haproxy on dbproxy1019 to repool clouddb1014 for [[phab:T298505|T298505]]
* 17:50 volans: cumin 'A:dns-rec' 'rec_control wipe-cache eqiad.wmnet$' - [[phab:T258729|T258729]]
* 20:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:49 elukey: add thirdparty/bigtop14 to buster-wikimedia
* 20:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:46 chaomodus: Updating eqiad private network DNS to automation
* 20:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:21 bd808@cumin1001: END (PASS) - Cookbook wmcs.wikireplicas.add_wiki (exit_code=0)
* 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:21 bd808@cumin1001: Added views for new wiki: smnwiki [[phab:T264900|T264900]]
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:07 bd808@cumin1001: START - Cookbook wmcs.wikireplicas.add_wiki
* 20:25 twentyafterfour@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.16 refs [[phab:T293957|T293957]] (duration: 01m 07s)
* 16:46 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:23 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.16 refs [[phab:T293957|T293957]]
* 16:42 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 20:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:56 moritzm: installing remaining mariadb-10.3 updates for buster (as packaged in Debian, not the wmf-mariadb package)
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:33 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:13 andrewbogott: upgrading mariadb on cloudcontrol1003, 1004, 1005
* 20:14 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.16 refs [[phab:T293957|T293957]]
* 14:05 ottomata: bump camus version to wmf12 for all camus jobs. should be no-op now. - [[phab:T251609|T251609]]
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:00 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams: Enable canary events for all eventgate-analytics-external bound streams - [[phab:T251609|T251609]] (duration: 01m 02s)
* 20:11 twentyafterfour@deploy1002: Synchronized php-1.38.0-wmf.16/includes/changetags/ChangeTags.php: unblock the train, refs [[phab:T293957|T293957]] (duration: 01m 09s)
* 13:55 moritzm: depooling ldap-eqiad-replica01/ldap-eqiad-replica02 [[phab:T264388|T264388]]
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:41 moritzm: pooling ldap-replica1001/1002 [[phab:T264388|T264388]]
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:10 moritzm: depooling ldap-replica2001/2002 [[phab:T264388|T264388]]
* 19:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:04 liw@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.14
* 19:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:01 moritzm: pooling ldap-replica2004 [[phab:T264388|T264388]]
* 19:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:24 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams: Enable canary events for 3 eventgate-analytics bound streams - [[phab:T251609|T251609]] (duration: 01m 05s)
* 19:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|52ad2d4df1164dced684231c12aa64bd028b8ac9}}: Do not log logins at loginwiki via CU ([[phab:T253802|T253802]]) (duration: 01m 06s)
* 19:47 urbanecm@deploy1002: Finished scap: {{Gerrit|485e72bada5243755daab981f5a9ecd35e5b134e}}: Add it namespace aliases in scn ([[phab:T297844|T297844]]) (duration: 11m 40s)
* 12:03 Urbanecm: [urbanecm@deploy1001 /srv/mediawiki-staging (master * u=)]$ sudo /usr/local/sbin/fix-staging-perms
* 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:59 Lucas_WMDE: EU backport&config window done
* 19:41 razzi: reload haproxy on dbproxy1019 (previously incorrectly reloaded dbproxy1018) for [[phab:T298505|T298505]]
* 11:58 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:635762{{!}}Enable propagatePageDeletion on Test Wikidata]], 2/2 (duration: 01m 04s)
* 19:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:57 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:635762{{!}}Enable propagatePageDeletion on Test Wikidata]], 1/2 (duration: 01m 02s)
* 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:54 Urbanecm: Start of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` in a tmux session updateVarDumps at mwmaint2001 (wiki=huwiki; [[phab:T246539|T246539]])
* 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:39 moritzm: restarting nginx on acmechief*, debmonitor*, schema*, puppetdb* to pick up freetype update
* 19:35 urbanecm@deploy1002: Started scap: {{Gerrit|485e72bada5243755daab981f5a9ecd35e5b134e}}: Add it namespace aliases in scn ([[phab:T297844|T297844]])
* 11:38 marostegui: Compare s1-s8 tables - [[phab:T261914|T261914]]
* 19:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:33 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:34 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f2da5befc75b4f93ca4a11393a533b7dc97316ef}}: Deploy sticky header ([[phab:T295976|T295976]]) (duration: 01m 42s)
* 11:31 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InterwikiSortOrders.php: Config: [[gerrit:635813{{!}}Add ary, avk, awa, lld, shy and smn to InterwikiSortOrders.php]] (duration: 01m 08s)
* 19:31 razzi: reload haproxy on dbproxy1018 for [[phab:T298505|T298505]]
* 11:31 aborrero@cumin2001: START - Cookbook sre.hosts.downtime
* 19:27 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.13/skins/Vector/resources/skins.vector.es6/stickyHeader.js: {{Gerrit|f6424f32611bce8d9e95c369c28e2f787e2cdf75}}: Dont use ts-ignore. It is hiding real errors ([[phab:T297119|T297119]]) (duration: 01m 08s)
* 11:25 moritzm: restarting apache and smokeping* on netmon* to pick up freetype update
* 19:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 11:21 moritzm: correction: installing freetype security updates for buster (stretch TBD)
* 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:43 moritzm: installing freetype security updates for stretch/buster
* 19:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 10:33 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 10:27 volans@cumin1001: START - Cookbook sre.dns.netbox
* 19:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|aff4ac32f37d21ac0b70c62adc54756eb1e2d2b0}}: Add www.artsobservasjoner.no to the wgCopyUploadsDomains allowlist of Commons ([[phab:T298449|T298449]]) (duration: 01m 08s)
* 09:38 arturo: merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/634050 change to network data yaml
* 19:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:31 kormat: enabling replication from eqiad to codfw [[phab:T261914|T261914]]
* 19:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:23 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:23 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 19:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:52 godog: swift codfw-prod: bump object weight for ms-be2057 - [[phab:T261633|T261633]]
* 18:47 jgleeson: localsettings changed from {{Gerrit|2d371ed1}} to {{Gerrit|3df415c1}}
* 03:37 eileen: civicrm revision changed from {{Gerrit|4dce7bf535}} to {{Gerrit|bb7c08bf6d}}, config revision is {{Gerrit|9a522d03dd}}
* 18:22 bd808: Toolhub: ran `poetry run ./manage.py migrate` against m5-master
* 03:13 eileen: civicrm revision changed from {{Gerrit|3c3dcf80ae}} to {{Gerrit|4dce7bf535}}, config revision is {{Gerrit|9a522d03dd}}
* 18:18 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: sync on main
* 01:12 ryankemper@deploy1001: Finished deploy [wdqs/wdqs@870829c]: 0.3.52 (duration: 09m 07s)
* 18:16 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply on main
* 01:04 ryankemper: Tests passing on canary `wdqs1003`, proceeding with wdqs deploy for rest of fleet
* 18:07 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync on main
* 01:03 ryankemper@deploy1001: Started deploy [wdqs/wdqs@870829c]: 0.3.52
* 18:06 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply on main
* 18:04 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync on main
* 18:03 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply on main
* 18:03 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync on main
* 18:02 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply on main
* 18:02 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply on main
* 17:57 btullis@deploy1002: Finished deploy [analytics/superset/deploy@09094de]: Deployment of Superset 1.3.2 to production (duration: 00m 29s)
* 17:57 btullis@deploy1002: Started deploy [analytics/superset/deploy@09094de]: Deployment of Superset 1.3.2 to production
* 17:55 andrew@deploy1002: Finished deploy [horizon/deploy@b300fa6]: minor code format update (duration: 04m 09s)
* 17:53 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: sync on main
* 17:51 andrew@deploy1002: Started deploy [horizon/deploy@b300fa6]: minor code format update
* 17:50 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply on main
* 17:48 btullis@deploy1002: Finished deploy [analytics/superset/deploy@09094de]: Deployment of Superset 1.3.2 to staging (duration: 00m 39s)
* 17:47 btullis@deploy1002: Started deploy [analytics/superset/deploy@09094de]: Deployment of Superset 1.3.2 to staging
* 17:46 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: sync on main
* 17:46 btullis@deploy1002: Finished deploy [analytics/superset/deploy@09094de]: Deployment of Superset 1.3.2 to staging (duration: 03m 11s)
* 17:42 btullis@deploy1002: Started deploy [analytics/superset/deploy@09094de]: Deployment of Superset 1.3.2 to staging
* 17:42 btullis@deploy1002: Started deploy [analytics/superset/deploy@09094de]: Deployment for something important
* 17:36 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply on main
* 17:26 andrew@deploy1002: Finished deploy [horizon/deploy@15efe04]: sudo panel update (duration: 04m 00s)
* 17:21 andrew@deploy1002: Started deploy [horizon/deploy@15efe04]: sudo panel update
* 17:21 andrew@deploy1002: Finished deploy [horizon/deploy@15efe04]: sudo panel update (codfw1dev) (duration: 01m 54s)
* 17:19 andrew@deploy1002: Started deploy [horizon/deploy@15efe04]: sudo panel update (codfw1dev)
* 17:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:11 sbassett: Deployed security fix for [[phab:T298581|T298581]] to wmf.16
* 17:04 sbassett@deploy1002: Synchronized php-1.38.0-wmf.13/extensions/MobileFrontend/includes/specials/SpecialMobileContributions.php: Deploy security fix for [[phab:T298581|T298581]] (duration: 01m 08s)
* 16:51 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:51 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:38 andrew@deploy1002: Finished deploy [horizon/deploy@5e57e78]: sudo panel update (codfw1dev) (duration: 02m 08s)
* 16:36 andrew@deploy1002: Started deploy [horizon/deploy@5e57e78]: sudo panel update (codfw1dev)
* 16:27 andrew@deploy1002: Finished deploy [horizon/deploy@5e57e78]: sudo panel update (duration: 03m 53s)
* 16:26 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:26 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:23 andrew@deploy1002: Started deploy [horizon/deploy@5e57e78]: sudo panel update
* 14:54 aokoth@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:50 aokoth@cumin1001: START - Cookbook sre.dns.netbox
* 13:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2087:3316, db2087:3317 after reimage [[phab:T295965|T295965]]', diff saved to https://phabricator.wikimedia.org/P18402 and previous config saved to /var/cache/conftool/dbconfig/20220105-134827-marostegui.json
* 13:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy2001.codfw.wmnet with OS bullseye
* 13:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:38 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/TrustedXFF/: {{Gerrit|ce7113b99712ac7ce4112cff720c669f618df6eb}}: Add more Zscaler ranges ([[phab:T298241|T298241]]) (duration: 01m 09s)
* 13:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.13/extensions/TrustedXFF/: {{Gerrit|d35e36f4deb7a8e2a454769f4b2d72e45318fcc9}}: Add more Zscaler ranges ([[phab:T298241|T298241]]) (duration: 01m 09s)
* 13:33 Amir1: delete echo keys from objectchange in frwiki ([[phab:T272512|T272512]])
* 13:23 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: sync on main
* 13:22 jelto@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply on main
* 13:11 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy2001.codfw.wmnet with OS bullseye
* 13:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy2002.codfw.wmnet with OS bullseye
* 12:38 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy2002.codfw.wmnet with OS bullseye
* 12:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:18 taavi: UTC morning deploys done
* 12:16 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:749890{{!}}Add akwiki as an import source for twwiki (T298296)]] (duration: 01m 09s)
* 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:07 vgutierrez: pool cp5005 running envoyproxy as TLS terminator - [[phab:T271421|T271421]]
* 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy2003.codfw.wmnet with OS bullseye
* 11:56 jbond: rollout cfssl 1.6.1
* 11:55 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
* 11:55 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5005.eqsin.wmnet with OS buster
* 11:55 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
* 11:55 jelto@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
* 11:34 aokoth@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts kubestage1002.eqiad.wmnet
* 11:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy2003.codfw.wmnet with OS bullseye
* 11:24 btullis: updating hive packages in reprepro for log4j update
* 11:24 aokoth@cumin1001: START - Cookbook sre.hosts.decommission for hosts kubestage1002.eqiad.wmnet
* 11:20 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy2003.codfw.wmnet with OS bullseye
* 10:54 jbond: upload cfssl 1.6.1
* 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy2003.codfw.wmnet with OS bullseye
* 10:48 hashar: CI: switching MediaWiki selenium from php built-in server to Apache # https://gerrit.wikimedia.org/r/751697
* 10:40 aokoth@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:37 aokoth@cumin1001: START - Cookbook sre.dns.netbox
* 10:02 dcausse@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 10:01 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|8137ffc33d9de0f0a835223936a93e87504a7358}}: pwnwiki: Enable Growth features in dark mode ([[phab:T298115|T298115]]; 3/3) (duration: 01m 07s)
* 10:00 dcausse@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 10:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:59 urbanecm@deploy1002: Synchronized wmf-config/config/pwnwiki.yaml: {{Gerrit|8137ffc33d9de0f0a835223936a93e87504a7358}}: pwnwiki: Enable Growth features in dark mode ([[phab:T298115|T298115]]; 2/3) (duration: 01m 07s)
* 09:59 dcausse@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 09:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:58 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|8137ffc33d9de0f0a835223936a93e87504a7358}}: pwnwiki: Enable Growth features in dark mode ([[phab:T298115|T298115]]; 1/3) (duration: 01m 07s)
* 09:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:53 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.13/extensions/GrowthExperiments/includes/Mentorship/Hooks/MentorFilterHooks.php: {{Gerrit|24e15e1fd5c7feb2377974ee666c61aef8f82da5}}: MentorFilterHooks: Include only primary mentors ([[phab:T298031|T298031]]) (duration: 01m 07s)
* 09:48 aokoth@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts kubestage1001.eqiad.wmnet
* 09:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/TrustedXFF/trusted-hosts.php: {{Gerrit|ab8fe9884e3e4d1fa3bdaa1c8a9cab143b4ac565}}: Add Zscaler to list of trusted hosts for XFF ([[phab:T298241|T298241]]) (duration: 01m 08s)
* 09:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.13/extensions/TrustedXFF/trusted-hosts.php: {{Gerrit|010d96b9297825079b3ac84f247c0f80353d42a8}}: Add Zscaler to list of trusted hosts for XFF ([[phab:T298241|T298241]]) (duration: 01m 09s)
* 09:33 aokoth@cumin1001: START - Cookbook sre.hosts.decommission for hosts kubestage1001.eqiad.wmnet
* 09:29 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp5005.eqsin.wmnet with OS buster
* 09:24 vgutierrez: depool cp5005 to be reimaged as cache::upload_envoy - [[phab:T271421|T271421]]
* 08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2087.codfw.wmnet with OS bullseye
* 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2087.codfw.wmnet with OS bullseye
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2087:3316, db2087:3317 for Buster reimage [[phab:T295965|T295965]]', diff saved to https://phabricator.wikimedia.org/P18399 and previous config saved to /var/cache/conftool/dbconfig/20220105-082529-marostegui.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18398 and previous config saved to /var/cache/conftool/dbconfig/20220105-081600-marostegui.json
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P18397 and previous config saved to /var/cache/conftool/dbconfig/20220105-080055-marostegui.json
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P18396 and previous config saved to /var/cache/conftool/dbconfig/20220105-074551-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18395 and previous config saved to /var/cache/conftool/dbconfig/20220105-073046-marostegui.json
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 ([[phab:T297191|T297191]])', diff saved to https://phabricator.wikimedia.org/P18394 and previous config saved to /var/cache/conftool/dbconfig/20220105-072937-marostegui.json
* 07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 07:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
* 02:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:13 Amir1: running foreachwikiindblist all maintenance/refreshImageMetadata.php --force --verbose --mediatype=OFFICE --oldimage ([[phab:T298417|T298417]])
* 02:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:11 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.16/maintenance/refreshImageMetadata.php: Backport: [[gerrit:751526{{!}}maintenance: Add support for oldimage table metadata refresh (T298417)]] (duration: 01m 07s)
* 02:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:09 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.13/maintenance/refreshImageMetadata.php: Backport: [[gerrit:751527{{!}}maintenance: Add support for oldimage table metadata refresh (T298417)]] (duration: 01m 08s)
* 01:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:49 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:750814{{!}}Delete Tematica namespace (NS:104) in Italian Wikivoyage (T298315)]] (duration: 01m 07s)
* 01:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:43 ebernhardson@deploy1002: Synchronized static/images/mobile/copyright/wikivoyage-wordmark-bn.svg: Config: [[gerrit:749626{{!}}Update bnwikivoyage wordmark logo (T298033)]] (duration: 01m 07s)
* 01:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:41 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:749626{{!}}Update bnwikivoyage wordmark logo (T298033)]] (duration: 01m 07s)
* 01:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:12 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751485{{!}}Move CirrusSearch more_like traffic to eqiad]] (duration: 01m 07s)
* 01:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:01 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|34bf91ec2ba1408594bb77745deb6fa7d36ddf8d}}: GrowthExperiments: Add campaign pattern for JOSA ([[phab:T298057|T298057]]) (duration: 01m 08s)
* 00:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:53 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7aff17f42eb2ecad94a76c5d93ce467bd6bff39e}}: Fix wordmark svgs for strategywiki, viwikibooks ([[phab:T290091|T290091]]; 2/2) (duration: 01m 07s)
* 00:52 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/: {{Gerrit|7aff17f42eb2ecad94a76c5d93ce467bd6bff39e}}: Fix wordmark svgs for strategywiki, viwikibooks ([[phab:T290091|T290091]]; 1/2) (duration: 01m 07s)
* 00:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:49 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6c220f0bb86b0d77714ee23d662ea836897e0207}}: Enable slow-parsoid logs (duration: 01m 08s)
* 00:40 twentyafterfour@deploy1002: Synchronized php-1.38.0-wmf.16/includes/content/ContentModelChange.php: fix patch application failure (duration: 01m 07s)
* 00:37 twentyafterfour@deploy1002: Synchronized php-1.38.0-wmf.16/extensions/VisualEditor/: fix patch application failure (duration: 01m 09s)


== 2020-10-21 ==
== 2022-01-04 ==
* 23:16 catrope@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/GrowthExperiments/: [[phab:T266033|T266033]] (duration: 01m 05s)
* 22:55 twentyafterfour@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.16  refs [[phab:T293957|T293957]] (duration: 37m 56s)
* 23:14 catrope@deploy1001: Synchronized php-1.36.0-wmf.13/extensions/GrowthExperiments/: [[phab:T265751|T265751]] [[phab:T265754|T265754]] (duration: 01m 08s)
* 22:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 21:38 mutante: testreduce1001 assigned 2 more GBs of RAM - rebooting ([[phab:T257940|T257940]], [[phab:T257906|T257906]])
* 22:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:44 Amir1: end of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T264963|T264963]])
* 22:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:15 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T264963|T264963]])
* 22:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 18:13 Urbanecm: Morning B&C window done
* 22:17 twentyafterfour@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.16  refs [[phab:T293957|T293957]]
* 18:12 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|45312d359442d274e83deb7be80f86e12fb9e864}}: [WikibaseMediaInfo] Fix concept chips array nesting structure ([[phab:T256431|T256431]]) (duration: 01m 05s)
* 21:15 eileen: process-control checkout revision ({{Gerrit|e58e4e50}} -> {{Gerrit|eb83f208}})
* 18:12 mepps: updated payments-wiki-staging from {{Gerrit|db03677b2d}} to {{Gerrit|5fdd29bc16}}
* 21:02 eileen: process-control config {{Gerrit|40467fc2}} -> {{Gerrit|e58e4e50}}
* 18:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d94e33ff39b300c74fcaf08d1746c089fb1af783}}: cirrus: Hardcode more_like to codfw cirrus cluster (duration: 01m 05s)
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:56 XioNoX: configure FB PNI in eqdfw
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:43 ppchelko@deploy1001: Synchronized php-1.36.0-wmf.14/skins/WikimediaApiPortal: Backport gerrit:635329, [[phab:T266021|T266021]] (duration: 01m 06s)
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 17:34 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch ParserCache to JSON on testwiki gerrit:635382 (duration: 01m 05s)
* 20:43 eileen: config {{Gerrit|b26653a4}} -> {{Gerrit|40467fc2}} (latest)
* 17:24 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable ParserCache logger for warn+, gerrit:635071 (duration: 01m 08s)
* 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 17:21 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable ParserCache logger for warn+, gerrit:635071 (duration: 01m 06s)
* 20:34 eileen: civicrm revision {{Gerrit|aaceb4ab}} -> {{Gerrit|328c8542}}
* 17:13 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:33 twentyafterfour_: MediaWiki train for 1.38.0-wmf.16 - ran `scap prep` [[phab:T293957|T293957]]
* 17:13 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:57 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@b38fb58]: Switch mjolnir norm_query_clustering to the shaded refinery jar (duration: 02m 11s)
* 16:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:55 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@b38fb58]: Switch mjolnir norm_query_clustering to the shaded refinery jar
* 16:57 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18388 and previous config saved to /var/cache/conftool/dbconfig/20220104-160930-marostegui.json
* 16:57 mutante: scandium - disabling puppet so that Parsoid team can make some tests on testreduce1001 today
* 15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18387 and previous config saved to /var/cache/conftool/dbconfig/20220104-155425-marostegui.json
* 16:46 effie: restart php-fpm and pool mw2252 and mw2328
* 15:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18386 and previous config saved to /var/cache/conftool/dbconfig/20220104-153920-marostegui.json
* 15:58 Lucas_WMDE: Deployed patch for [[phab:T260349|T260349]]
* 15:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18384 and previous config saved to /var/cache/conftool/dbconfig/20220104-152416-marostegui.json
* 15:34 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 15:07 aokoth@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 15:33 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 14:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:31 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 14:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:28 moritzm: updating prometheus-openldap-exporter to 0+git20171128-3 to buster-wikimedia
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:23 jbond42: upgrade puppetlabs-stdlib to 6.5.0 https://gerrit.wikimedia.org/r/c/operations/puppet/+/634278
* 14:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:08 moritzm: imported prometheus-openldap-exporter 0+git20171128-3 to buster-wikimedia [[phab:T264388|T264388]]
* 14:21 oblivian@deploy1002: Synchronized docroot: Config: Make symlinks relative so they work on a local checkout too ([[phab:T285232|T285232]]) (duration: 00m 57s)
* 15:02 otto@deploy1001: Finished deploy [analytics/refinery@e4d16f0] (hadoop-test): deploying with updated camus to test cluster (duration: 02m 56s)
* 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 15:01 crusnov@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 15:00 otto@deploy1001: Started deploy [analytics/refinery@e4d16f0] (hadoop-test): deploying with updated camus to test cluster
* 14:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:56 crusnov@cumin1001: START - Cookbook sre.dns.netbox
* 14:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:44 reedy@deploy1001: Synchronized wmf-config/wikitech.php: Set CURLOPT_RETURNTRANSFER true in gerrit handler [[phab:T242554|T242554]] (duration: 01m 07s)
* 14:12 oblivian@deploy1002: Synchronized images: Config: Remove dead symlinks ([[phab:T285232|T285232]]) (duration: 00m 58s)
* 14:34 dcausse: restarting blazegraph on codfw servers ([[phab:T263952|T263952]])
* 14:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:21 moritzm: pooling ldap-replica2003 [[phab:T264388|T264388]]
* 14:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:04 liw@deploy1001: Synchronized php: group1 wikis to 1.36.0-wmf.14 (duration: 01m 04s)
* 14:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:03 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.14
* 14:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:40 matthiasmullie: EU B&C done
* 13:57 godog: bump prometheus k8s + ops space in eqiad
* 11:33 mlitn@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [WikibaseMediaInfo] Add config for related terms API (duration: 01m 04s)
* 13:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2087.codfw.wmnet with reason: Maintenance
* 11:17 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|785404fa2b998947d236aebe481ee1abcbd14220}}: Disable registrations stat on Special:TranslationStats ([[phab:T264158|T264158]]) (duration: 01m 05s)
* 13:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2087.codfw.wmnet with reason: Maintenance
* 11:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|11567427c3f7d2908b29046ee56a7b0c0da32c09}}: Enable ContentTranslation in 5 Wikipedias as a default tool ([[phab:T264737|T264737]]; [[phab:T264738|T264738]]; [[phab:T264739|T264739]]; [[phab:T264740|T264740]]; [[phab:T264741|T264741]]) (duration: 01m 30s)
* 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18382 and previous config saved to /var/cache/conftool/dbconfig/20220104-134410-marostegui.json
* 11:00 marostegui: Upgrade db2093's mariadb version [[phab:T266003|T266003]]
* 13:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1121,1155].eqiad.wmnet with reason: Maintenance
* 10:58 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1121,1155].eqiad.wmnet with reason: Maintenance
* 10:56 aborrero@cumin2001: START - Cookbook sre.hosts.downtime
* 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18381 and previous config saved to /var/cache/conftool/dbconfig/20220104-134359-marostegui.json
* 10:38 Urbanecm: Start of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=rowiki; [[phab:T246539|T246539]])
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18380 and previous config saved to /var/cache/conftool/dbconfig/20220104-132854-marostegui.json
* 10:37 Urbanecm: End of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=srwiki; [[phab:T246539|T246539]])
* 13:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18379 and previous config saved to /var/cache/conftool/dbconfig/20220104-131349-marostegui.json
* 10:01 Urbanecm: Start of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=srwiki; [[phab:T246539|T246539]])
* 13:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 10:00 Urbanecm: End of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=nowiki; [[phab:T246539|T246539]])
* 13:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 09:59 vgutierrez: Bump ECDHE-ECDSA-AES128-SHA pageview replacement to 100% - [[phab:T258405|T258405]]
* 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18378 and previous config saved to /var/cache/conftool/dbconfig/20220104-130816-marostegui.json
* 09:42 Urbanecm: Start of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=nowiki; [[phab:T246539|T246539]])
* 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18377 and previous config saved to /var/cache/conftool/dbconfig/20220104-125845-marostegui.json
* 09:42 Urbanecm: End of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=shwiki; [[phab:T246539|T246539]])
* 12:53 taavi: UTC morning deploys done
* 09:38 Urbanecm: Start of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=shwiki; [[phab:T246539|T246539]])
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P18376 and previous config saved to /var/cache/conftool/dbconfig/20220104-125312-marostegui.json
* 09:37 Urbanecm: mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log # wiki=warwiki; [[phab:T246539|T246539]]
* 12:52 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751385{{!}}prod: WRITE_BOTH for centralauth hidden level migration (T289068)]] (duration: 00m 57s)
* 09:30 Urbanecm: End of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=viwiki; [[phab:T246539|T246539]])
* 12:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:23 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 09:22 root@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 09:21 aborrero@cumin2001: START - Cookbook sre.hosts.downtime
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:52 Urbanecm: Start of `mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log` (wiki=viwiki; [[phab:T246539|T246539]])
* 12:38 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges from s2 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18375 and previous config saved to /var/cache/conftool/dbconfig/20220104-123845-marostegui.json
* 08:50 Urbanecm: mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log # wiki=cebwiki; [[phab:T246539|T246539]]
* 12:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P18374 and previous config saved to /var/cache/conftool/dbconfig/20220104-123807-marostegui.json
* 08:46 Urbanecm: [urbanecm@mwmaint2001 ~/updateVarDumps/output/group2-medium/output]$ mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=apiportalwiki # [[phab:T246539|T246539]]
* 12:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:38 root@cumin1001: START - Cookbook sre.ganeti.makevm
* 12:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:38 root@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 12:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:38 root@cumin1001: START - Cookbook sre.ganeti.makevm
* 12:34 taavi@deploy1002: Synchronized php-1.38.0-wmf.13/extensions/LdapAuthentication/includes/LdapAuthenticationPlugin.php: Backport: [[gerrit:751192{{!}}Include ldap errno on account creation debug logs (T298508)]] (duration: 00m 58s)
* 08:33 godog: swift codfw-prod: bump object weight for ms-be2057 - [[phab:T261633|T261633]]
* 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 08:10 XioNoX: Upgrade Routinator 3000 to 0.8.0 on rpki1001 - [[phab:T266001|T266001]]
* 12:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 08:09 XioNoX: add Routinator 3000 0.8.0 to apt - [[phab:T266001|T266001]]
* 12:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:58 elukey: update analytics-in4 filter on cr1/cr2-eqiad for https://gerrit.wikimedia.org/r/635319
* 12:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 04:35 ryankemper: re-enabled icinga notifications on all wdqs hosts now that `wdqs-updater` is healthy
* 12:26 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18373 and previous config saved to /var/cache/conftool/dbconfig/20220104-122302-marostegui.json
* 12:22 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:749244{{!}}Create autopatroller and patroller groups on bnwiktionary (T298187)]] (duration: 00m 57s)
* 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18372 and previous config saved to /var/cache/conftool/dbconfig/20220104-121643-marostegui.json
* 12:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 12:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 12:15 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751168{{!}}Make reply tool available as opt-out on specieswiki (T297535)]] (duration: 00m 57s)
* 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:13 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751167{{!}}Make reply tool available as opt-out on metawiki (T297534)]] (duration: 00m 59s)
* 12:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 12:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1106,1154].eqiad.wmnet with reason: Maintenance
* 12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1106,1154].eqiad.wmnet with reason: Maintenance
* 12:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 15 hosts with reason: Maintenance
* 12:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on 15 hosts with reason: Maintenance
* 11:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 11:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 11:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18370 and previous config saved to /var/cache/conftool/dbconfig/20220104-114503-marostegui.json
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P18369 and previous config saved to /var/cache/conftool/dbconfig/20220104-112959-marostegui.json
* 11:20 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:20 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P18368 and previous config saved to /var/cache/conftool/dbconfig/20220104-111454-marostegui.json
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18367 and previous config saved to /var/cache/conftool/dbconfig/20220104-105949-marostegui.json
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18366 and previous config saved to /var/cache/conftool/dbconfig/20220104-105922-marostegui.json
* 10:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 10:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18365 and previous config saved to /var/cache/conftool/dbconfig/20220104-105914-marostegui.json
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18364 and previous config saved to /var/cache/conftool/dbconfig/20220104-105244-marostegui.json
* 10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 10:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 10:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18362 and previous config saved to /var/cache/conftool/dbconfig/20220104-104410-marostegui.json
* 10:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 10:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 10:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 10:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
* 10:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 10:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1128.eqiad.wmnet with reason: Maintenance
* 10:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 10:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18360 and previous config saved to /var/cache/conftool/dbconfig/20220104-102905-marostegui.json
* 10:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 10:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
* 10:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 10:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18359 and previous config saved to /var/cache/conftool/dbconfig/20220104-101400-marostegui.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18358 and previous config saved to /var/cache/conftool/dbconfig/20220104-094920-marostegui.json
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P18357 and previous config saved to /var/cache/conftool/dbconfig/20220104-093415-marostegui.json
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P18356 and previous config saved to /var/cache/conftool/dbconfig/20220104-091910-marostegui.json
* 09:04 dcaro: start merging puppet cleanup patches
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18355 and previous config saved to /var/cache/conftool/dbconfig/20220104-090406-marostegui.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18354 and previous config saved to /var/cache/conftool/dbconfig/20220104-085127-marostegui.json
* 08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 08:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18353 and previous config saved to /var/cache/conftool/dbconfig/20220104-085118-marostegui.json
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P18352 and previous config saved to /var/cache/conftool/dbconfig/20220104-083613-marostegui.json
* 08:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2094.codfw.wmnet with OS bullseye
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18351 and previous config saved to /var/cache/conftool/dbconfig/20220104-082306-marostegui.json
* 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18350 and previous config saved to /var/cache/conftool/dbconfig/20220104-082259-marostegui.json
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P18349 and previous config saved to /var/cache/conftool/dbconfig/20220104-082109-marostegui.json
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18348 and previous config saved to /var/cache/conftool/dbconfig/20220104-080754-marostegui.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18347 and previous config saved to /var/cache/conftool/dbconfig/20220104-080604-marostegui.json
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18346 and previous config saved to /var/cache/conftool/dbconfig/20220104-080051-marostegui.json
* 08:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 08:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 07:56 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2094.codfw.wmnet with OS bullseye
* 07:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance
* 07:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18345 and previous config saved to /var/cache/conftool/dbconfig/20220104-075249-marostegui.json
* 07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 07:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 07:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 07:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 07:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
* 07:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 9 hosts with reason: Maintenance
* 07:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on 9 hosts with reason: Maintenance
* 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18344 and previous config saved to /var/cache/conftool/dbconfig/20220104-074456-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18343 and previous config saved to /var/cache/conftool/dbconfig/20220104-073745-marostegui.json
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18342 and previous config saved to /var/cache/conftool/dbconfig/20220104-072951-marostegui.json
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18341 and previous config saved to /var/cache/conftool/dbconfig/20220104-071446-marostegui.json
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18340 and previous config saved to /var/cache/conftool/dbconfig/20220104-065942-marostegui.json
* 06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 ([[phab:T298316|T298316]])', diff saved to https://phabricator.wikimedia.org/P18339 and previous config saved to /var/cache/conftool/dbconfig/20220104-063714-marostegui.json
* 06:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 06:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 06:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 06:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 06:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
* 04:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18338 and previous config saved to /var/cache/conftool/dbconfig/20220104-042116-marostegui.json
* 04:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 04:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
* 04:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18337 and previous config saved to /var/cache/conftool/dbconfig/20220104-042109-marostegui.json
* 04:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18335 and previous config saved to /var/cache/conftool/dbconfig/20220104-040604-marostegui.json
* 04:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2144.codfw.wmnet
* 04:01 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
* 03:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18334 and previous config saved to /var/cache/conftool/dbconfig/20220104-035059-marostegui.json
* 03:50 ladsgroup@cumin1001: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for db2144.codfw.wmnet
* 03:50 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
* 03:36 ladsgroup@cumin1001: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for db2144.codfw.wmnet
* 03:36 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
* 03:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18333 and previous config saved to /var/cache/conftool/dbconfig/20220104-033555-marostegui.json
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 01:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18332 and previous config saved to /var/cache/conftool/dbconfig/20220104-015125-marostegui.json
* 01:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 01:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 01:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18331 and previous config saved to /var/cache/conftool/dbconfig/20220104-012506-marostegui.json
* 01:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18330 and previous config saved to /var/cache/conftool/dbconfig/20220104-011001-marostegui.json
* 00:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18329 and previous config saved to /var/cache/conftool/dbconfig/20220104-005456-marostegui.json
* 00:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18328 and previous config saved to /var/cache/conftool/dbconfig/20220104-003951-marostegui.json
* 00:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 00:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 00:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18327 and previous config saved to /var/cache/conftool/dbconfig/20220104-000947-marostegui.json


== 2020-10-20 ==
== 2022-01-03 ==
* 22:10 dwisehaupt: frmon2001 upgraded to buster with grafana 7.2.1
* 23:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P18326 and previous config saved to /var/cache/conftool/dbconfig/20220103-235443-marostegui.json
* 21:19 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 23:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P18325 and previous config saved to /var/cache/conftool/dbconfig/20220103-233938-marostegui.json
* 21:18 cdanis: ✔️ cdanis@mw2252.codfw.wmnet ~ 🕠🍺 sudo depool
* 23:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18324 and previous config saved to /var/cache/conftool/dbconfig/20220103-232433-marostegui.json
* 20:57 mforns@deploy1001: Finished deploy [analytics/refinery@e4d16f0] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d16f08a96b6f65447fcdc6c9e8945724a89f54] (duration: 00m 08s)
* 21:50 cwhite: manually upgrade to grafana 8 on grafana-next ([[phab:T282863|T282863]])
* 20:56 mforns@deploy1001: Started deploy [analytics/refinery@e4d16f0] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d16f08a96b6f65447fcdc6c9e8945724a89f54]
* 21:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18323 and previous config saved to /var/cache/conftool/dbconfig/20220103-212216-marostegui.json
* 20:39 cdanis: doing some manual testing on mw2221, depooled and puppet disabled
* 21:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 20:33 mforns@deploy1001: Finished deploy [analytics/refinery@e4d16f0]: Regular analytics weekly train [analytics/refinery@e4d16f08a96b6f65447fcdc6c9e8945724a89f54] (duration: 08m 10s)
* 21:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
* 20:31 ryankemper: [Temporarily] disabled notifications for all wdqs hosts while we figure out how to unstick the updater process. Impact: new updates will be delayed, but queries will keep serving as normal. Fixing this is a priority, but there is no availability outage.
* 21:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18322 and previous config saved to /var/cache/conftool/dbconfig/20220103-212209-marostegui.json
* 20:29 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 21:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18321 and previous config saved to /var/cache/conftool/dbconfig/20220103-210704-marostegui.json
* 20:25 mforns@deploy1001: Started deploy [analytics/refinery@e4d16f0]: Regular analytics weekly train [analytics/refinery@e4d16f08a96b6f65447fcdc6c9e8945724a89f54]
* 20:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P18320 and previous config saved to /var/cache/conftool/dbconfig/20220103-205159-marostegui.json
* 20:19 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18319 and previous config saved to /var/cache/conftool/dbconfig/20220103-203654-marostegui.json
* 20:18 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18318 and previous config saved to /var/cache/conftool/dbconfig/20220103-185305-marostegui.json
* 20:06 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 19:59 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 19:47 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18317 and previous config saved to /var/cache/conftool/dbconfig/20220103-185257-marostegui.json
* 19:47 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 18:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P18316 and previous config saved to /var/cache/conftool/dbconfig/20220103-183752-marostegui.json
* 19:47 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18315 and previous config saved to /var/cache/conftool/dbconfig/20220103-183130-marostegui.json
* 19:45 dzahn@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=parsoid,service=canary
* 18:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 19:24 razzi@cumin1001: START - Cookbook sre.ganeti.makevm
* 18:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
* 18:58 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18314 and previous config saved to /var/cache/conftool/dbconfig/20220103-183122-marostegui.json
* 18:56 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 18:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P18313 and previous config saved to /var/cache/conftool/dbconfig/20220103-182248-marostegui.json
* 17:48 effie: depooling mw2328 - [[phab:T266052|T266052]]
* 18:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P18312 and previous config saved to /var/cache/conftool/dbconfig/20220103-181617-marostegui.json
* 17:37 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18311 and previous config saved to /var/cache/conftool/dbconfig/20220103-180743-marostegui.json
* 17:35 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P18310 and previous config saved to /var/cache/conftool/dbconfig/20220103-180112-marostegui.json
* 15:54 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@629e8bc]: search satisfaction: remove unused y/m/d cli args (duration: 01m 31s)
* 17:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18309 and previous config saved to /var/cache/conftool/dbconfig/20220103-174608-marostegui.json
* 15:52 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@629e8bc]: search satisfaction: remove unused y/m/d cli args
* 17:13 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudbackup1003.eqiad.wmnet with OS buster
* 15:15 aborrero@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:57 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1088.eqiad.wmnet with OS buster
* 15:13 aborrero@cumin2001: START - Cookbook sre.hosts.downtime
* 16:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1086.eqiad.wmnet with OS buster
* 14:58 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.13/extensions/AbuseFilter/includes/Views/AbuseFilterViewList.php: {{Gerrit|fee2d3be13ae14d7ea51ff2db42090a1c27819bf}}: Prevent uncaught warnings/exception on Special:AbuseFilter ([[phab:T265994|T265994]]) (duration: 01m 03s)
* 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18308 and previous config saved to /var/cache/conftool/dbconfig/20220103-164652-marostegui.json
* 14:56 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/AbuseFilter/includes/Views/AbuseFilterViewList.php: {{Gerrit|00ef00f59fd2a7a1366161ccc66c260be20e3e50}}: Prevent uncaught warnings/exception on Special:AbuseFilter ([[phab:T265994|T265994]]) (duration: 01m 01s)
* 16:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 14:48 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.14/extensions/FileImporter/: {{Gerrit|5eee9b773338e5181867cabec9faefbdeacf67ca}}: Set originalRequest (incl. X-Forwarded-For) for remote edits ([[phab:T265810|T265810]]) (duration: 01m 06s)
* 16:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
* 14:16 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.13/extensions/FileImporter/: {{Gerrit|5f8d3de14c116b618f5226419082d5c9a07766fb}}: Set originalRequest (incl. X-Forwarded-For) for remote edits ([[phab:T265810|T265810]]) (duration: 01m 09s)
* 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18307 and previous config saved to /var/cache/conftool/dbconfig/20220103-164645-marostegui.json
* 14:15 Urbanecm: [urbanecm@deploy1001 /srv/mediawiki-staging (master u=)]$ sudo /usr/local/sbin/fix-staging-perms
* 16:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2125 (re)pooling @ 100%: Slowly repool db2125 after checking tables ', diff saved to https://phabricator.wikimedia.org/P13033 and previous config saved to /var/cache/conftool/dbconfig/20201020-135436-root.json
* 16:43 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1003.eqiad.wmnet with OS buster
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2125 (re)pooling @ 80%: Slowly repool db2125 after checking tables ', diff saved to https://phabricator.wikimedia.org/P13032 and previous config saved to /var/cache/conftool/dbconfig/20201020-133933-root.json
* 16:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2125 (re)pooling @ 60%: Slowly repool db2125 after checking tables ', diff saved to https://phabricator.wikimedia.org/P13031 and previous config saved to /var/cache/conftool/dbconfig/20201020-132430-root.json
* 16:37 ladsgroup@cumin1001: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for db2144.codfw.wmnet
* 13:19 XioNoX: install routinator 3000 0.8.0 on rpki2001 - [[phab:T266001|T266001]]
* 16:37 ladsgroup@cumin1001: START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
* 13:16 liw@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.14
* 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P18306 and previous config saved to /var/cache/conftool/dbconfig/20220103-163140-marostegui.json
* 13:11 liw@deploy1001: Finished scap: testwikis wikis to 1.36.0-wmf.14 (duration: 58m 03s)
* 16:31 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1088.eqiad.wmnet with OS buster
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2125 (re)pooling @ 40%: Slowly repool db2125 after checking tables ', diff saved to https://phabricator.wikimedia.org/P13030 and previous config saved to /var/cache/conftool/dbconfig/20201020-130926-root.json
* 16:30 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1086.eqiad.wmnet with OS buster
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2125 (re)pooling @ 20%: Slowly repool db2125 after checking tables ', diff saved to https://phabricator.wikimedia.org/P13029 and previous config saved to /var/cache/conftool/dbconfig/20201020-125423-root.json
* 16:29 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1087.eqiad.wmnet with OS buster
* 12:25 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 16:28 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1085.eqiad.wmnet with OS buster
* 12:25 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 16:25 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1084.eqiad.wmnet with OS buster
* 12:24 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 16:22 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1088.eqiad.wmnet with OS buster
* 12:24 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 16:18 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1086.eqiad.wmnet with OS buster
* 12:16 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P18305 and previous config saved to /var/cache/conftool/dbconfig/20220103-161635-marostegui.json
* 12:16 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 16:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18304 and previous config saved to /var/cache/conftool/dbconfig/20220103-161232-marostegui.json
* 12:15 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 16:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 12:13 liw@deploy1001: Started scap: testwikis wikis to 1.36.0-wmf.14
* 16:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
* 11:37 liw: 1.36.0-wmf.14 was branched at {{Gerrit|1b7b5f716015f9303d37158820dadf759e8db707}} for [[phab:T263180|T263180]]
* 16:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18303 and previous config saved to /var/cache/conftool/dbconfig/20220103-161224-marostegui.json
* 11:35 Lucas_WMDE: EU backport/config window done
* 16:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1088.eqiad.wmnet with OS buster
* 11:35 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.36.0-wmf.13/extensions/WikimediaEvents/: Backport: [[gerrit:635030{{!}}SearchSatisfaction: Set isAnon field (T259250)]] (duration: 00m 57s)
* 16:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1087.eqiad.wmnet with OS buster
* 11:15 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:634039{{!}}Set Wikidata MF to collapse sections by default (T239195)]] (duration: 00m 56s)
* 16:04 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1086.eqiad.wmnet with OS buster
* 11:09 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:634938{{!}}Remove noratelimit from Wikidata bot group (T258354)]] (duration: 00m 56s)
* 16:04 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1085.eqiad.wmnet with OS buster
* 10:09 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 16:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18302 and previous config saved to /var/cache/conftool/dbconfig/20220103-160131-marostegui.json
* 10:09 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 16:00 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1084.eqiad.wmnet with OS buster
* 10:04 godog: swift codfw-prod: bump object weight for ms-be2057 - [[phab:T261633|T261633]]
* 15:58 vgutierrez: pool cp2029 - [[phab:T298293|T298293]]
* 09:59 dcausse: [[phab:T255399|T255399]]: resuming wdqs-data-reload manually from chunk no 776 on wdqs1009
* 15:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P18301 and previous config saved to /var/cache/conftool/dbconfig/20220103-155720-marostegui.json
* 09:51 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:53 moritzm: installing publicsuffix 20211207.1025-0+deb11u1 on bullseye hosts
* 09:51 klausman@cumin1001: START - Cookbook sre.hosts.downtime
* 15:50 moritzm: installing gmp security updates
* 09:50 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 15:43 moritzm: installing datatables.js security updates
* 09:50 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 15:42 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cp2029.codfw.wmnet with reason: Swapping faulty DIMM with B1
* 09:47 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 15:42 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cp2029.codfw.wmnet with reason: Swapping faulty DIMM with B1
* 09:25 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 15:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P18300 and previous config saved to /var/cache/conftool/dbconfig/20220103-154215-marostegui.json
* 09:25 jayme@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 15:41 moritzm: installing edk2 security updates
* 09:08 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18299 and previous config saved to /var/cache/conftool/dbconfig/20220103-152710-marostegui.json
* 09:08 jayme@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 15:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18298 and previous config saved to /var/cache/conftool/dbconfig/20220103-151558-marostegui.json
* 09:06 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 15:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 15:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18297 and previous config saved to /var/cache/conftool/dbconfig/20220103-151550-marostegui.json
* 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P18296 and previous config saved to /var/cache/conftool/dbconfig/20220103-150045-marostegui.json
* 15:00 hashar: Restarting Gerrit primary on gerrit1001
* 14:59 hashar: Restarting Gerrit replica on gerrit2001
* 14:46 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.0-1 - [[phab:T294560|T294560]]
* 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P18295 and previous config saved to /var/cache/conftool/dbconfig/20220103-144539-marostegui.json
* 14:42 XioNoX: push CR744782 "Deprecate interface-range external" to all routers
* 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18293 and previous config saved to /var/cache/conftool/dbconfig/20220103-143034-marostegui.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18292 and previous config saved to /var/cache/conftool/dbconfig/20220103-140232-marostegui.json
* 14:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155,1158].eqiad.wmnet with reason: Maintenance
* 14:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155,1158].eqiad.wmnet with reason: Maintenance
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18291 and previous config saved to /var/cache/conftool/dbconfig/20220103-140221-marostegui.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P18290 and previous config saved to /var/cache/conftool/dbconfig/20220103-134716-marostegui.json
* 13:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18289 and previous config saved to /var/cache/conftool/dbconfig/20220103-134227-marostegui.json
* 13:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 13:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
* 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P18288 and previous config saved to /var/cache/conftool/dbconfig/20220103-133212-marostegui.json
* 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18287 and previous config saved to /var/cache/conftool/dbconfig/20220103-131707-marostegui.json
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host build2001.codfw.wmnet
* 12:46 moritzm: installing openjdk-11 security updates on buster
* 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:41 taavi: UTC morning deploys done
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18286 and previous config saved to /var/cache/conftool/dbconfig/20220103-124117-marostegui.json
* 12:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:40 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:743683{{!}}Use new class names for CentralAuth RC feed]] (duration: 00m 57s)
* 12:35 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:748305{{!}}Add towiki.ru to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T294190)]] (duration: 00m 57s)
* 12:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:29 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:750826{{!}}Add a logo for amiwiki (T298439)]] (3/3) (duration: 00m 57s)
* 12:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:28 taavi@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:750826{{!}}Add a logo for amiwiki (T298439)]] (2/3) (duration: 00m 57s)
* 12:26 taavi@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:750826{{!}}Add a logo for amiwiki (T298439)]] (1/3) (duration: 00m 58s)
* 12:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:22 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:750805{{!}}Add a logo for pwnwiki (T298438)]] (3/3) (duration: 00m 57s)
* 12:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:21 taavi@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:750805{{!}}Add a logo for pwnwiki (T298438)]] (2/3) (duration: 00m 57s)
* 12:20 taavi@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:750805{{!}}Add a logo for pwnwiki (T298438)]] (1/3) (duration: 00m 58s)
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:15 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:747794{{!}}Set ContentTranslationContentImportForSectionTranslation for SX (T294642)]] (duration: 00m 59s)
* 12:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 12:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18285 and previous config saved to /var/cache/conftool/dbconfig/20220103-121131-marostegui.json
* 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:01 moritzm: installing wireshark security updates on stretch
* 12:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18284 and previous config saved to /var/cache/conftool/dbconfig/20220103-120011-marostegui.json
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18283 and previous config saved to /var/cache/conftool/dbconfig/20220103-115627-marostegui.json
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist from s2 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18282 and previous config saved to /var/cache/conftool/dbconfig/20220103-115403-marostegui.json
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P18281 and previous config saved to /var/cache/conftool/dbconfig/20220103-114507-marostegui.json
* 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18280 and previous config saved to /var/cache/conftool/dbconfig/20220103-114122-marostegui.json
* 11:37 moritzm: rebalance row_A ganeti group in codfw (to eventually allow freeing ganeti2023 of instances)
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P18279 and previous config saved to /var/cache/conftool/dbconfig/20220103-113002-marostegui.json
* 11:29 elukey: restart cassandra-b on aqs1010 and aqs1015 (instances stuck / thrashing, new cluster, not serving live traffic atm)
* 11:27 oblivian@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1002.eqiad.wmnet
* 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18278 and previous config saved to /var/cache/conftool/dbconfig/20220103-112617-marostegui.json
* 11:19 oblivian@cumin2002: START - Cookbook sre.hosts.reboot-single for host deploy1002.eqiad.wmnet
* 11:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18277 and previous config saved to /var/cache/conftool/dbconfig/20220103-111638-marostegui.json
* 11:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 11:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 11:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18276 and previous config saved to /var/cache/conftool/dbconfig/20220103-111631-marostegui.json
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18275 and previous config saved to /var/cache/conftool/dbconfig/20220103-111457-marostegui.json
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18274 and previous config saved to /var/cache/conftool/dbconfig/20220103-110126-marostegui.json
* 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P18273 and previous config saved to /var/cache/conftool/dbconfig/20220103-104621-marostegui.json
* 10:41 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked from s2 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18272 and previous config saved to /var/cache/conftool/dbconfig/20220103-103909-marostegui.json
* 10:32 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18271 and previous config saved to /var/cache/conftool/dbconfig/20220103-103116-marostegui.json
* 10:22 elukey: powercycle an-worker1114 (CPU soft lockup errors in mgmt console)
* 10:20 elukey: powercycle an-worker1120 (CPU soft lockup errors in mgmt console)
* 10:19 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host build2001.codfw.wmnet
* 10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T297094|T297094]])', diff saved to https://phabricator.wikimedia.org/P18270 and previous config saved to /var/cache/conftool/dbconfig/20220103-101116-marostegui.json
* 10:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 10:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 09:59 moritzm: installing ruby2.3 security updates
* 09:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 09:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 ([[phab:T277354|T277354]])', diff saved to https://phabricator.wikimedia.org/P18269 and previous config saved to /var/cache/conftool/dbconfig/20220103-093003-marostegui.json
* 09:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 09:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 09:24 moritzm: installing djvulibre security updates on buster
* 09:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 09:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions and logpager from s2 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18268 and previous config saved to /var/cache/conftool/dbconfig/20220103-085824-marostegui.json
* 08:54 marostegui@cumin1001: dbctl commit (dc=all): 'Remove special slaves from s2 codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P18267 and previous config saved to /var/cache/conftool/dbconfig/20220103-085428-marostegui.json
* 08:49 moritzm: installing libpcap security updates
* 08:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 08:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
* 08:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:25 moritzm: installing zziplib security updates
* 08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2118.codfw.wmnet with reason: Maintenance
* 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2118.codfw.wmnet with reason: Maintenance
* 08:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 08:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
* 07:51 moritzm: draining primary and secondary instances off ganeti2023
* 07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
* 07:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2087.codfw.wmnet with reason: Maintenance
* 07:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2087.codfw.wmnet with reason: Maintenance
* 07:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2086.codfw.wmnet with reason: Maintenance
* 07:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2086.codfw.wmnet with reason: Maintenance
* 07:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:04 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 07:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 07:02 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:750831{{!}}Full roll out of wgMaxExecutionTimeForExpensiveQueries (T297708)]]