Server Admin Log: Difference between revisions

== 2022-02-10 ==
* 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761501{{!}}jawikivoyage: Change module talk namespace from トーク to ノート (T262155)]] (duration: 00m 50s)
* 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761497{{!}}jawikivoyage: Change talk namespace names from トーク to ノート (T262155)]] (duration: 00m 54s)
* 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
* 00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
== 2022-02-09 ==
* 23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works [[phab:T300568|T300568]]
* 23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
* 23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
* 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
* 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
* 22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
* 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
* 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
* 21:51 hoo: [[phab:T299422|T299422]]: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
* 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21 refs [[phab:T300197|T300197]] (duration: 00m 51s)
* 20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21 refs [[phab:T300197|T300197]]
* 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
* 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
* 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
* 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:45 urbanecm: UTC evening B&C window completed
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: {{Gerrit|3da81ec}}: Mentor dashboard: Mark mentor-tools as beta ([[phab:T280307|T280307]]) (duration: 00m 49s)
* 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: {{Gerrit|588fa93}}: Track changes of growthexperiments-mentor-away-timestamp ([[phab:T280307|T280307]]) (duration: 00m 49s)
* 19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: {{Gerrit|9675848}}: {{Gerrit|49202e7}}: Deploy M2 Mentor settings module ([[phab:T280307|T280307]]) (duration: 00m 51s)
* 19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: {{Gerrit|a307ac4b334dd6f60fa7257db10100e18531ee89}}: Track changes of growthexperiments-mentor-away-timestamp ([[phab:T280307|T280307]]) (duration: 00m 50s)
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
* 19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
* 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
* 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
* 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
* 18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
* 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
* 17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
* 17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
* 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
* 17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
* 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
* 17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
* 17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - [[phab:T300740|T300740]]
* 17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
* 16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
* 16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
* 16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
* 16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
* 16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
* 15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - [[phab:T300740|T300740]]
* 15:56 jayme: restarting pybal on lvs1015,lvs2009 - [[phab:T300740|T300740]]
* 15:44 jbond: change puppet hiera preference site vs site/role gerrit:761339
* 15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
* 15:31 jayme: restarting pybal on lvs2010,lvs1020 - [[phab:T300740|T300740]]
* 15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
* 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
* 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
* 15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci [[phab:T301361|T301361]]
* 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
* 14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' [[phab:T247497|T247497]]
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
* 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
* 14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
* 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) [[phab:T301320|T301320]] (duration: 01m 31s)
* 14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) [[phab:T301320|T301320]]
* 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
* 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 ([[phab:T300510|T300510]])', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
* 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
* 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye ([[phab:T300510|T300510]])
* 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye ([[phab:T300510|T300510]])
* 13:48 jelto: update scap to 4.3.1 on all hosts - [[phab:T301307|T301307]]
* 13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) [[phab:T301320|T301320]] (duration: 01m 34s)
* 13:36 reedy@deploy1002: Started scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) [[phab:T301320|T301320]]
* 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
* 13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
* 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:41 Lucas_WMDE: UTC morning backport+config window done
* 12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:761310{{!}}sawikisource: Add audio book namespace (T282970)]] (duration: 00m 50s)
* 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
* 12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (3/3) (duration: 00m 49s)
* 12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (2/3) (duration: 00m 49s)
* 12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: [[gerrit:760640{{!}}Stop writing to $wmfRealm (T45956)]] (1/3) (duration: 01m 38s)
* 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 ([[phab:T300775|T300775]])', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
* 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
* 11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
* 10:45 akosiaris: [[phab:T300568|T300568]] upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
* 10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
* 10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
* 10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
* 10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
* 10:20 jelto: update scap to 4.3.1 on A:restbase-canary - [[phab:T301307|T301307]]
* 10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - [[phab:T301307|T301307]]
* 10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
* 10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
* 10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
* 10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
* 10:03 akosiaris: [[phab:T300568|T300568]] upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
* 10:02 Emperor: rolling restart of swift frontends [[phab:T301251|T301251]]
* 09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
* 09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:41 ema: cp3050: stop and disable atskafka-webrequest.service [[phab:T247497|T247497]]
* 09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) [[phab:T265625|T265625]]
* 08:21 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 5hours)
* 07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
* 07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
* 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
* 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
* 03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
* 03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
* 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 ([[phab:T298554|T298554]])', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
* 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
* 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
* 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
* 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
== 2022-02-08 ==
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster

Revision as of 00:42, 10 February 2022

2022-02-10

  • 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change module talk namespace from トーク to ノート (T262155) (duration: 00m 50s)
  • 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change talk namespace names from トーク to ノート (T262155) (duration: 00m 54s)
  • 00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance

2022-02-09

  • 23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works T300568
  • 23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
  • 23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
  • 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
  • 22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
  • 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
  • 22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
  • 21:51 hoo: T299422: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
  • 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
  • 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21 refs T300197 (duration: 00m 51s)
  • 20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21 refs T300197
  • 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
  • 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
  • 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
  • 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:45 urbanecm: UTC evening B&C window completed
  • 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: 3da81ec: Mentor dashboard: Mark mentor-tools as beta (T280307) (duration: 00m 49s)
  • 19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: 588fa93: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 49s)
  • 19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: 9675848: 49202e7: Deploy M2 Mentor settings module (T280307) (duration: 00m 51s)
  • 19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: a307ac4: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 50s)
  • 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
  • 19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
  • 18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
  • 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
  • 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
  • 18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
  • 17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
  • 17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
  • 17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
  • 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
  • 17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
  • 17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
  • 17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
  • 17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - T300740
  • 17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
  • 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
  • 16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
  • 16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
  • 16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
  • 16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
  • 16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
  • 15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - T300740
  • 15:56 jayme: restarting pybal on lvs1015,lvs2009 - T300740
  • 15:44 jbond: change puppet hiera preference site vs site/role gerrit:761339
  • 15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
  • 15:31 jayme: restarting pybal on lvs2010,lvs1020 - T300740
  • 15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
  • 15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci T301361
  • 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
  • 14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' T247497
  • 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
  • 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
  • 14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
  • 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320 (duration: 01m 31s)
  • 14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320
  • 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
  • 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
  • 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
  • 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
  • 13:48 jelto: update scap to 4.3.1 on all hosts - T301307
  • 13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) T301320 (duration: 01m 34s)
  • 13:36 reedy@deploy1002: Started scap: Downgrading symfony/console \(v5.4.3 => v5.4.2\) T301320
  • 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
  • 13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:41 Lucas_WMDE: UTC morning backport+config window done
  • 12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: sawikisource: Add audio book namespace (T282970) (duration: 00m 50s)
  • 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: Stop writing to $wmfRealm (T45956) (3/3) (duration: 00m 49s)
  • 12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: Stop writing to $wmfRealm (T45956) (2/3) (duration: 00m 49s)
  • 12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: Stop writing to $wmfRealm (T45956) (1/3) (duration: 01m 38s)
  • 12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
  • 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:45 akosiaris: T300568 upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
  • 10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
  • 10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
  • 10:20 jelto: update scap to 4.3.1 on A:restbase-canary - T301307
  • 10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - T301307
  • 10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
  • 10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
  • 10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
  • 10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
  • 10:03 akosiaris: T300568 upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
  • 10:02 Emperor: rolling restart of swift frontends T301251
  • 09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
  • 09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:41 ema: cp3050: stop and disable atskafka-webrequest.service T247497
  • 09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) T265625
  • 08:21 dcausse: restarting blazegraph on wdqs1004 (JVM stuck for 5 hours)
  • 07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
  • 04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
  • 03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
  • 03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
  • 03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
  • 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
  • 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance

2022-02-08

  • 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
  • 23:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS buster
  • 23:22 tzatziki: removing 1 file for legal compliance
  • 23:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS buster
  • 23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS buster
  • 23:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS buster
  • 23:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS buster
  • 22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS buster
  • 22:44 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
  • 22:42 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
  • 22:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS buster
  • 22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20402 and previous config saved to /var/cache/conftool/dbconfig/20220208-221545-marostegui.json
  • 22:12 topranks: doing planned 1-by-1 shutdown of ports xe-0/1/1, xe-0/1/2 and xe-0/1/9 on cr2-esams, to test reliability of each following user reports of issues at AMS-IX.
  • 22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20401 and previous config saved to /var/cache/conftool/dbconfig/20220208-220041-marostegui.json
  • 21:59 ryankemper: T294805 elastic10[68-83] erroneously weren't in pybal, added them just now: `sudo confctl select 'cluster=elasticsearch' set/pooled=yes:weight=10` (there are no hosts in the `conftool-data` list that we want depooled so we're okay setting all to pooled w/ equal weight)
  • 21:59 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch
  • 21:58 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch,name=elastic1*
  • 21:53 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
  • 21:52 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
  • 21:47 ryankemper: [Elastic] `ryankemper@elastic1081:~$ sudo systemctl restart elasticsearch_6*psi*` (9600 but not 9200 seemed to be having connectivity issues)
  • 21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20400 and previous config saved to /var/cache/conftool/dbconfig/20220208-214536-marostegui.json
  • 21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20399 and previous config saved to /var/cache/conftool/dbconfig/20220208-213031-marostegui.json
  • 21:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20398 and previous config saved to /var/cache/conftool/dbconfig/20220208-212558-marostegui.json
  • 21:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 21:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20397 and previous config saved to /var/cache/conftool/dbconfig/20220208-212550-marostegui.json
  • 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20396 and previous config saved to /var/cache/conftool/dbconfig/20220208-211046-marostegui.json
  • 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20395 and previous config saved to /var/cache/conftool/dbconfig/20220208-205541-marostegui.json
  • 20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 20:52 jhuneidi@deploy1002: Finished scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0 (duration: 16m 17s)
  • 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS buster
  • 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20394 and previous config saved to /var/cache/conftool/dbconfig/20220208-204036-marostegui.json
  • 20:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20393 and previous config saved to /var/cache/conftool/dbconfig/20220208-203634-ladsgroup.json
  • 20:36 jhuneidi@deploy1002: Started scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0
  • 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20392 and previous config saved to /var/cache/conftool/dbconfig/20220208-203529-marostegui.json
  • 20:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 20:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20391 and previous config saved to /var/cache/conftool/dbconfig/20220208-203521-marostegui.json
  • 20:33 ryankemper: T294805 Banned `elastic10[32-47]` from main, omega, and psi elasticsearch clusters. Shards are relocating on main and omega clusters as expected, but they don't seem to be moving on psi. Investigating that currently. Might have to do with row allocation constraints, but unsure currently
  • 20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS buster
  • 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20390 and previous config saved to /var/cache/conftool/dbconfig/20220208-202127-ladsgroup.json
  • 20:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20389 and previous config saved to /var/cache/conftool/dbconfig/20220208-202016-marostegui.json
  • 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:17 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.21 refs T300197
  • 20:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS buster
  • 20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20388 and previous config saved to /var/cache/conftool/dbconfig/20220208-200621-ladsgroup.json
  • 20:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20387 and previous config saved to /var/cache/conftool/dbconfig/20220208-200512-marostegui.json
  • 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS buster
  • 19:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS buster
  • 19:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS buster
  • 19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20386 and previous config saved to /var/cache/conftool/dbconfig/20220208-195115-ladsgroup.json
  • 19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20385 and previous config saved to /var/cache/conftool/dbconfig/20220208-195007-marostegui.json
  • 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20384 and previous config saved to /var/cache/conftool/dbconfig/20220208-194528-marostegui.json
  • 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20383 and previous config saved to /var/cache/conftool/dbconfig/20220208-194520-marostegui.json
  • 19:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS buster
  • 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20382 and previous config saved to /var/cache/conftool/dbconfig/20220208-193016-marostegui.json
  • 19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS buster
  • 19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS buster
  • 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS buster
  • 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20381 and previous config saved to /var/cache/conftool/dbconfig/20220208-192055-ladsgroup.json
  • 19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20380 and previous config saved to /var/cache/conftool/dbconfig/20220208-192047-ladsgroup.json
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20379 and previous config saved to /var/cache/conftool/dbconfig/20220208-191511-marostegui.json
  • 19:12 jhuneidi@deploy1002: Pruned MediaWiki: 1.38.0-wmf.19 (duration: 03m 12s)
  • 19:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 19:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:09 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.21 refs T300197 (duration: 39m 34s)
  • 19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20378 and previous config saved to /var/cache/conftool/dbconfig/20220208-190542-ladsgroup.json
  • 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20377 and previous config saved to /var/cache/conftool/dbconfig/20220208-190006-marostegui.json
  • 18:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment (duration: 02m 02s)
  • 18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment
  • 18:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS buster
  • 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20376 and previous config saved to /var/cache/conftool/dbconfig/20220208-185420-marostegui.json
  • 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS buster
  • 18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS buster
  • 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS buster
  • 18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20375 and previous config saved to /var/cache/conftool/dbconfig/20220208-185037-ladsgroup.json
  • 18:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20374 and previous config saved to /var/cache/conftool/dbconfig/20220208-184832-marostegui.json
  • 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20373 and previous config saved to /var/cache/conftool/dbconfig/20220208-183532-ladsgroup.json
  • 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20372 and previous config saved to /var/cache/conftool/dbconfig/20220208-183328-marostegui.json
  • 18:29 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.21 refs T300197
  • 18:22 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup (duration: 02m 03s)
  • 18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS buster
  • 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS buster
  • 18:20 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup
  • 18:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20371 and previous config saved to /var/cache/conftool/dbconfig/20220208-181823-marostegui.json
  • 18:13 moritzm: installing expat security updates
  • 18:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS buster
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20370 and previous config saved to /var/cache/conftool/dbconfig/20220208-180810-ladsgroup.json
  • 18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20369 and previous config saved to /var/cache/conftool/dbconfig/20220208-180803-ladsgroup.json
  • 18:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20368 and previous config saved to /var/cache/conftool/dbconfig/20220208-180316-marostegui.json
  • 17:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2042.codfw.wmnet with OS buster
  • 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20367 and previous config saved to /var/cache/conftool/dbconfig/20220208-175844-marostegui.json
  • 17:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20366 and previous config saved to /var/cache/conftool/dbconfig/20220208-175837-marostegui.json
  • 17:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow (duration: 02m 01s)
  • 17:56 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
  • 17:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow
  • 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20365 and previous config saved to /var/cache/conftool/dbconfig/20220208-175258-ladsgroup.json
  • 17:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20364 and previous config saved to /var/cache/conftool/dbconfig/20220208-174332-marostegui.json
  • 17:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS buster
  • 17:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS buster
  • 17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20363 and previous config saved to /var/cache/conftool/dbconfig/20220208-173753-ladsgroup.json
  • 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
  • 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
  • 17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20362 and previous config saved to /var/cache/conftool/dbconfig/20220208-173611-marostegui.json
  • 17:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS buster
  • 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20361 and previous config saved to /var/cache/conftool/dbconfig/20220208-172827-marostegui.json
  • 17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS buster
  • 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20360 and previous config saved to /var/cache/conftool/dbconfig/20220208-172248-ladsgroup.json
  • 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20359 and previous config saved to /var/cache/conftool/dbconfig/20220208-172106-marostegui.json
  • 17:17 rzl: rzl@cumin1001:~$ sudo cumin A:mw "enable-puppet T273323"
  • 17:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20358 and previous config saved to /var/cache/conftool/dbconfig/20220208-171323-marostegui.json
  • 17:11 rzl: rzl@cumin1001:~$ sudo cumin A:mw "disable-puppet T273323"
  • 17:11 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconcilliation job (duration: 02m 01s)
  • 17:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconcilliation job
  • 17:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS buster
  • 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20357 and previous config saved to /var/cache/conftool/dbconfig/20220208-170812-marostegui.json
  • 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20356 and previous config saved to /var/cache/conftool/dbconfig/20220208-170805-marostegui.json
  • 17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS buster
  • 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20355 and previous config saved to /var/cache/conftool/dbconfig/20220208-170601-marostegui.json
  • 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20354 and previous config saved to /var/cache/conftool/dbconfig/20220208-165445-ladsgroup.json
  • 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20353 and previous config saved to /var/cache/conftool/dbconfig/20220208-165436-ladsgroup.json
  • 16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
  • 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20352 and previous config saved to /var/cache/conftool/dbconfig/20220208-165300-marostegui.json
  • 16:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2040.codfw.wmnet with OS buster
  • 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
  • 16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20351 and previous config saved to /var/cache/conftool/dbconfig/20220208-165057-marostegui.json
  • 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS buster
  • 16:45 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: Choose wikiversions.php file relative to MWMultiVersion.php (revived) (duration: 00m 49s)
  • 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20350 and previous config saved to /var/cache/conftool/dbconfig/20220208-163932-ladsgroup.json
  • 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20349 and previous config saved to /var/cache/conftool/dbconfig/20220208-163755-marostegui.json
  • 16:37 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:37 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS buster
  • 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20348 and previous config saved to /var/cache/conftool/dbconfig/20220208-162427-ladsgroup.json
  • 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20347 and previous config saved to /var/cache/conftool/dbconfig/20220208-162250-marostegui.json
  • 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20346 and previous config saved to /var/cache/conftool/dbconfig/20220208-161812-marostegui.json
  • 16:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20345 and previous config saved to /var/cache/conftool/dbconfig/20220208-161805-marostegui.json
  • 16:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS buster
  • 16:13 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20344 and previous config saved to /var/cache/conftool/dbconfig/20220208-160922-ladsgroup.json
  • 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20343 and previous config saved to /var/cache/conftool/dbconfig/20220208-160300-marostegui.json
  • 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20342 and previous config saved to /var/cache/conftool/dbconfig/20220208-154755-marostegui.json
  • 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20341 and previous config saved to /var/cache/conftool/dbconfig/20220208-154049-ladsgroup.json
  • 15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20340 and previous config saved to /var/cache/conftool/dbconfig/20220208-154042-ladsgroup.json
  • 15:33 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 15:33 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20339 and previous config saved to /var/cache/conftool/dbconfig/20220208-153251-marostegui.json
  • 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20338 and previous config saved to /var/cache/conftool/dbconfig/20220208-152812-marostegui.json
  • 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 15:27 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20337 and previous config saved to /var/cache/conftool/dbconfig/20220208-152536-ladsgroup.json
  • 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20336 and previous config saved to /var/cache/conftool/dbconfig/20220208-152525-marostegui.json
  • 15:18 Emperor: depooling ms-fe200[5-8] T301251
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20335 and previous config saved to /var/cache/conftool/dbconfig/20220208-151032-ladsgroup.json
  • 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20334 and previous config saved to /var/cache/conftool/dbconfig/20220208-151020-marostegui.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20333 and previous config saved to /var/cache/conftool/dbconfig/20220208-145731-marostegui.json
  • 14:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 14:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20332 and previous config saved to /var/cache/conftool/dbconfig/20220208-145724-marostegui.json
  • 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20331 and previous config saved to /var/cache/conftool/dbconfig/20220208-145527-ladsgroup.json
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20330 and previous config saved to /var/cache/conftool/dbconfig/20220208-145516-marostegui.json
  • 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20329 and previous config saved to /var/cache/conftool/dbconfig/20220208-144219-marostegui.json
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20328 and previous config saved to /var/cache/conftool/dbconfig/20220208-144011-marostegui.json
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20327 and previous config saved to /var/cache/conftool/dbconfig/20220208-143545-marostegui.json
  • 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 14:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
  • 14:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 14:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20326 and previous config saved to /var/cache/conftool/dbconfig/20220208-143302-marostegui.json
  • 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20325 and previous config saved to /var/cache/conftool/dbconfig/20220208-142815-ladsgroup.json
  • 14:28 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
  • 14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20324 and previous config saved to /var/cache/conftool/dbconfig/20220208-142808-ladsgroup.json
  • 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20323 and previous config saved to /var/cache/conftool/dbconfig/20220208-142714-marostegui.json
  • 14:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2001.codfw.wmnet with OS bullseye
  • 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20322 and previous config saved to /var/cache/conftool/dbconfig/20220208-141757-marostegui.json
  • 14:17 godog: update PERC firmware on thanos-be2001 - T288937
  • 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20321 and previous config saved to /var/cache/conftool/dbconfig/20220208-141303-ladsgroup.json
  • 14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20320 and previous config saved to /var/cache/conftool/dbconfig/20220208-141210-marostegui.json
  • 14:07 godog: update NIC firmware on thanos-be2001 - T288937
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20319 and previous config saved to /var/cache/conftool/dbconfig/20220208-140252-marostegui.json
  • 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20318 and previous config saved to /var/cache/conftool/dbconfig/20220208-135758-ladsgroup.json
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20317 and previous config saved to /var/cache/conftool/dbconfig/20220208-134748-marostegui.json
  • 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20316 and previous config saved to /var/cache/conftool/dbconfig/20220208-134324-marostegui.json
  • 13:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20315 and previous config saved to /var/cache/conftool/dbconfig/20220208-134254-ladsgroup.json
  • 13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20314 and previous config saved to /var/cache/conftool/dbconfig/20220208-134022-marostegui.json
  • 13:37 moritzm: migrating instances off ganeti1021
  • 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20313 and previous config saved to /var/cache/conftool/dbconfig/20220208-133558-marostegui.json
  • 13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20312 and previous config saved to /var/cache/conftool/dbconfig/20220208-133550-marostegui.json
  • 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20310 and previous config saved to /var/cache/conftool/dbconfig/20220208-132517-marostegui.json
  • 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20309 and previous config saved to /var/cache/conftool/dbconfig/20220208-132045-marostegui.json
  • 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20308 and previous config saved to /var/cache/conftool/dbconfig/20220208-131430-ladsgroup.json
  • 13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20307 and previous config saved to /var/cache/conftool/dbconfig/20220208-131427-ladsgroup.json
  • 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20306 and previous config saved to /var/cache/conftool/dbconfig/20220208-131319-ladsgroup.json
  • 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20305 and previous config saved to /var/cache/conftool/dbconfig/20220208-131012-marostegui.json
  • 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20304 and previous config saved to /var/cache/conftool/dbconfig/20220208-130541-marostegui.json
  • 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20303 and previous config saved to /var/cache/conftool/dbconfig/20220208-125922-ladsgroup.json
  • 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20302 and previous config saved to /var/cache/conftool/dbconfig/20220208-125814-ladsgroup.json
  • 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20301 and previous config saved to /var/cache/conftool/dbconfig/20220208-125508-marostegui.json
  • 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20300 and previous config saved to /var/cache/conftool/dbconfig/20220208-125036-marostegui.json
  • 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20299 and previous config saved to /var/cache/conftool/dbconfig/20220208-124418-ladsgroup.json
  • 12:43 Amir1: shut down dbmonitor1002 (T297605)
  • 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20298 and previous config saved to /var/cache/conftool/dbconfig/20220208-124309-ladsgroup.json
  • 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
  • 12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
  • 12:37 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-be2001.codfw.wmnet with OS bullseye
  • 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20297 and previous config saved to /var/cache/conftool/dbconfig/20220208-122913-ladsgroup.json
  • 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20296 and previous config saved to /var/cache/conftool/dbconfig/20220208-122805-ladsgroup.json
  • 12:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
  • 12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bullseye
  • 12:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
  • 12:19 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20295 and previous config saved to /var/cache/conftool/dbconfig/20220208-121430-marostegui.json
  • 12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20294 and previous config saved to /var/cache/conftool/dbconfig/20220208-121422-marostegui.json
  • 12:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2010.wmnet
  • 12:11 hnowlan: Running c-foreach-nt decommission on restbase2010 in advance of decommissioning
  • 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20293 and previous config saved to /var/cache/conftool/dbconfig/20220208-120603-marostegui.json
  • 12:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 12:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20292 and previous config saved to /var/cache/conftool/dbconfig/20220208-120556-marostegui.json
  • 12:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d9902a4: cowikimedia: Let admins grant confirmed and accountcreator flags (T300948) (duration: 00m 50s)
  • 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20291 and previous config saved to /var/cache/conftool/dbconfig/20220208-120102-ladsgroup.json
  • 12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20290 and previous config saved to /var/cache/conftool/dbconfig/20220208-120054-ladsgroup.json
  • 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
  • 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20289 and previous config saved to /var/cache/conftool/dbconfig/20220208-115918-marostegui.json
  • 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2019.wmnet
  • 11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2020.wmnet
  • 11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2019.codfw.wmnet with OS buster
  • 11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bullseye
  • 11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2020.codfw.wmnet with OS buster
  • 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20288 and previous config saved to /var/cache/conftool/dbconfig/20220208-115051-marostegui.json
  • 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20287 and previous config saved to /var/cache/conftool/dbconfig/20220208-114639-ladsgroup.json
  • 11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20286 and previous config saved to /var/cache/conftool/dbconfig/20220208-114549-ladsgroup.json
  • 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20285 and previous config saved to /var/cache/conftool/dbconfig/20220208-114413-marostegui.json
  • 11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20284 and previous config saved to /var/cache/conftool/dbconfig/20220208-113910-ladsgroup.json
  • 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20283 and previous config saved to /var/cache/conftool/dbconfig/20220208-113547-marostegui.json
  • 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20282 and previous config saved to /var/cache/conftool/dbconfig/20220208-113045-ladsgroup.json
  • 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20281 and previous config saved to /var/cache/conftool/dbconfig/20220208-112909-marostegui.json
  • 11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20280 and previous config saved to /var/cache/conftool/dbconfig/20220208-112406-ladsgroup.json
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20279 and previous config saved to /var/cache/conftool/dbconfig/20220208-112042-marostegui.json
  • 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20278 and previous config saved to /var/cache/conftool/dbconfig/20220208-111540-ladsgroup.json
  • 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20277 and previous config saved to /var/cache/conftool/dbconfig/20220208-110901-ladsgroup.json
  • 11:06 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2020.codfw.wmnet with OS buster
  • 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20276 and previous config saved to /var/cache/conftool/dbconfig/20220208-110154-marostegui.json
  • 11:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 11:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20275 and previous config saved to /var/cache/conftool/dbconfig/20220208-110147-marostegui.json
  • 10:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2019.codfw.wmnet with OS buster
  • 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20274 and previous config saved to /var/cache/conftool/dbconfig/20220208-105453-marostegui.json
  • 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20273 and previous config saved to /var/cache/conftool/dbconfig/20220208-105440-marostegui.json
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20272 and previous config saved to /var/cache/conftool/dbconfig/20220208-105356-ladsgroup.json
  • 10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS bullseye
  • 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20271 and previous config saved to /var/cache/conftool/dbconfig/20220208-104642-marostegui.json
  • 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20270 and previous config saved to /var/cache/conftool/dbconfig/20220208-104421-ladsgroup.json
  • 10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20269 and previous config saved to /var/cache/conftool/dbconfig/20220208-104414-ladsgroup.json
  • 10:43 elukey: update pcc facts
  • 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20268 and previous config saved to /var/cache/conftool/dbconfig/20220208-103935-marostegui.json
  • 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20267 and previous config saved to /var/cache/conftool/dbconfig/20220208-103137-marostegui.json
  • 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20266 and previous config saved to /var/cache/conftool/dbconfig/20220208-102909-ladsgroup.json
  • 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20265 and previous config saved to /var/cache/conftool/dbconfig/20220208-102430-marostegui.json
  • 10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS bullseye
  • 10:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20264 and previous config saved to /var/cache/conftool/dbconfig/20220208-101631-marostegui.json
  • 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20263 and previous config saved to /var/cache/conftool/dbconfig/20220208-101404-ladsgroup.json
  • 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20262 and previous config saved to /var/cache/conftool/dbconfig/20220208-101238-ladsgroup.json
  • 10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 10:09 jayme: updates scap to 4.3.0 on all hosts - T300804
  • 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20261 and previous config saved to /var/cache/conftool/dbconfig/20220208-100926-marostegui.json
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20260 and previous config saved to /var/cache/conftool/dbconfig/20220208-095916-marostegui.json
  • 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20259 and previous config saved to /var/cache/conftool/dbconfig/20220208-095909-marostegui.json
  • 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20258 and previous config saved to /var/cache/conftool/dbconfig/20220208-095900-ladsgroup.json
  • 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20257 and previous config saved to /var/cache/conftool/dbconfig/20220208-095427-marostegui.json
  • 09:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20256 and previous config saved to /var/cache/conftool/dbconfig/20220208-095420-marostegui.json
  • 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20255 and previous config saved to /var/cache/conftool/dbconfig/20220208-094358-marostegui.json
  • 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20254 and previous config saved to /var/cache/conftool/dbconfig/20220208-093915-marostegui.json
  • 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20253 and previous config saved to /var/cache/conftool/dbconfig/20220208-093315-ladsgroup.json
  • 09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20252 and previous config saved to /var/cache/conftool/dbconfig/20220208-092853-marostegui.json
  • 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20251 and previous config saved to /var/cache/conftool/dbconfig/20220208-092410-marostegui.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20250 and previous config saved to /var/cache/conftool/dbconfig/20220208-091349-marostegui.json
  • 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20249 and previous config saved to /var/cache/conftool/dbconfig/20220208-090906-marostegui.json
  • 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20248 and previous config saved to /var/cache/conftool/dbconfig/20220208-084851-marostegui.json
  • 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20247 and previous config saved to /var/cache/conftool/dbconfig/20220208-083815-marostegui.json
  • 08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20246 and previous config saved to /var/cache/conftool/dbconfig/20220208-083808-marostegui.json
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20245 and previous config saved to /var/cache/conftool/dbconfig/20220208-082303-marostegui.json
  • 08:20 marostegui: Stop MySQL on db1115 to backup tendril T297605
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20244 and previous config saved to /var/cache/conftool/dbconfig/20220208-080758-marostegui.json
  • 08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20243 and previous config saved to /var/cache/conftool/dbconfig/20220208-080709-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20242 and previous config saved to /var/cache/conftool/dbconfig/20220208-075254-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20241 and previous config saved to /var/cache/conftool/dbconfig/20220208-075204-marostegui.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20240 and previous config saved to /var/cache/conftool/dbconfig/20220208-073659-marostegui.json
  • 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20239 and previous config saved to /var/cache/conftool/dbconfig/20220208-072155-marostegui.json
  • 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20238 and previous config saved to /var/cache/conftool/dbconfig/20220208-070339-marostegui.json
  • 07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2134.codfw.wmnet with OS bullseye
  • 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
  • 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
  • 06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2134.codfw.wmnet with OS bullseye
  • 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20237 and previous config saved to /var/cache/conftool/dbconfig/20220208-060943-marostegui.json
  • 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20236 and previous config saved to /var/cache/conftool/dbconfig/20220208-060310-marostegui.json
  • 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:12 ryankemper: T294805 Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent'` tmux session `elastic`
  • 00:12 ryankemper: T294805 old psi masters are out, done with all elastic master operations
  • 00:05 ryankemper: T294805 new psi masters `elastic1073`, `elastic1075`, and `elastic1083` are in

2022-02-07

  • 23:39 ryankemper: T294805 Removed old masters `elastic1034` and `elastic1038` (and `elastic1040` was removed earlier)
  • 23:35 ryankemper: T294805 Bringing in new omega master `elastic1057`
  • 23:31 ryankemper: T294805 Bringing in new omega master `elastic1076`
  • 23:27 ryankemper: T294805 Bringing in new master `elastic1068`
  • 23:27 ryankemper: T294805 Main search cluster all done, proceeding to `omega` cluster
  • 23:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:17 cwhite: end opensearch upgrade (eqiad) T299168
  • 23:09 ryankemper: T294805 Kicking out the final master `elastic1036` (which is also the currently elected leader); after this we'll be back to 3 masters as intended
  • 23:06 ryankemper: T294805 Running puppet and restarting elasticsearch services on `elastic1040` to make it no longer a master
  • 23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-psi-eqiad.service`
  • 23:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
  • 22:59 ryankemper: T294805 `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-omega-eqiad.service` on `elastic1074`
  • 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:57 ryankemper: T294805 Running puppet agent on new master elastic1074.eqiad.wmnet: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
  • 22:48 ryankemper: T294805 Disabled puppet across all of elastic1* in preparation for bringing new master hosts in
  • 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20235 and previous config saved to /var/cache/conftool/dbconfig/20220207-224733-ladsgroup.json
  • 22:45 inflatador: T294805 puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/736118
  • 22:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20234 and previous config saved to /var/cache/conftool/dbconfig/20220207-223228-ladsgroup.json
  • 22:25 cwhite: begin opensearch upgrade (eqiad) T299168
  • 22:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20233 and previous config saved to /var/cache/conftool/dbconfig/20220207-221723-ladsgroup.json
  • 22:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20232 and previous config saved to /var/cache/conftool/dbconfig/20220207-221345-ladsgroup.json
  • 22:11 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20231 and previous config saved to /var/cache/conftool/dbconfig/20220207-220218-ladsgroup.json
  • 22:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:00 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20230 and previous config saved to /var/cache/conftool/dbconfig/20220207-215840-ladsgroup.json
  • 21:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20229 and previous config saved to /var/cache/conftool/dbconfig/20220207-214335-ladsgroup.json
  • 21:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20228 and previous config saved to /var/cache/conftool/dbconfig/20220207-213650-ladsgroup.json
  • 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20227 and previous config saved to /var/cache/conftool/dbconfig/20220207-212830-ladsgroup.json
  • 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 21:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 21:09 otto@deploy1002: Finished deploy [airflow-dags/analytics-test@6d936db]: (no justification provided) (duration: 00m 08s)
  • 21:09 otto@deploy1002: Started deploy [airflow-dags/analytics-test@6d936db]: (no justification provided)
  • 21:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1129.eqiad.wmnet with OS bullseye
  • 20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20225 and previous config saved to /var/cache/conftool/dbconfig/20220207-205620-ladsgroup.json
  • 20:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20223 and previous config saved to /var/cache/conftool/dbconfig/20220207-204115-ladsgroup.json
  • 20:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1129.eqiad.wmnet with OS bullseye
  • 20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20222 and previous config saved to /var/cache/conftool/dbconfig/20220207-203120-ladsgroup.json
  • 20:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 20:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 20:30 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided) (duration: 00m 08s)
  • 20:30 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided)
  • 20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20221 and previous config saved to /var/cache/conftool/dbconfig/20220207-202611-ladsgroup.json
  • 20:23 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
  • 20:23 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
  • 20:19 eileen: revision 7dcdc017 -> ccd5afc3 civicrm update
  • 20:19 eileen: revision 7dcdc017 -> ccd5afc3
  • 20:19 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided) (duration: 00m 07s)
  • 20:18 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided)
  • 20:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20220 and previous config saved to /var/cache/conftool/dbconfig/20220207-201106-ladsgroup.json
  • 20:08 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync on main
  • 20:08 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply on main
  • 20:05 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync on main
  • 19:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:55 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply on main
  • 19:44 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided) (duration: 00m 08s)
  • 19:44 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided)
  • 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20219 and previous config saved to /var/cache/conftool/dbconfig/20220207-194020-ladsgroup.json
  • 19:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20218 and previous config saved to /var/cache/conftool/dbconfig/20220207-194013-ladsgroup.json
  • 19:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20217 and previous config saved to /var/cache/conftool/dbconfig/20220207-192508-ladsgroup.json
  • 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20216 and previous config saved to /var/cache/conftool/dbconfig/20220207-191003-ladsgroup.json
  • 19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:05 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Turn on wgVectorLanguageAlertInSidebar for all wikis (T300559) (duration: 00m 49s)
  • 19:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20215 and previous config saved to /var/cache/conftool/dbconfig/20220207-185459-ladsgroup.json
  • 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20214 and previous config saved to /var/cache/conftool/dbconfig/20220207-183059-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 18:20 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
  • 18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20213 and previous config saved to /var/cache/conftool/dbconfig/20220207-180857-ladsgroup.json
  • 18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
  • 18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
  • 18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
  • 18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
  • 18:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
  • 17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2020.wmnet
  • 17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2019.wmnet
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20212 and previous config saved to /var/cache/conftool/dbconfig/20220207-175352-ladsgroup.json
  • 17:51 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
  • 17:42 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20211 and previous config saved to /var/cache/conftool/dbconfig/20220207-173848-ladsgroup.json
  • 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
  • 17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2030.codfw.wmnet with OS buster
  • 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20210 and previous config saved to /var/cache/conftool/dbconfig/20220207-172343-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20209 and previous config saved to /var/cache/conftool/dbconfig/20220207-165952-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20208 and previous config saved to /var/cache/conftool/dbconfig/20220207-165944-ladsgroup.json
  • 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2030.codfw.wmnet with OS buster
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2029.codfw.wmnet with OS buster
  • 16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20207 and previous config saved to /var/cache/conftool/dbconfig/20220207-164439-ladsgroup.json
  • 16:41 moritzm: switch kubestagetcd2003 to plain disk storage
  • 16:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
  • 16:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
  • 16:30 moritzm: switch kubestagetcd2002 to plain disk storage
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20206 and previous config saved to /var/cache/conftool/dbconfig/20220207-162935-ladsgroup.json
  • 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
  • 16:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
  • 16:24 moritzm: switch kubestagetcd2001 to plain disk storage
  • 16:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
  • 16:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
  • 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster
  • 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20205 and previous config saved to /var/cache/conftool/dbconfig/20220207-161430-ladsgroup.json
  • 16:05 moritzm: migrating instances off ganeti1021
  • 16:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
  • 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20204 and previous config saved to /var/cache/conftool/dbconfig/20220207-160441-ladsgroup.json
  • 16:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20203 and previous config saved to /var/cache/conftool/dbconfig/20220207-160433-ladsgroup.json
  • 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20201 and previous config saved to /var/cache/conftool/dbconfig/20220207-154928-ladsgroup.json
  • 15:47 moritzm: installing pillow security updates
  • 15:44 jayme@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 02m 30s)
  • 15:41 jayme@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
  • 15:40 jayme: updated scap to 4.3.0 on A:mw-canary, A:parsoid-canary, A:mw-jobrunner-canary, A:restbase-canary - T300804
  • 15:37 jayme: uploaded scap 4.3-0 to apt.w.o - T300804
  • 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20200 and previous config saved to /var/cache/conftool/dbconfig/20220207-153424-ladsgroup.json
  • 15:30 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
  • 15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20199 and previous config saved to /var/cache/conftool/dbconfig/20220207-151917-ladsgroup.json
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20198 and previous config saved to /var/cache/conftool/dbconfig/20220207-151018-ladsgroup.json
  • 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20197 and previous config saved to /var/cache/conftool/dbconfig/20220207-150959-ladsgroup.json
  • 14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20196 and previous config saved to /var/cache/conftool/dbconfig/20220207-145454-ladsgroup.json
  • 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20195 and previous config saved to /var/cache/conftool/dbconfig/20220207-143950-ladsgroup.json
  • 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20194 and previous config saved to /var/cache/conftool/dbconfig/20220207-142445-ladsgroup.json
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20193 and previous config saved to /var/cache/conftool/dbconfig/20220207-141452-ladsgroup.json
  • 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 13:14 jbond: update ferm on bullseye
  • 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1020.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 13:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
  • 13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
  • 12:44 moritzm: installing ruby2.7 security updates
  • 12:40 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:34 moritzm: revert kubestagetcd1006 to plain disk storage
  • 12:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:32 taavi: UTC morning deploys done
  • 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:32 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (2/2) (duration: 00m 48s)
  • 12:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:31 moritzm: revert kubestagetcd1005 to plain disk storage
  • 12:31 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (1/2) (duration: 00m 48s)
  • 12:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:27 taavi@deploy1002: Synchronized w/robots.php: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (3/3) (duration: 00m 48s)
  • 12:26 taavi@deploy1002: Synchronized wmf-config: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (2/3) (duration: 00m 48s)
  • 12:25 taavi@deploy1002: Synchronized multiversion: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (1/3) (duration: 00m 48s)
  • 12:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:19 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Remove redundant patrolmarks flag from patroller usergroup (T300913) (duration: 00m 48s)
  • 12:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:17 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1009.eqiad.wmnet
  • 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:09 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Stop capturing media change tags (T286362) (2/2) (duration: 00m 50s)
  • 12:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:08 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Stop capturing media change tags (T286362) (1/2) (duration: 00m 50s)
  • 12:07 moritzm: revert kubestagetcd1004 to plain disk storage
  • 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
  • 12:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
  • 11:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1008.eqiad.wmnet
  • 11:40 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1007.eqiad.wmnet
  • 11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 11:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:14 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
  • 11:14 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
  • 11:00 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1006.eqiad.wmnet
  • 10:51 mmandere: rolling upgrade of varnish from version 6.0.9 to 6.0.10 across DCs T300264
  • 10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus2004.codfw.wmnet
  • 10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1004.eqiad.wmnet
  • 10:22 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1005.eqiad.wmnet
  • 09:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1004.eqiad.wmnet
  • 09:21 godog: temp-disable mfa for 'filippo' - T296629
  • 09:09 jayme: uncordoned kubernetes1014 - T301099
  • 08:02 jayme: powercycle kubernetes1014 - T301099
  • 06:20 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
  • 06:20 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
  • 06:10 jayme: draining kubernetes1014

2022-02-05

  • 22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
  • 21:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
  • 20:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
  • 19:29 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
  • 18:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 17:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 16:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 06:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 06:09 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 05:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye

2022-02-04

  • 23:43 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 23:43 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 23:02 inflatador: bking@deployment-puppetmaster04 local commit to public/private repo, see T299797 for more details
  • 22:37 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 22:36 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 19:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2002-dev.wikimedia.org with OS bullseye
  • 18:52 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudservices2002-dev.wikimedia.org with OS bullseye
  • 17:00 arturo: add mcrouter 2022.01.31.00-1 to bullseye-wikimedia (T300578)
  • 16:48 jbond: update: add new ferm package ferm_2.5.1-1+wmf11u2
  • 16:38 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:05 elukey: unmask prometheus-mysqld-exporter.service and clean up the old @analytics + wmf_auto_restart units (service+timer) not used anymore on an-coord100[12]
  • 14:25 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
  • 14:18 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
  • 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1020.eqiad.wmnet with OS buster
  • 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20174 and previous config saved to /var/cache/conftool/dbconfig/20220204-114117-root.json
  • 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20173 and previous config saved to /var/cache/conftool/dbconfig/20220204-112613-root.json
  • 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1020.eqiad.wmnet with OS buster
  • 11:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20172 and previous config saved to /var/cache/conftool/dbconfig/20220204-111110-root.json
  • 11:07 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s1 codfw T263127', diff saved to https://phabricator.wikimedia.org/P20171 and previous config saved to /var/cache/conftool/dbconfig/20220204-110427-marostegui.json
  • 10:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20170 and previous config saved to /var/cache/conftool/dbconfig/20220204-105606-root.json
  • 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20165 and previous config saved to /var/cache/conftool/dbconfig/20220204-104102-root.json
  • 10:40 moritzm: rebalancing row A in ganeti/eqiad, all nodes of that row are now running Buster T296721
  • 10:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 10:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
  • 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
  • 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20164 and previous config saved to /var/cache/conftool/dbconfig/20220204-082010-marostegui.json
  • 07:18 elukey: `git checkout main.html` on miscweb1002:/srv/org/wikidata/query to avoid puppet corrective actions (and the host being listed in alarms)
  • 07:09 elukey: cleaned up wmf_auto_restart_prometheus-mysqld-exporter@analytics-meta on an-test-coord1001 and unmasked wmf_auto_restart_prometheus-mysqld-exporter (now used)
  • 07:03 elukey: clean up wmf_auto_restart_prometheus-mysqld-exporter@matomo on matomo1002 (not used anymore, listed as failed)
  • 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 schema change', diff saved to https://phabricator.wikimedia.org/P20163 and previous config saved to /var/cache/conftool/dbconfig/20220204-070003-marostegui.json
  • 06:00 legoktm: uploaded pygments 2.11.2 to apt.wm.o (T298399)
  • 02:48 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
  • 02:42 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts elastic2035.codfw.wmnet
  • 02:41 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
  • 01:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:04 brennen: for-real end of UTC late backport & config window
  • 01:04 brennen@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/Thanks/modules/ext.thanks.flowthank.js: Backport: Correct attribute for flow thanks (T300831) (duration: 00m 49s)
  • 00:50 brennen: reopening UTC late backport window for "Correct attribute for flow thanks" (T300831)
  • 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:12 cjming: end of UTC late backport & config window
  • 00:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 49s)
  • 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:10 cjming@deploy1002: Synchronized static/images/mobile/copyright/: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 53s)
  • 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-03

  • 23:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20159 and previous config saved to /var/cache/conftool/dbconfig/20220203-233447-marostegui.json
  • 23:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20158 and previous config saved to /var/cache/conftool/dbconfig/20220203-231942-marostegui.json
  • 23:15 ryankemper: T294805 Added a silence on alerts.wikimedia.org for `CirrusSearchJVMGCOldPoolFlatlined`
  • 23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20157 and previous config saved to /var/cache/conftool/dbconfig/20220203-230437-marostegui.json
  • 22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20156 and previous config saved to /var/cache/conftool/dbconfig/20220203-224933-marostegui.json
  • 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20155 and previous config saved to /var/cache/conftool/dbconfig/20220203-223923-marostegui.json
  • 22:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 22:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20154 and previous config saved to /var/cache/conftool/dbconfig/20220203-223916-marostegui.json
  • 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20153 and previous config saved to /var/cache/conftool/dbconfig/20220203-222411-marostegui.json
  • 22:18 ryankemper: T294805 Monitoring https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?orgId=1&var-cirrus_group=eqiad&var-cluster=elasticsearch&var-exported_cluster=production-search&var-smoothing=1&refresh=1m&from=now-3h&to=now as new hosts join the fleet
  • 22:18 ryankemper: T294805 Bringing in new eqiad hosts in batches of 4, with 15-20 mins between batches: `ryankemper@cumin1001:~$ sudo -E cumin -b 4 'elastic1*' 'sudo run-puppet-agent --force; sudo run-puppet-agent; sleep 900'` tmux session `es_eqiad`
  • 22:13 ryankemper: T294805 https://gerrit.wikimedia.org/r/c/operations/puppet/+/759617/ fixed the dependency issues, going to start bringing new hosts into service
  • 22:09 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20152 and previous config saved to /var/cache/conftool/dbconfig/20220203-220906-marostegui.json
  • 22:05 eileen: civicrm revision 7dcdc017 -> 04cbf35b
  • 22:04 volans@cumin2002: START - Cookbook sre.dns.netbox
  • 21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20150 and previous config saved to /var/cache/conftool/dbconfig/20220203-215402-marostegui.json
  • 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20149 and previous config saved to /var/cache/conftool/dbconfig/20220203-215154-marostegui.json
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20148 and previous config saved to /var/cache/conftool/dbconfig/20220203-215121-marostegui.json
  • 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20147 and previous config saved to /var/cache/conftool/dbconfig/20220203-213616-marostegui.json
  • 21:28 rzl: root@apt1001:/home/rzl# reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy # T300324
  • 21:27 rzl: root@apt1001:/home/rzl# reprepro copy stretch-wikimedia buster-wikimedia envoyproxy # T300324
  • 21:21 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759588; hoping this resolves dependency issues. Running puppet agent on `elastic1068`
  • 21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20145 and previous config saved to /var/cache/conftool/dbconfig/20220203-212111-marostegui.json
  • 21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20144 and previous config saved to /var/cache/conftool/dbconfig/20220203-210607-marostegui.json
  • 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20143 and previous config saved to /var/cache/conftool/dbconfig/20220203-210358-marostegui.json
  • 21:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 21:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20142 and previous config saved to /var/cache/conftool/dbconfig/20220203-210350-marostegui.json
  • 20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20140 and previous config saved to /var/cache/conftool/dbconfig/20220203-204846-marostegui.json
  • 20:43 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_recount_categories.service # T299823
  • 20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20139 and previous config saved to /var/cache/conftool/dbconfig/20220203-203341-marostegui.json
  • 20:26 ryankemper: T294805 Running puppet on `elastic1068` failed, looks like `/usr/share/elasticsearch/lib` wasn't there: https://phabricator.wikimedia.org/P20138
  • 20:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
  • 20:25 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
  • 20:22 ryankemper: T294805 Running puppet on single elastic host: `ryankemper@elastic1068:~$ sudo run-puppet-agent --force`
  • 20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20137 and previous config saved to /var/cache/conftool/dbconfig/20220203-201836-marostegui.json
  • 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20136 and previous config saved to /var/cache/conftool/dbconfig/20220203-201729-marostegui.json
  • 20:17 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759317 to activate roles for elastic eqiad replacement hosts
  • 20:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 20:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20135 and previous config saved to /var/cache/conftool/dbconfig/20220203-201721-marostegui.json
  • 20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:16 ryankemper: T294805 Disabled puppet on `elastic1*` in preparation for bringing new hosts into service: `ryankemper@cumin1001:~$ sudo cumin 'elastic1*' 'sudo disable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805"'`
  • 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1003.eqiad.wmnet with OS buster
  • 20:11 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.20 refs T293961
  • 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:08 mutante: planet1002/planet2002 - sudo systemctl start planet-update-en to manually start update after adding diff.wikimedia.org T230444
  • 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:07 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Drop skin override (T300814) (2/2) (duration: 00m 49s)
  • 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:06 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/skin.json: Backport: Drop skin override (T300814) (1/2) (duration: 00m 49s)
  • 20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS buster
  • 20:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20134 and previous config saved to /var/cache/conftool/dbconfig/20220203-200217-marostegui.json
  • 19:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20133 and previous config saved to /var/cache/conftool/dbconfig/20220203-194712-marostegui.json
  • 19:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
  • 19:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:41 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1003.eqiad.wmnet with OS buster
  • 19:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
  • 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:39 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1004.eqiad.wmnet with OS buster
  • 19:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
  • 19:34 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 00m 49s)
  • 19:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:33 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/ContentTranslation/modules/entrypoints/ext.cx.entrypoints.contributionsmenu.js: Backport: Update skin checks with new vector skin key. (T298916 T300814) (duration: 00m 50s)
  • 19:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
  • 19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20132 and previous config saved to /var/cache/conftool/dbconfig/20220203-193208-marostegui.json
  • 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:29 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/modules/ext.wikiEditor.js: Backport: New bucket for abtest data (T291308) (2/2) (duration: 00m 50s)
  • 19:28 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/includes/Hooks.php: Backport: New bucket for abtest data (T291308) (1/2) (duration: 00m 49s)
  • 19:27 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: Backport: New bucket for abtest data (T291308) (duration: 00m 50s)
  • 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:26 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add three domains to the wgCopyUploadsDomains allowlist (T299835 T300848) (duration: 00m 54s)
  • 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:42 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
  • 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20131 and previous config saved to /var/cache/conftool/dbconfig/20220203-183648-marostegui.json
  • 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20130 and previous config saved to /var/cache/conftool/dbconfig/20220203-183634-marostegui.json
  • 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20129 and previous config saved to /var/cache/conftool/dbconfig/20220203-182129-marostegui.json
  • 18:17 dancy: restarted php7.2-fpm processes on mediawiki12
  • 18:10 dancy: killed 8 spinning php7.2-fpm processes on mediawiki12
  • 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20128 and previous config saved to /var/cache/conftool/dbconfig/20220203-180624-marostegui.json
  • 17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20127 and previous config saved to /var/cache/conftool/dbconfig/20220203-175120-marostegui.json
  • 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20126 and previous config saved to /var/cache/conftool/dbconfig/20220203-174913-marostegui.json
  • 17:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20125 and previous config saved to /var/cache/conftool/dbconfig/20220203-174905-marostegui.json
  • 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20122 and previous config saved to /var/cache/conftool/dbconfig/20220203-173400-marostegui.json
  • 17:22 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
  • 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20120 and previous config saved to /var/cache/conftool/dbconfig/20220203-171856-marostegui.json
  • 17:13 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
  • 17:12 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
  • 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20118 and previous config saved to /var/cache/conftool/dbconfig/20220203-170351-marostegui.json
  • 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20117 and previous config saved to /var/cache/conftool/dbconfig/20220203-170144-marostegui.json
  • 17:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 17:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20116 and previous config saved to /var/cache/conftool/dbconfig/20220203-170136-marostegui.json
  • 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20115 and previous config saved to /var/cache/conftool/dbconfig/20220203-164632-marostegui.json
  • 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20114 and previous config saved to /var/cache/conftool/dbconfig/20220203-163127-marostegui.json
  • 16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20113 and previous config saved to /var/cache/conftool/dbconfig/20220203-162316-marostegui.json
  • 16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20111 and previous config saved to /var/cache/conftool/dbconfig/20220203-161622-marostegui.json
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20110 and previous config saved to /var/cache/conftool/dbconfig/20220203-161515-marostegui.json
  • 16:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20109 and previous config saved to /var/cache/conftool/dbconfig/20220203-161508-marostegui.json
  • 16:10 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:10 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20108 and previous config saved to /var/cache/conftool/dbconfig/20220203-160811-marostegui.json
  • 16:00 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20107 and previous config saved to /var/cache/conftool/dbconfig/20220203-160003-marostegui.json
  • 15:55 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts restbase2011.codfw.wmnet
  • 15:55 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
  • 15:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20106 and previous config saved to /var/cache/conftool/dbconfig/20220203-155306-marostegui.json
  • 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20105 and previous config saved to /var/cache/conftool/dbconfig/20220203-154458-marostegui.json
  • 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20104 and previous config saved to /var/cache/conftool/dbconfig/20220203-153801-marostegui.json
  • 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20103 and previous config saved to /var/cache/conftool/dbconfig/20220203-153653-marostegui.json
  • 15:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 15:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
  • 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20102 and previous config saved to /var/cache/conftool/dbconfig/20220203-153646-marostegui.json
  • 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 15:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20101 and previous config saved to /var/cache/conftool/dbconfig/20220203-152953-marostegui.json
  • 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20100 and previous config saved to /var/cache/conftool/dbconfig/20220203-152746-marostegui.json
  • 15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20099 and previous config saved to /var/cache/conftool/dbconfig/20220203-152739-marostegui.json
  • 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20098 and previous config saved to /var/cache/conftool/dbconfig/20220203-152141-marostegui.json
  • 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20097 and previous config saved to /var/cache/conftool/dbconfig/20220203-151234-marostegui.json
  • 15:12 moritzm: installing apache security updates on gerrit1001
  • 15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20096 and previous config saved to /var/cache/conftool/dbconfig/20220203-150636-marostegui.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20095 and previous config saved to /var/cache/conftool/dbconfig/20220203-145729-marostegui.json
  • 14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20094 and previous config saved to /var/cache/conftool/dbconfig/20220203-145132-marostegui.json
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20093 and previous config saved to /var/cache/conftool/dbconfig/20220203-145024-marostegui.json
  • 14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20092 and previous config saved to /var/cache/conftool/dbconfig/20220203-145017-marostegui.json
  • 14:44 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main'.
  • 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20091 and previous config saved to /var/cache/conftool/dbconfig/20220203-144224-marostegui.json
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20090 and previous config saved to /var/cache/conftool/dbconfig/20220203-144017-marostegui.json
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20089 and previous config saved to /var/cache/conftool/dbconfig/20220203-143544-marostegui.json
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20088 and previous config saved to /var/cache/conftool/dbconfig/20220203-143512-marostegui.json
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20087 and previous config saved to /var/cache/conftool/dbconfig/20220203-142039-marostegui.json
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20086 and previous config saved to /var/cache/conftool/dbconfig/20220203-142007-marostegui.json
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20085 and previous config saved to /var/cache/conftool/dbconfig/20220203-140534-marostegui.json
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20084 and previous config saved to /var/cache/conftool/dbconfig/20220203-140503-marostegui.json
  • 13:53 XioNoX: eqiad: push Capirca generated border-in filters
  • 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20083 and previous config saved to /var/cache/conftool/dbconfig/20220203-135029-marostegui.json
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20082 and previous config saved to /var/cache/conftool/dbconfig/20220203-134952-marostegui.json
  • 13:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 13:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20081 and previous config saved to /var/cache/conftool/dbconfig/20220203-134944-marostegui.json
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20080 and previous config saved to /var/cache/conftool/dbconfig/20220203-134746-marostegui.json
  • 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20079 and previous config saved to /var/cache/conftool/dbconfig/20220203-134739-marostegui.json
  • 13:44 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 jayme@cumin1001: START - Cookbook sre.dns.netbox
  • 13:35 jbond: disable puppet fleet wide for puppetdb restart
  • 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20078 and previous config saved to /var/cache/conftool/dbconfig/20220203-133439-marostegui.json
  • 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20077 and previous config saved to /var/cache/conftool/dbconfig/20220203-133234-marostegui.json
  • 13:28 marostegui: Test T300858
  • 13:28 moritzm: installing apache security updates
  • 13:27 jayme: moved kubernetes staging master,nodes,etcd from wikimedia_cluster "kubernetes" to "kubernetes-staging" - T273866
  • 13:27 XioNoX: esams: push Capirca generated border-in filters
  • 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20076 and previous config saved to /var/cache/conftool/dbconfig/20220203-131935-marostegui.json
  • 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20075 and previous config saved to /var/cache/conftool/dbconfig/20220203-131729-marostegui.json
  • 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20074 and previous config saved to /var/cache/conftool/dbconfig/20220203-130430-marostegui.json
  • 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20073 and previous config saved to /var/cache/conftool/dbconfig/20220203-130224-marostegui.json
  • 12:58 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 12:57 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20072 and previous config saved to /var/cache/conftool/dbconfig/20220203-125737-marostegui.json
  • 12:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20071 and previous config saved to /var/cache/conftool/dbconfig/20220203-125730-marostegui.json
  • 12:53 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
  • 12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
  • 12:52 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 12:51 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
  • 12:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 12:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main'.
  • 12:48 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
  • 12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 12:44 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 12:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 12:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 12:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20069 and previous config saved to /var/cache/conftool/dbconfig/20220203-124225-marostegui.json
  • 12:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:38 taavi: UTC morning backport window done
  • 12:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:33 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: mniwiktionary: Add localized mobile wordmark (T294709) (2/2) (duration: 00m 49s)
  • 12:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:32 taavi@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-mni.svg: Config: mniwiktionary: Add localized mobile wordmark (T294709) (1/2) (duration: 00m 50s)
  • 12:29 XioNoX: eqsin: push Capirca generated border-in filters
  • 12:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20068 and previous config saved to /var/cache/conftool/dbconfig/20220203-122720-marostegui.json
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20067 and previous config saved to /var/cache/conftool/dbconfig/20220203-122612-marostegui.json
  • 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
  • 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
  • 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20066 and previous config saved to /var/cache/conftool/dbconfig/20220203-122529-marostegui.json
  • 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:19 XioNoX: codfw: push Capirca generated border-in filters
  • 12:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:16 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add www.gbols.smns-bw.org to the wgCopyUploadsDomains allowlist (T300842) (duration: 00m 50s)
  • 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20065 and previous config saved to /var/cache/conftool/dbconfig/20220203-121216-marostegui.json
  • 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20064 and previous config saved to /var/cache/conftool/dbconfig/20220203-121024-marostegui.json
  • 12:10 XioNoX: eqord: push Capirca generated border-in filters
  • 12:09 mlitn@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Stop normalizing full text scores (T296631) (duration: 00m 52s)
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20063 and previous config saved to /var/cache/conftool/dbconfig/20220203-120832-marostegui.json
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20062 and previous config saved to /var/cache/conftool/dbconfig/20220203-120825-marostegui.json
  • 11:57 kart_: Updated cxserver to 2022-02-03-112745-production, this should unbreak Flores MT!
  • 11:57 XioNoX: ulsfo: push Capirca generated border-in filters
  • 11:55 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
  • 11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20061 and previous config saved to /var/cache/conftool/dbconfig/20220203-115519-marostegui.json
  • 11:53 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
  • 11:53 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20060 and previous config saved to /var/cache/conftool/dbconfig/20220203-115320-marostegui.json
  • 11:51 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
  • 11:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
  • 11:49 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
  • 11:47 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
  • 11:46 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
  • 11:46 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
  • 11:45 moritzm: installing openjdk-11 security updates
  • 11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20059 and previous config saved to /var/cache/conftool/dbconfig/20220203-114015-marostegui.json
  • 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20058 and previous config saved to /var/cache/conftool/dbconfig/20220203-113907-marostegui.json
  • 11:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 11:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20057 and previous config saved to /var/cache/conftool/dbconfig/20220203-113859-marostegui.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20056 and previous config saved to /var/cache/conftool/dbconfig/20220203-113815-marostegui.json
  • 11:36 arturo: reprepro changes @ apt1001 after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/758050
  • 11:33 moritzm: draining ganeti1020 for eventual reimage
  • 11:26 vgutierrez: rolling varnish-fe restart to catch the new listen_depth config value
  • 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20055 and previous config saved to /var/cache/conftool/dbconfig/20220203-112355-marostegui.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20054 and previous config saved to /var/cache/conftool/dbconfig/20220203-112311-marostegui.json
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20053 and previous config saved to /var/cache/conftool/dbconfig/20220203-111921-marostegui.json
  • 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20052 and previous config saved to /var/cache/conftool/dbconfig/20220203-111908-marostegui.json
  • 11:15 topranks: Adding BGP peering to lsw1-f1-eqiad on cr2-eqiad. T299758.
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20051 and previous config saved to /var/cache/conftool/dbconfig/20220203-110850-marostegui.json
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20050 and previous config saved to /var/cache/conftool/dbconfig/20220203-110403-marostegui.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20049 and previous config saved to /var/cache/conftool/dbconfig/20220203-105345-marostegui.json
  • 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20048 and previous config saved to /var/cache/conftool/dbconfig/20220203-105238-marostegui.json
  • 10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20047 and previous config saved to /var/cache/conftool/dbconfig/20220203-105230-marostegui.json
  • 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20046 and previous config saved to /var/cache/conftool/dbconfig/20220203-104858-marostegui.json
  • 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20045 and previous config saved to /var/cache/conftool/dbconfig/20220203-103725-marostegui.json
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20044 and previous config saved to /var/cache/conftool/dbconfig/20220203-103354-marostegui.json
  • 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20043 and previous config saved to /var/cache/conftool/dbconfig/20220203-103008-marostegui.json
  • 10:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 10:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20042 and previous config saved to /var/cache/conftool/dbconfig/20220203-103001-marostegui.json
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20041 and previous config saved to /var/cache/conftool/dbconfig/20220203-102221-marostegui.json
  • 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20040 and previous config saved to /var/cache/conftool/dbconfig/20220203-101456-marostegui.json
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1015.eqiad.wmnet
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1014.eqiad.wmnet
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20039 and previous config saved to /var/cache/conftool/dbconfig/20220203-100716-marostegui.json
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1013.eqiad.wmnet
  • 10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1012.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1010.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1015.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1014.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1013.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1012.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1011.eqiad.wmnet
  • 10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1010.eqiad.wmnet
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20038 and previous config saved to /var/cache/conftool/dbconfig/20220203-095952-marostegui.json
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20037 and previous config saved to /var/cache/conftool/dbconfig/20220203-095907-marostegui.json
  • 09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20036 and previous config saved to /var/cache/conftool/dbconfig/20220203-095859-marostegui.json
  • 09:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1183.eqiad.wmnet with OS bullseye
  • 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20034 and previous config saved to /var/cache/conftool/dbconfig/20220203-094447-marostegui.json
  • 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20033 and previous config saved to /var/cache/conftool/dbconfig/20220203-094354-marostegui.json
  • 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20032 and previous config saved to /var/cache/conftool/dbconfig/20220203-094107-marostegui.json
  • 09:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20031 and previous config saved to /var/cache/conftool/dbconfig/20220203-094059-marostegui.json
  • 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1183.eqiad.wmnet with OS bullseye
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20030 and previous config saved to /var/cache/conftool/dbconfig/20220203-092850-marostegui.json
  • 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20029 and previous config saved to /var/cache/conftool/dbconfig/20220203-092554-marostegui.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20028 and previous config saved to /var/cache/conftool/dbconfig/20220203-091345-marostegui.json
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20027 and previous config saved to /var/cache/conftool/dbconfig/20220203-091237-marostegui.json
  • 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20026 and previous config saved to /var/cache/conftool/dbconfig/20220203-091224-marostegui.json
  • 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20025 and previous config saved to /var/cache/conftool/dbconfig/20220203-091050-marostegui.json
  • 09:00 marostegui: Failover m2 from db1183 to db1159 - T300329
  • 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20024 and previous config saved to /var/cache/conftool/dbconfig/20220203-085720-marostegui.json
  • 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20023 and previous config saved to /var/cache/conftool/dbconfig/20220203-085545-marostegui.json
  • 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20022 and previous config saved to /var/cache/conftool/dbconfig/20220203-085159-marostegui.json
  • 08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 08:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20021 and previous config saved to /var/cache/conftool/dbconfig/20220203-085151-marostegui.json
  • 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20020 and previous config saved to /var/cache/conftool/dbconfig/20220203-084215-marostegui.json
  • 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20019 and previous config saved to /var/cache/conftool/dbconfig/20220203-083647-marostegui.json
  • 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20018 and previous config saved to /var/cache/conftool/dbconfig/20220203-082710-marostegui.json
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20017 and previous config saved to /var/cache/conftool/dbconfig/20220203-082302-marostegui.json
  • 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20016 and previous config saved to /var/cache/conftool/dbconfig/20220203-082249-marostegui.json
  • 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20015 and previous config saved to /var/cache/conftool/dbconfig/20220203-082142-marostegui.json
  • 08:10 dcausse: restarting blazegraph on wdqs1013 (JVM stuck for 5 hours)
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20014 and previous config saved to /var/cache/conftool/dbconfig/20220203-080745-marostegui.json
  • 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20013 and previous config saved to /var/cache/conftool/dbconfig/20220203-080637-marostegui.json
  • 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20012 and previous config saved to /var/cache/conftool/dbconfig/20220203-080254-marostegui.json
  • 08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 08:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20011 and previous config saved to /var/cache/conftool/dbconfig/20220203-080247-marostegui.json
  • 07:55 _joe_: restarted php-fpm on wtp1029, segfaulting
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20010 and previous config saved to /var/cache/conftool/dbconfig/20220203-075240-marostegui.json
  • 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20009 and previous config saved to /var/cache/conftool/dbconfig/20220203-074742-marostegui.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20008 and previous config saved to /var/cache/conftool/dbconfig/20220203-073735-marostegui.json
  • 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20007 and previous config saved to /var/cache/conftool/dbconfig/20220203-073237-marostegui.json
  • 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20006 and previous config saved to /var/cache/conftool/dbconfig/20220203-073129-marostegui.json
  • 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 07:23 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 T300329
  • 07:23 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 T300329
  • 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20005 and previous config saved to /var/cache/conftool/dbconfig/20220203-071732-marostegui.json
  • 07:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 07:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20004 and previous config saved to /var/cache/conftool/dbconfig/20220203-071348-marostegui.json
  • 07:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 07:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 07:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 07:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P20003 and previous config saved to /var/cache/conftool/dbconfig/20220203-071141-marostegui.json
  • 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P20002 and previous config saved to /var/cache/conftool/dbconfig/20220203-071111-marostegui.json
  • 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20001 and previous config saved to /var/cache/conftool/dbconfig/20220203-065636-marostegui.json
  • 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20000 and previous config saved to /var/cache/conftool/dbconfig/20220203-065606-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19999 and previous config saved to /var/cache/conftool/dbconfig/20220203-064131-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P19998 and previous config saved to /var/cache/conftool/dbconfig/20220203-064101-marostegui.json
  • 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P19997 and previous config saved to /var/cache/conftool/dbconfig/20220203-062627-marostegui.json
  • 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P19996 and previous config saved to /var/cache/conftool/dbconfig/20220203-062556-marostegui.json
  • 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P19995 and previous config saved to /var/cache/conftool/dbconfig/20220203-062243-marostegui.json
  • 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 06:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P19994 and previous config saved to /var/cache/conftool/dbconfig/20220203-061703-marostegui.json
  • 06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 06:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 01:12 brennen: UTC late backport window finished
  • 01:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti2029.codfw.wmnet with OS buster
  • 01:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:09 brennen@deploy1002: Finished scap: Backports: Changes the labels of the Vector skins (T299927) and Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 24m 48s)
  • 01:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:44 brennen@deploy1002: Started scap: Backports: Changes the labels of the Vector skins (T299927) and Pass skin name to Hooks::isSkinLegacy (T299971)
  • 00:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster

2022-02-02

  • 22:26 mutante: gitlab - introducing parameter to fetch TLS certs either with acmechief or certbot (if in cloud). Boolean $use_acmechief = lookup('profile::gitlab::use_acmechief'), confirmed noop in prod on gitlab1001.wikimedia.org (T297411)
  • 21:36 ejegg: updated CiviCRM from 2bd5fb5e to 7dcdc017
  • 20:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:04 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.20 refs T293961 (duration: 00m 49s)
  • 20:03 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.20 refs T293961
  • 19:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:49 dancy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: cowikimedia: Allow bureaucrats to remove sysop and bureaucrat flags (T300779) (duration: 00m 50s)
  • 19:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:42 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: multiversion: Improve error message if wikiversions.php has wrong format (duration: 00m 49s)
  • 19:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 62b2acb: Migration mode enabled everywhere (T299927) (duration: 00m 49s)
  • 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:27 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/SkinVector.php: bdc20dd: Fix the opt in URl (T300097) (duration: 00m 49s)
  • 19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:24 urbanecm@deploy1002: Synchronized wmf-config/: a48f8bd: Migrate calls of wmf* constants to wmg* constants (T45956) (duration: 00m 51s)
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19993 and previous config saved to /var/cache/conftool/dbconfig/20220202-191918-marostegui.json
  • 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:14 urbanecm@deploy1002: Synchronized multiversion/buildConfigCache.php: 83f1f6a: Consistently write to $wmgRealm the same value as to $wmfRealm (T45956) (duration: 00m 49s)
  • 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:10 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/{kywiki,kywiki-1.5x,kywiki-2x}.png (T300241)
  • 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:09 topranks: Running homer to enable interface et-1/0/2 on cr1-eqiad (towards lsw1-e1-eqiad) to test connectivity.
  • 19:09 urbanecm@deploy1002: Synchronized logos/config.yaml: 335cbee: kywiki: update logo (3/3; T300241) (duration: 00m 49s)
  • 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:08 urbanecm@deploy1002: Synchronized wmf-config/logos.php: 335cbee: kywiki: update logo (2/3; T300241) (duration: 00m 53s)
  • 19:07 urbanecm@deploy1002: Synchronized static/images/project-logos/: 335cbee: kywiki: update logo (1/3; T300241) (duration: 00m 50s)
  • 19:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19992 and previous config saved to /var/cache/conftool/dbconfig/20220202-190414-marostegui.json
  • 18:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19991 and previous config saved to /var/cache/conftool/dbconfig/20220202-184909-marostegui.json
  • 18:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19990 and previous config saved to /var/cache/conftool/dbconfig/20220202-183404-marostegui.json
  • 18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19989 and previous config saved to /var/cache/conftool/dbconfig/20220202-183034-marostegui.json
  • 18:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19988 and previous config saved to /var/cache/conftool/dbconfig/20220202-183027-marostegui.json
  • 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 18:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 18:16 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/filerepo/file/ForeignAPIFile.php: Backport: Revert "Support audio on filepage in InstantCommons" (T300751) (duration: 00m 51s)
  • 18:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19987 and previous config saved to /var/cache/conftool/dbconfig/20220202-181522-marostegui.json
  • 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19986 and previous config saved to /var/cache/conftool/dbconfig/20220202-180018-marostegui.json
  • 17:45 cwhite: end logstash upgrade (codfw) T299168
  • 17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19985 and previous config saved to /var/cache/conftool/dbconfig/20220202-174513-marostegui.json
  • 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19984 and previous config saved to /var/cache/conftool/dbconfig/20220202-174138-marostegui.json
  • 17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19983 and previous config saved to /var/cache/conftool/dbconfig/20220202-174125-marostegui.json
  • 17:32 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 17:26 cwhite: begin logstash upgrade (codfw) T299168
  • 17:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19982 and previous config saved to /var/cache/conftool/dbconfig/20220202-172620-marostegui.json
  • 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19981 and previous config saved to /var/cache/conftool/dbconfig/20220202-171115-marostegui.json
  • 16:59 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 16:59 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 16:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19979 and previous config saved to /var/cache/conftool/dbconfig/20220202-165611-marostegui.json
  • 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
  • 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
  • 16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
  • 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2012.codfw.wmnet
  • 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2011.codfw.wmnet
  • 16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2010.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2012.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2011.codfw.wmnet
  • 16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2010.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2005.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2006.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2007.codfw.wmnet
  • 16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2008.codfw.wmnet
  • 16:41 Emperor: standardising nginx weights for codfw swift proxies to match eqiad ones T300738
  • 16:41 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2009.codfw.wmnet
  • 16:41 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2009.codfw.wmnet
  • 16:39 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:38 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:30 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
  • 16:27 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:26 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
  • 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19977 and previous config saved to /var/cache/conftool/dbconfig/20220202-162435-marostegui.json
  • 16:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19976 and previous config saved to /var/cache/conftool/dbconfig/20220202-162428-marostegui.json
  • 16:24 jbond: disable ldap email checks on mx2001
  • 16:19 Emperor: rolling restart of swift frontends to bring new ones into service T300738
  • 16:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19975 and previous config saved to /var/cache/conftool/dbconfig/20220202-160923-marostegui.json
  • 15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19974 and previous config saved to /var/cache/conftool/dbconfig/20220202-155418-marostegui.json
  • 15:45 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:44 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:43 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:43 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:41 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:41 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19973 and previous config saved to /var/cache/conftool/dbconfig/20220202-153913-marostegui.json
  • 15:37 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 15:37 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:35 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:35 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:34 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19972 and previous config saved to /var/cache/conftool/dbconfig/20220202-153206-marostegui.json
  • 15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 15:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 15:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19970 and previous config saved to /var/cache/conftool/dbconfig/20220202-152552-marostegui.json
  • 15:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19969 and previous config saved to /var/cache/conftool/dbconfig/20220202-151047-marostegui.json
  • 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19968 and previous config saved to /var/cache/conftool/dbconfig/20220202-150832-root.json
  • 15:00 XioNoX: esams: push Capirca generated loopback filters
  • 14:59 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19967 and previous config saved to /var/cache/conftool/dbconfig/20220202-145542-marostegui.json
  • 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19966 and previous config saved to /var/cache/conftool/dbconfig/20220202-145329-root.json
  • 14:47 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 XioNoX: codfw: push Capirca generated loopback filters
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19965 and previous config saved to /var/cache/conftool/dbconfig/20220202-144038-marostegui.json
  • 14:39 jayme@cumin1001: START - Cookbook sre.dns.netbox
  • 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19963 and previous config saved to /var/cache/conftool/dbconfig/20220202-143825-root.json
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19962 and previous config saved to /var/cache/conftool/dbconfig/20220202-143221-marostegui.json
  • 14:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19961 and previous config saved to /var/cache/conftool/dbconfig/20220202-143214-marostegui.json
  • 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19960 and previous config saved to /var/cache/conftool/dbconfig/20220202-142321-root.json
  • 14:21 XioNoX: eqsin: push Capirca generated loopback filters
  • 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19959 and previous config saved to /var/cache/conftool/dbconfig/20220202-141709-marostegui.json
  • 14:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 14:15 XioNoX: cr2-eqdfw: push Capirca generated loopback filters
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es1020 - as it is the master', diff saved to https://phabricator.wikimedia.org/P19958 and previous config saved to /var/cache/conftool/dbconfig/20220202-141455-marostegui.json
  • 14:13 vgutierrez: pool cp1087 running envoy as TLS terminator - T271421
  • 14:09 XioNoX: cr2-eqord: push Capirca generated loopback filters
  • 14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19957 and previous config saved to /var/cache/conftool/dbconfig/20220202-140818-root.json
  • 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179 schema change', diff saved to https://phabricator.wikimedia.org/P19956 and previous config saved to /var/cache/conftool/dbconfig/20220202-140317-marostegui.json
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19955 and previous config saved to /var/cache/conftool/dbconfig/20220202-140239-root.json
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19954 and previous config saved to /var/cache/conftool/dbconfig/20220202-140204-marostegui.json
  • 13:50 elukey: move docker on ml-serve-ctrl* nodes from device mapper to overlay2
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19953 and previous config saved to /var/cache/conftool/dbconfig/20220202-134735-root.json
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19952 and previous config saved to /var/cache/conftool/dbconfig/20220202-134659-marostegui.json
  • 13:40 XioNoX: ULSFO routers: push Capirca generated loopback filters
  • 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19951 and previous config saved to /var/cache/conftool/dbconfig/20220202-133713-marostegui.json
  • 13:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 13:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 13:35 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
  • 13:34 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on production
  • 13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
  • 13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
  • 13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on production
  • 13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on canary
  • 13:32 ottomata: roll restarting eventgate-main to pick up stream-configs for rdf-streaming-updater.reconcile
  • 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19949 and previous config saved to /var/cache/conftool/dbconfig/20220202-133231-root.json
  • 13:31 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:31 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
  • 13:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 13:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 13:29 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
  • 13:28 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on canary
  • 13:28 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync on production
  • 13:25 XioNoX: rename cr3-ulsfo loopback terms in preparation of move to Capirca
  • 13:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19947 and previous config saved to /var/cache/conftool/dbconfig/20220202-132510-marostegui.json
  • 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19946 and previous config saved to /var/cache/conftool/dbconfig/20220202-131728-root.json
  • 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19945 and previous config saved to /var/cache/conftool/dbconfig/20220202-131006-marostegui.json
  • 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 13:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19944 and previous config saved to /var/cache/conftool/dbconfig/20220202-130224-root.json
  • 12:59 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: ULS: Remove unused ULSEventLogging variable (T275894) (duration: 00m 49s)
  • 12:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19942 and previous config saved to /var/cache/conftool/dbconfig/20220202-125500-marostegui.json
  • 12:54 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Clean-up decommissioned Print schema configs (T196159) (duration: 00m 50s)
  • 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19941 and previous config saved to /var/cache/conftool/dbconfig/20220202-125034-root.json
  • 12:43 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1087.eqiad.wmnet with OS buster
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T298558)', diff saved to https://phabricator.wikimedia.org/P19940 and previous config saved to /var/cache/conftool/dbconfig/20220202-124122-marostegui.json
  • 12:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 12:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19939 and previous config saved to /var/cache/conftool/dbconfig/20220202-124115-marostegui.json
  • 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19938 and previous config saved to /var/cache/conftool/dbconfig/20220202-123956-marostegui.json
  • 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19937 and previous config saved to /var/cache/conftool/dbconfig/20220202-123531-root.json
  • 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 12:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19936 and previous config saved to /var/cache/conftool/dbconfig/20220202-123127-marostegui.json
  • 12:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 12:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19934 and previous config saved to /var/cache/conftool/dbconfig/20220202-122610-marostegui.json
  • 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19933 and previous config saved to /var/cache/conftool/dbconfig/20220202-122112-marostegui.json
  • 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
  • 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 65%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19932 and previous config saved to /var/cache/conftool/dbconfig/20220202-122027-root.json
  • 12:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 12:11 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: prod: READ_NEW for CentralAuth hidden level migration (T289068) (duration: 00m 50s)
  • 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19930 and previous config saved to /var/cache/conftool/dbconfig/20220202-121105-marostegui.json
  • 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19929 and previous config saved to /var/cache/conftool/dbconfig/20220202-120608-marostegui.json
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19928 and previous config saved to /var/cache/conftool/dbconfig/20220202-120524-root.json
  • 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19927 and previous config saved to /var/cache/conftool/dbconfig/20220202-115601-marostegui.json
  • 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19926 and previous config saved to /var/cache/conftool/dbconfig/20220202-115103-marostegui.json
  • 11:50 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp1087.eqiad.wmnet with OS buster
  • 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19925 and previous config saved to /var/cache/conftool/dbconfig/20220202-115020-root.json
  • 11:48 vgutierrez: depool cp1087 to be reimaged as cache::text_envoy - T271421
  • 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19924 and previous config saved to /var/cache/conftool/dbconfig/20220202-114639-marostegui.json
  • 11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 11:45 _joe_: repooling thanos-fe1001 T300119
  • 11:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19923 and previous config saved to /var/cache/conftool/dbconfig/20220202-113558-marostegui.json
  • 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19922 and previous config saved to /var/cache/conftool/dbconfig/20220202-113516-root.json
  • 11:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19921 and previous config saved to /var/cache/conftool/dbconfig/20220202-113007-marostegui.json
  • 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19920 and previous config saved to /var/cache/conftool/dbconfig/20220202-112849-marostegui.json
  • 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:28 _joe_: depooling thanos-fe1001 for testing T300119
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 15%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19919 and previous config saved to /var/cache/conftool/dbconfig/20220202-112013-root.json
  • 11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19918 and previous config saved to /var/cache/conftool/dbconfig/20220202-111804-marostegui.json
  • 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19917 and previous config saved to /var/cache/conftool/dbconfig/20220202-111502-marostegui.json
  • 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19916 and previous config saved to /var/cache/conftool/dbconfig/20220202-110509-root.json
  • 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19915 and previous config saved to /var/cache/conftool/dbconfig/20220202-110259-marostegui.json
  • 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19914 and previous config saved to /var/cache/conftool/dbconfig/20220202-105957-marostegui.json
  • 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19913 and previous config saved to /var/cache/conftool/dbconfig/20220202-105006-root.json
  • 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19912 and previous config saved to /var/cache/conftool/dbconfig/20220202-104755-marostegui.json
  • 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19911 and previous config saved to /var/cache/conftool/dbconfig/20220202-104453-marostegui.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges and recentchanges groups from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19910 and previous config saved to /var/cache/conftool/dbconfig/20220202-103830-marostegui.json
  • 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 2%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19909 and previous config saved to /var/cache/conftool/dbconfig/20220202-103502-root.json
  • 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1021 after reimage', diff saved to https://phabricator.wikimedia.org/P19908 and previous config saved to /var/cache/conftool/dbconfig/20220202-103436-marostegui.json
  • 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 1%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19907 and previous config saved to /var/cache/conftool/dbconfig/20220202-103401-root.json
  • 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19906 and previous config saved to /var/cache/conftool/dbconfig/20220202-103250-marostegui.json
  • 10:28 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19905 and previous config saved to /var/cache/conftool/dbconfig/20220202-102717-marostegui.json
  • 10:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 10:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 10:23 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:22 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1008.eqiad.wmnet with OS buster
  • 10:21 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:21 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 10:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 10:11 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 10:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1021.eqiad.wmnet with OS bullseye
  • 10:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
  • 10:09 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
  • 10:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 10:06 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 04s)
  • 10:06 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 10:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 10:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1008.eqiad.wmnet with OS buster
  • 09:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1019.eqiad.wmnet with OS buster
  • 09:40 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bullseye
  • 09:39 moritzm: installing apache/apache-modsecurity2 security updates
  • 09:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 09:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19904 and previous config saved to /var/cache/conftool/dbconfig/20220202-093231-marostegui.json
  • 09:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 09:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19903 and previous config saved to /var/cache/conftool/dbconfig/20220202-093223-marostegui.json
  • 09:28 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19902 and previous config saved to /var/cache/conftool/dbconfig/20220202-091718-marostegui.json
  • 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS buster
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1021 T300127', diff saved to https://phabricator.wikimedia.org/P19901 and previous config saved to /var/cache/conftool/dbconfig/20220202-091355-marostegui.json
  • 09:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 09:10 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 09:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 09:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 09:08 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 09:08 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 09:07 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Enable writes on es4 T300127 (duration: 00m 50s)
  • 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19900 and previous config saved to /var/cache/conftool/dbconfig/20220202-090214-marostegui.json
  • 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1020 to es4 primary and set section read-write T300127', diff saved to https://phabricator.wikimedia.org/P19899 and previous config saved to /var/cache/conftool/dbconfig/20220202-090121-marostegui.json
  • 09:00 marostegui: Starting es4 eqiad failover from es1021 to es1020 - T300127
  • 08:52 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 08:52 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 08:48 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 08:48 root@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19898 and previous config saved to /var/cache/conftool/dbconfig/20220202-084709-marostegui.json
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19897 and previous config saved to /var/cache/conftool/dbconfig/20220202-084150-marostegui.json
  • 08:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19896 and previous config saved to /var/cache/conftool/dbconfig/20220202-084143-marostegui.json
  • 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19895 and previous config saved to /var/cache/conftool/dbconfig/20220202-082638-marostegui.json
  • 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19894 and previous config saved to /var/cache/conftool/dbconfig/20220202-081134-marostegui.json
  • 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19893 and previous config saved to /var/cache/conftool/dbconfig/20220202-075629-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19892 and previous config saved to /var/cache/conftool/dbconfig/20220202-075244-marostegui.json
  • 07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19891 and previous config saved to /var/cache/conftool/dbconfig/20220202-075236-marostegui.json
  • 07:51 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1) (duration: 04m 09s)
  • 07:47 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1)
  • 07:46 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 07:45 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 07:44 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (duration: 02m 19s)
  • 07:42 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard
  • 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Set es1020 with weight 10 T300127', diff saved to https://phabricator.wikimedia.org/P19890 and previous config saved to /var/cache/conftool/dbconfig/20220202-073918-root.json
  • 07:38 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 07:38 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switchover es4 T300127
  • 07:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19889 and previous config saved to /var/cache/conftool/dbconfig/20220202-073731-marostegui.json
  • 07:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 07:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 07:36 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Disable writes on es4 T300127 (duration: 00m 50s)
  • 07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 07:30 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Disable writes on es4 T300127 (duration: 00m 51s)
  • 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19888 and previous config saved to /var/cache/conftool/dbconfig/20220202-072227-marostegui.json
  • 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19887 and previous config saved to /var/cache/conftool/dbconfig/20220202-070722-marostegui.json
  • 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19886 and previous config saved to /var/cache/conftool/dbconfig/20220202-070012-marostegui.json
  • 07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 02:54 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 02:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS buster
  • 02:19 ejegg: updated CiviCRM from 0513f1b7 to 3d379e25
  • 01:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS buster
  • 01:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS buster
  • 01:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
  • 01:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
  • 01:12 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
  • 01:12 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
  • 01:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:03 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: rdf-streaming-updater: add the reconciliation stream (T279541) (duration: 00m 49s)
  • 00:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
  • 00:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:51 urbanecm: UTC late B&C window completed
  • 00:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b560843: Add wgUploadNavigationUrl upload page of ptwikinews (T300466) (duration: 00m 50s)
  • 00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS buster
  • 00:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:40 urbanecm@deploy1002: Synchronized docroot/noc/db.php: 06444c1: Start writing to some wmg* constants (T45956; 2/2) (duration: 00m 49s)
  • 00:39 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 06444c1: Start writing to some wmg* constants (T45956; 1/2) (duration: 00m 49s)
  • 00:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:29 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b2c13c6: Enable migration mode on all group 0, group 1 and desktop-improvement wikis (T299927) (duration: 01m 58s)
  • 00:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS buster
  • 00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-01

  • 22:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
  • 22:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2002-dev.codfw.wmnet with OS bullseye
  • 22:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
  • 22:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ml-serve2005.codfw.wmnet with OS buster
  • 22:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
  • 21:55 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2002-dev.codfw.wmnet with OS bullseye
  • 21:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:14 Lucas_WMDE: Deployed patch for T297754
  • 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:42 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.17 (duration: 01m 35s)
  • 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:38 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.20 refs T293961
  • 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19884 and previous config saved to /var/cache/conftool/dbconfig/20220201-202806-marostegui.json
  • 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 20:21 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.18 (duration: 04m 08s)
  • 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 20:20 ejegg: updated payments-wiki from 933e8669 to dbcb5254
  • 20:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P19882 and previous config saved to /var/cache/conftool/dbconfig/20220201-201259-marostegui.json
  • 20:12 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.20 refs T293961 (duration: 51m 42s)
  • 20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P19881 and previous config saved to /var/cache/conftool/dbconfig/20220201-195755-marostegui.json
  • 19:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
  • 19:55 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e] (hadoop-test): Hotfix analytics weekly train TEST [analytics/refinery@6a7983e] (duration: 05m 51s)
  • 19:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:49 joal@deploy1002: Started deploy [analytics/refinery@6a7983e] (hadoop-test): Hotfix analytics weekly train TEST [analytics/refinery@6a7983e]
  • 19:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19880 and previous config saved to /var/cache/conftool/dbconfig/20220201-194250-marostegui.json
  • 19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19879 and previous config saved to /var/cache/conftool/dbconfig/20220201-194144-marostegui.json
  • 19:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19878 and previous config saved to /var/cache/conftool/dbconfig/20220201-194136-marostegui.json
  • 19:40 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e] (thin): Hotfix analytics weekly train THIN [analytics/refinery@6a7983e] (duration: 00m 07s)
  • 19:40 joal@deploy1002: Started deploy [analytics/refinery@6a7983e] (thin): Hotfix analytics weekly train THIN [analytics/refinery@6a7983e]
  • 19:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P19877 and previous config saved to /var/cache/conftool/dbconfig/20220201-192632-marostegui.json
  • 19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:22 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e]: Hotfix analytics weekly train [analytics/refinery@6a7983e] (duration: 19m 09s)
  • 19:20 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.20 refs T293961
  • 19:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS buster
  • 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 19:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P19876 and previous config saved to /var/cache/conftool/dbconfig/20220201-191127-marostegui.json
  • 19:02 joal@deploy1002: Started deploy [analytics/refinery@6a7983e]: Hotfix analytics weekly train [analytics/refinery@6a7983e]
  • 18:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19875 and previous config saved to /var/cache/conftool/dbconfig/20220201-185622-marostegui.json
  • 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19874 and previous config saved to /var/cache/conftool/dbconfig/20220201-185516-marostegui.json
  • 18:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19873 and previous config saved to /var/cache/conftool/dbconfig/20220201-185507-marostegui.json
  • 18:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS buster
  • 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS buster
  • 18:40 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19872 and previous config saved to /var/cache/conftool/dbconfig/20220201-184027-root.json
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P19871 and previous config saved to /var/cache/conftool/dbconfig/20220201-184002-marostegui.json
  • 18:38 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
  • 18:38 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply on canary
  • 18:38 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply on production
  • 18:36 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
  • 18:35 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
  • 18:33 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply on canary
  • 18:33 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply on production
  • 18:30 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
  • 18:29 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply on canary
  • 18:29 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply on production
  • 18:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19870 and previous config saved to /var/cache/conftool/dbconfig/20220201-182523-root.json
  • 18:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS buster
  • 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P19869 and previous config saved to /var/cache/conftool/dbconfig/20220201-182458-marostegui.json
  • 18:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-staging2001.codfw.wmnet with OS buster
  • 18:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 60%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19868 and previous config saved to /var/cache/conftool/dbconfig/20220201-181019-root.json
  • 18:10 cwhite: end logstash upgrade (eqiad) T299168
  • 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19867 and previous config saved to /var/cache/conftool/dbconfig/20220201-180953-marostegui.json
  • 18:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19866 and previous config saved to /var/cache/conftool/dbconfig/20220201-180847-marostegui.json
  • 18:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 18:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 18:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19865 and previous config saved to /var/cache/conftool/dbconfig/20220201-180839-marostegui.json
  • 18:04 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2017.wmnet
  • 18:03 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2017.codfw.wmnet with OS buster
  • 17:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS buster
  • 17:57 urbanecm@deploy1002: Synchronized wmf-config/config/amiwiki.yaml: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (3/3) (duration: 00m 49s)
  • 17:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 17:56 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (2/3) (duration: 00m 50s)
  • 17:55 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (1/3) (duration: 00m 51s)
  • 17:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19864 and previous config saved to /var/cache/conftool/dbconfig/20220201-175516-root.json
  • 17:54 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c24f002] (duration: 05m 41s)
  • 17:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 17:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P19863 and previous config saved to /var/cache/conftool/dbconfig/20220201-175334-marostegui.json
  • 17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 17:52 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php amiwiki
  • 17:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php amiwiki growthexperiments
  • 17:49 btullis@deploy1002: Started deploy [analytics/refinery@c24f002] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c24f002]
  • 17:48 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002] (thin): Regular analytics weekly train THIN [analytics/refinery@c24f002] (duration: 00m 07s)
  • 17:48 btullis@deploy1002: Started deploy [analytics/refinery@c24f002] (thin): Regular analytics weekly train THIN [analytics/refinery@c24f002]
  • 17:47 cwhite: begin logstash upgrade (eqiad) T299168
  • 17:42 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002]: Regular analytics weekly train [analytics/refinery@c24f002] (duration: 11m 29s)
  • 17:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 40%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19862 and previous config saved to /var/cache/conftool/dbconfig/20220201-174012-root.json
  • 17:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P19861 and previous config saved to /var/cache/conftool/dbconfig/20220201-173830-marostegui.json
  • 17:30 btullis@deploy1002: Started deploy [analytics/refinery@c24f002]: Regular analytics weekly train [analytics/refinery@c24f002]
  • 17:29 btullis: about to deploy analytics/refinery
  • 17:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 17:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19860 and previous config saved to /var/cache/conftool/dbconfig/20220201-172509-root.json
  • 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19859 and previous config saved to /var/cache/conftool/dbconfig/20220201-172325-marostegui.json
  • 17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19858 and previous config saved to /var/cache/conftool/dbconfig/20220201-172219-marostegui.json
  • 17:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19857 and previous config saved to /var/cache/conftool/dbconfig/20220201-172205-marostegui.json
  • 17:21 vgutierrez: pool cp2039 running envoy as TLS terminator - T271421
  • 17:17 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2017.codfw.wmnet with OS buster
  • 17:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 20%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19856 and previous config saved to /var/cache/conftool/dbconfig/20220201-171005-root.json
  • 17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P19855 and previous config saved to /var/cache/conftool/dbconfig/20220201-170701-marostegui.json
  • 16:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2039.codfw.wmnet with OS buster
  • 16:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19854 and previous config saved to /var/cache/conftool/dbconfig/20220201-165501-root.json
  • 16:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P19852 and previous config saved to /var/cache/conftool/dbconfig/20220201-165156-marostegui.json
  • 16:51 papaul: rebooting pfw3a-codfw and pfw3b for JUNOS upgrade
  • 16:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:49 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 16:43 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2008.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19851 and previous config saved to /var/cache/conftool/dbconfig/20220201-163958-root.json
  • 16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19850 and previous config saved to /var/cache/conftool/dbconfig/20220201-163651-marostegui.json
  • 16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19849 and previous config saved to /var/cache/conftool/dbconfig/20220201-163545-marostegui.json
  • 16:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19848 and previous config saved to /var/cache/conftool/dbconfig/20220201-163537-marostegui.json
  • 16:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 1%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19847 and previous config saved to /var/cache/conftool/dbconfig/20220201-162454-root.json
  • 16:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P19846 and previous config saved to /var/cache/conftool/dbconfig/20220201-162033-marostegui.json
  • 16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19845 and previous config saved to /var/cache/conftool/dbconfig/20220201-161353-marostegui.json
  • 16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 16:12 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp2039.codfw.wmnet with OS buster
  • 16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:10 vgutierrez: depool cp2039 to be reimaged as cache::text_envoy - T271421
  • 16:09 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
  • 16:09 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P19844 and previous config saved to /var/cache/conftool/dbconfig/20220201-160528-marostegui.json
  • 16:05 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 10s)
  • 16:04 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2007.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19843 and previous config saved to /var/cache/conftool/dbconfig/20220201-155023-marostegui.json
  • 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19842 and previous config saved to /var/cache/conftool/dbconfig/20220201-154716-marostegui.json
  • 15:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19841 and previous config saved to /var/cache/conftool/dbconfig/20220201-154709-marostegui.json
  • 15:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1010.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 15:34 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
  • 15:34 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19840 and previous config saved to /var/cache/conftool/dbconfig/20220201-153204-marostegui.json
  • 15:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 15:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2006.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19839 and previous config saved to /var/cache/conftool/dbconfig/20220201-152323-marostegui.json
  • 15:22 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
  • 15:22 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
  • 15:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1010.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
  • 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1010.eqiad.wmnet
  • 15:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19838 and previous config saved to /var/cache/conftool/dbconfig/20220201-151700-marostegui.json
  • 15:13 kart_: Deployed Flores MT for cxserver + Updated cxserver to 2022-01-13-174407-production (T298584, T292412, T292415, T298679, T298752) + Updated cxserver to 2022-02-01-141918-production (T298592)
  • 15:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1010.eqiad.wmnet
  • 15:10 jelto: update scap to 4.2.2 on all hosts - T300392
  • 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P19837 and previous config saved to /var/cache/conftool/dbconfig/20220201-150818-marostegui.json
  • 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1016.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 15:07 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1016.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 15:05 mmandere@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum6002.drmrs.wmnet
  • 15:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19836 and previous config saved to /var/cache/conftool/dbconfig/20220201-150155-marostegui.json
  • 15:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
  • 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19835 and previous config saved to /var/cache/conftool/dbconfig/20220201-150049-marostegui.json
  • 15:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19834 and previous config saved to /var/cache/conftool/dbconfig/20220201-150041-marostegui.json
  • 14:59 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
  • 14:59 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
  • 14:58 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
  • 14:56 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
  • 14:56 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
  • 14:53 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
  • 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P19833 and previous config saved to /var/cache/conftool/dbconfig/20220201-145314-marostegui.json
  • 14:52 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
  • 14:52 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
  • 14:52 mmandere@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum6002.drmrs.wmnet
  • 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P19832 and previous config saved to /var/cache/conftool/dbconfig/20220201-144536-marostegui.json
  • 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19831 and previous config saved to /var/cache/conftool/dbconfig/20220201-143809-marostegui.json
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19830 and previous config saved to /var/cache/conftool/dbconfig/20220201-143504-marostegui.json
  • 14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19829 and previous config saved to /var/cache/conftool/dbconfig/20220201-143456-marostegui.json
  • 14:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P19828 and previous config saved to /var/cache/conftool/dbconfig/20220201-143031-marostegui.json
  • 14:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2005.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P19827 and previous config saved to /var/cache/conftool/dbconfig/20220201-141952-marostegui.json
  • 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19826 and previous config saved to /var/cache/conftool/dbconfig/20220201-141527-marostegui.json
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19825 and previous config saved to /var/cache/conftool/dbconfig/20220201-141420-marostegui.json
  • 14:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19824 and previous config saved to /var/cache/conftool/dbconfig/20220201-141413-marostegui.json
  • 14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P19823 and previous config saved to /var/cache/conftool/dbconfig/20220201-140447-marostegui.json
  • 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P19822 and previous config saved to /var/cache/conftool/dbconfig/20220201-135908-marostegui.json
  • 13:54 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 13:54 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 13:52 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
  • 13:50 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 13:50 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
  • 13:50 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
  • 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19821 and previous config saved to /var/cache/conftool/dbconfig/20220201-134942-marostegui.json
  • 13:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
  • 13:48 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
  • 13:48 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
  • 13:47 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19820 and previous config saved to /var/cache/conftool/dbconfig/20220201-134740-marostegui.json
  • 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:47 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
  • 13:47 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
  • 13:47 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
  • 13:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19819 and previous config saved to /var/cache/conftool/dbconfig/20220201-134524-marostegui.json
  • 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P19818 and previous config saved to /var/cache/conftool/dbconfig/20220201-134403-marostegui.json
  • 13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
  • 13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 13:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 13:41 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
  • 13:41 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
  • 13:41 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
  • 13:41 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
  • 13:38 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 13:32 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P19817 and previous config saved to /var/cache/conftool/dbconfig/20220201-133020-marostegui.json
  • 13:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19816 and previous config saved to /var/cache/conftool/dbconfig/20220201-132858-marostegui.json
  • 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19815 and previous config saved to /var/cache/conftool/dbconfig/20220201-132652-marostegui.json
  • 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
  • 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
  • 13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19814 and previous config saved to /var/cache/conftool/dbconfig/20220201-132624-marostegui.json
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P19813 and previous config saved to /var/cache/conftool/dbconfig/20220201-131515-marostegui.json
  • 13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19812 and previous config saved to /var/cache/conftool/dbconfig/20220201-131119-marostegui.json
  • 13:09 hashar: Restarting CI Jenkins
  • 13:09 hashar: Restarting Gerrit
  • 13:01 hashar: Restarted Jenkins on releases1002.eqiad.wmnet
  • 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19810 and previous config saved to /var/cache/conftool/dbconfig/20220201-130010-marostegui.json
  • 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19809 and previous config saved to /var/cache/conftool/dbconfig/20220201-125805-marostegui.json
  • 12:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19808 and previous config saved to /var/cache/conftool/dbconfig/20220201-125615-marostegui.json
  • 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 12:56 marostegui: Set innodb_adaptive_hash_index=OFF on: db1129 es1029 es1030 es1028 es1020 es1023 T268869
  • 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19807 and previous config saved to /var/cache/conftool/dbconfig/20220201-125605-marostegui.json
  • 12:52 mmandere@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum6001.drmrs.wmnet
  • 12:42 mmandere@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum6001.drmrs.wmnet
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19806 and previous config saved to /var/cache/conftool/dbconfig/20220201-124110-marostegui.json
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P19805 and previous config saved to /var/cache/conftool/dbconfig/20220201-124100-marostegui.json
  • 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19804 and previous config saved to /var/cache/conftool/dbconfig/20220201-124004-marostegui.json
  • 12:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:39 moritzm: installing openjdk-11 security updates
  • 12:31 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
  • 12:30 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
  • 12:30 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
  • 12:30 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
  • 12:30 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
  • 12:29 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
  • 12:29 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
  • 12:28 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
  • 12:28 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
  • 12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P19803 and previous config saved to /var/cache/conftool/dbconfig/20220201-122556-marostegui.json
  • 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19802 and previous config saved to /var/cache/conftool/dbconfig/20220201-121051-marostegui.json
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19801 and previous config saved to /var/cache/conftool/dbconfig/20220201-120847-marostegui.json
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19800 and previous config saved to /var/cache/conftool/dbconfig/20220201-120839-marostegui.json
  • 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19799 and previous config saved to /var/cache/conftool/dbconfig/20220201-115923-marostegui.json
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P19798 and previous config saved to /var/cache/conftool/dbconfig/20220201-115334-marostegui.json
  • 11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19797 and previous config saved to /var/cache/conftool/dbconfig/20220201-114418-marostegui.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P19796 and previous config saved to /var/cache/conftool/dbconfig/20220201-113830-marostegui.json
  • 11:31 elukey: roll restart ORES to pick up logging change (use XFF header when possible) - T299137
  • 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19795 and previous config saved to /var/cache/conftool/dbconfig/20220201-112913-marostegui.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19794 and previous config saved to /var/cache/conftool/dbconfig/20220201-112325-marostegui.json
  • 11:19 hnowlan: roll-restarting maps services in eqiad for updates
  • 11:17 hnowlan: roll-restarting maps services in codfw for updates
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19793 and previous config saved to /var/cache/conftool/dbconfig/20220201-111420-marostegui.json
  • 11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19792 and previous config saved to /var/cache/conftool/dbconfig/20220201-111413-marostegui.json
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19791 and previous config saved to /var/cache/conftool/dbconfig/20220201-111409-marostegui.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19790 and previous config saved to /var/cache/conftool/dbconfig/20220201-110855-marostegui.json
  • 11:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19789 and previous config saved to /var/cache/conftool/dbconfig/20220201-110848-marostegui.json
  • 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P19788 and previous config saved to /var/cache/conftool/dbconfig/20220201-105906-marostegui.json
  • 10:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 10:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 10:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2105.codfw.wmnet with OS bullseye
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19787 and previous config saved to /var/cache/conftool/dbconfig/20220201-105343-marostegui.json
  • 10:53 Lucas_WMDE: Deployed patch for T297754
  • 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P19786 and previous config saved to /var/cache/conftool/dbconfig/20220201-104402-marostegui.json
  • 10:41 vgutierrez: restart ATS-TLS on cp3058
  • 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s4 codfw T263127', diff saved to https://phabricator.wikimedia.org/P19785 and previous config saved to /var/cache/conftool/dbconfig/20220201-104118-marostegui.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19784 and previous config saved to /var/cache/conftool/dbconfig/20220201-103838-marostegui.json
  • 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19783 and previous config saved to /var/cache/conftool/dbconfig/20220201-102857-marostegui.json
  • 10:25 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19782 and previous config saved to /var/cache/conftool/dbconfig/20220201-102512-marostegui.json
  • 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1010.eqiad.wmnet with OS buster
  • 10:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2105.codfw.wmnet with OS bullseye
  • 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bumeh-ctr out of all services on: 5 hosts
  • 10:24 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Bumeh-ctr out of all services on: 5 hosts
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19781 and previous config saved to /var/cache/conftool/dbconfig/20220201-102356-marostegui.json
  • 10:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19780 and previous config saved to /var/cache/conftool/dbconfig/20220201-102333-marostegui.json
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19779 and previous config saved to /var/cache/conftool/dbconfig/20220201-102300-marostegui.json
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19778 and previous config saved to /var/cache/conftool/dbconfig/20220201-102221-marostegui.json
  • 10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19777 and previous config saved to /var/cache/conftool/dbconfig/20220201-102207-marostegui.json
  • 10:14 vgutierrez: pool cp3062 running envoy as TLS terminator - T271421
  • 10:10 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
  • 10:10 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
  • 10:08 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P19775 and previous config saved to /var/cache/conftool/dbconfig/20220201-100756-marostegui.json
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19774 and previous config saved to /var/cache/conftool/dbconfig/20220201-100703-marostegui.json
  • 10:05 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
  • 10:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
  • 10:01 ayounsi@cumin1001: START - Cookbook sre.ganeti.makevm for new host netflow6001.drmrs.wmnet
  • 10:01 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3062.esams.wmnet with OS buster
  • 10:01 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19773 and previous config saved to /var/cache/conftool/dbconfig/20220201-100052-root.json
  • 10:00 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
  • 10:00 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
  • 09:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1010.eqiad.wmnet with OS buster
  • 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P19772 and previous config saved to /var/cache/conftool/dbconfig/20220201-095251-marostegui.json
  • 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19771 and previous config saved to /var/cache/conftool/dbconfig/20220201-095158-marostegui.json
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19770 and previous config saved to /var/cache/conftool/dbconfig/20220201-094548-root.json
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19769 and previous config saved to /var/cache/conftool/dbconfig/20220201-093747-marostegui.json
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19768 and previous config saved to /var/cache/conftool/dbconfig/20220201-093717-marostegui.json
  • 09:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19767 and previous config saved to /var/cache/conftool/dbconfig/20220201-093709-marostegui.json
  • 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19766 and previous config saved to /var/cache/conftool/dbconfig/20220201-093653-marostegui.json
  • 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19765 and previous config saved to /var/cache/conftool/dbconfig/20220201-093044-root.json
  • 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P19764 and previous config saved to /var/cache/conftool/dbconfig/20220201-092204-marostegui.json
  • 09:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2127.codfw.wmnet with OS bullseye
  • 09:20 moritzm: installing apache/apache-modsecurity2 security updates
  • 09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS bullseye
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19763 and previous config saved to /var/cache/conftool/dbconfig/20220201-091541-marostegui.json
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: repooling', diff saved to https://phabricator.wikimedia.org/P19762 and previous config saved to /var/cache/conftool/dbconfig/20220201-091541-root.json
  • 09:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19761 and previous config saved to /var/cache/conftool/dbconfig/20220201-091534-marostegui.json
  • 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P19760 and previous config saved to /var/cache/conftool/dbconfig/20220201-090700-marostegui.json
  • 09:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp3062.esams.wmnet with OS buster
  • 09:02 mmandere: apt1001 Delete unused stretch and buster dist libvarnishapi1 package T300264
  • 09:01 vgutierrez: depool cp3062 to be reimaged as cache::text_envoy - T271421
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 5%: repooling', diff saved to https://phabricator.wikimedia.org/P19759 and previous config saved to /var/cache/conftool/dbconfig/20220201-090031-root.json
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19758 and previous config saved to /var/cache/conftool/dbconfig/20220201-090029-marostegui.json
  • 08:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1100.eqiad.wmnet with OS bullseye
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19757 and previous config saved to /var/cache/conftool/dbconfig/20220201-085155-marostegui.json
  • 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19756 and previous config saved to /var/cache/conftool/dbconfig/20220201-085040-marostegui.json
  • 08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19755 and previous config saved to /var/cache/conftool/dbconfig/20220201-084956-marostegui.json
  • 08:46 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2127.codfw.wmnet with OS bullseye
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19754 and previous config saved to /var/cache/conftool/dbconfig/20220201-084524-marostegui.json
  • 08:43 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS bullseye
  • 08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2074.codfw.wmnet with OS bullseye
  • 08:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2109.codfw.wmnet with OS bullseye
  • 08:38 moritzm: draining ganeti1016 for eventual reimage
  • 08:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P19753 and previous config saved to /var/cache/conftool/dbconfig/20220201-083452-marostegui.json
  • 08:33 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1100.eqiad.wmnet with OS bullseye
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19752 and previous config saved to /var/cache/conftool/dbconfig/20220201-083020-marostegui.json
  • 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19751 and previous config saved to /var/cache/conftool/dbconfig/20220201-082906-marostegui.json
  • 08:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19750 and previous config saved to /var/cache/conftool/dbconfig/20220201-082825-marostegui.json
  • 08:28 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1100.eqiad.wmnet with OS bullseye
  • 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1008.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 08:23 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1008.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
  • 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P19749 and previous config saved to /var/cache/conftool/dbconfig/20220201-081947-marostegui.json
  • 08:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1100.eqiad.wmnet with OS bullseye
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19748 and previous config saved to /var/cache/conftool/dbconfig/20220201-081321-marostegui.json
  • 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1100 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19747 and previous config saved to /var/cache/conftool/dbconfig/20220201-081050-marostegui.json
  • 08:07 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2109.codfw.wmnet with OS bullseye
  • 08:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2074.codfw.wmnet with OS bullseye
  • 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: repooling', diff saved to https://phabricator.wikimedia.org/P19746 and previous config saved to /var/cache/conftool/dbconfig/20220201-080449-root.json
  • 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19745 and previous config saved to /var/cache/conftool/dbconfig/20220201-080442-marostegui.json
  • 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19744 and previous config saved to /var/cache/conftool/dbconfig/20220201-080328-marostegui.json
  • 08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19743 and previous config saved to /var/cache/conftool/dbconfig/20220201-080315-marostegui.json
  • 08:01 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1003.eqiad.wmnet
  • 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19742 and previous config saved to /var/cache/conftool/dbconfig/20220201-075816-marostegui.json
  • 07:56 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1005.eqiad.wmnet
  • 07:56 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1005.eqiad.wmnet
  • 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19741 and previous config saved to /var/cache/conftool/dbconfig/20220201-074945-root.json
  • 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P19740 and previous config saved to /var/cache/conftool/dbconfig/20220201-074810-marostegui.json
  • 07:47 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1005.eqiad.wmnet
  • 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19739 and previous config saved to /var/cache/conftool/dbconfig/20220201-074311-marostegui.json
  • 07:39 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1005.eqiad.wmnet
  • 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19738 and previous config saved to /var/cache/conftool/dbconfig/20220201-073441-root.json
  • 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P19737 and previous config saved to /var/cache/conftool/dbconfig/20220201-073306-marostegui.json
  • 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19736 and previous config saved to /var/cache/conftool/dbconfig/20220201-073256-marostegui.json
  • 07:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 07:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19735 and previous config saved to /var/cache/conftool/dbconfig/20220201-073248-marostegui.json
  • 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19734 and previous config saved to /var/cache/conftool/dbconfig/20220201-071938-root.json
  • 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19733 and previous config saved to /var/cache/conftool/dbconfig/20220201-071801-marostegui.json
  • 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19732 and previous config saved to /var/cache/conftool/dbconfig/20220201-071743-marostegui.json
  • 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19731 and previous config saved to /var/cache/conftool/dbconfig/20220201-071648-marostegui.json
  • 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 07:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19730 and previous config saved to /var/cache/conftool/dbconfig/20220201-071640-marostegui.json
  • 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: repooling', diff saved to https://phabricator.wikimedia.org/P19729 and previous config saved to /var/cache/conftool/dbconfig/20220201-070434-root.json
  • 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19728 and previous config saved to /var/cache/conftool/dbconfig/20220201-070239-marostegui.json
  • 07:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P19727 and previous config saved to /var/cache/conftool/dbconfig/20220201-070135-marostegui.json
  • 06:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host db1110.eqiad.wmnet with OS bullseye
  • 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 5%: repooling', diff saved to https://phabricator.wikimedia.org/P19726 and previous config saved to /var/cache/conftool/dbconfig/20220201-064930-root.json
  • 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19725 and previous config saved to /var/cache/conftool/dbconfig/20220201-064734-marostegui.json
  • 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P19724 and previous config saved to /var/cache/conftool/dbconfig/20220201-064631-marostegui.json
  • 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19723 and previous config saved to /var/cache/conftool/dbconfig/20220201-064620-marostegui.json
  • 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19722 and previous config saved to /var/cache/conftool/dbconfig/20220201-064549-marostegui.json
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: repooling', diff saved to https://phabricator.wikimedia.org/P19721 and previous config saved to /var/cache/conftool/dbconfig/20220201-064149-root.json
  • 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19720 and previous config saved to /var/cache/conftool/dbconfig/20220201-063126-marostegui.json
  • 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19719 and previous config saved to /var/cache/conftool/dbconfig/20220201-063044-marostegui.json
  • 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19718 and previous config saved to /var/cache/conftool/dbconfig/20220201-063013-marostegui.json
  • 06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
  • 06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19717 and previous config saved to /var/cache/conftool/dbconfig/20220201-062646-root.json
  • 06:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1110.eqiad.wmnet with OS bullseye
  • 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19716 and previous config saved to /var/cache/conftool/dbconfig/20220201-062111-marostegui.json
  • 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19715 and previous config saved to /var/cache/conftool/dbconfig/20220201-061540-marostegui.json
  • 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19714 and previous config saved to /var/cache/conftool/dbconfig/20220201-061142-root.json
  • 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19713 and previous config saved to /var/cache/conftool/dbconfig/20220201-060035-marostegui.json
  • 05:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19712 and previous config saved to /var/cache/conftool/dbconfig/20220201-055921-marostegui.json
  • 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 05:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19711 and previous config saved to /var/cache/conftool/dbconfig/20220201-055638-root.json
  • 05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T298558)', diff saved to https://phabricator.wikimedia.org/P19710 and previous config saved to /var/cache/conftool/dbconfig/20220201-055327-marostegui.json
  • 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 05:08 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 03:37 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 03:36 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 02:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:18 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
  • 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 01:48 ryankemper: T282117 Merged https://gerrit.wikimedia.org/r/c/operations/dns/+/717606 and successfully ran `sudo -i authdns-update` on `authdns1001`. `commons-query.wikimedia.org` is online now. (sidenote: go-live date of service is 2022-02-01)
  • 01:42 ryankemper: T299222 `ryankemper@cumin1001:~$ sudo cumin 'wcqs*' 'sudo rm -fv /etc/default/wcqs-updater'`
  • 01:42 ryankemper: T299222 `ryankemper@cumin1001:~$ sudo cumin 'wdqs*' 'sudo rm -fv /etc/default/wdqs-updater'`
  • 01:25 ryankemper: T299222 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/757124; running puppet on `w*qs*` before purging old filepaths
  • 00:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:24 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable Local upload on ptwikinews (T300466) (duration: 00m 50s)
  • 00:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:18 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good
  • 00:11 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Lower The Wikipedia Library extension edit count (T288070) (duration: 00m 50s)
  • 00:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
  • 00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
  • 00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

Archives

See Server Admin Log/Archives.