Server Admin Log: Difference between revisions

imported>Stashbot
(cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0))
imported>Stashbot
(zabe@deploy1002: Finished scap: Backport for Start reading from rev_comment_id in group1 wikis (T299954) (duration: 08m 00s))
 
(512 intermediate revisions by 3 users not shown)
== 2021-11-12 ==
* 21:00 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:57 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:09 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 18:08 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 17:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 17:35 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 17:33 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 17:15 ottomata: restarting and arming keyholder on deploy1002 - [[phab:T295380|T295380]]
* 17:02 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 16:59 otto@deploy1002: Finished deploy [airflow-dags/analytics@093f067] (hadoop-test): (no justification provided) (duration: 00m 04s)
* 16:59 otto@deploy1002: Started deploy [airflow-dags/analytics@093f067] (hadoop-test): (no justification provided)
* 16:52 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 16:38 otto@deploy1002: Finished deploy [airflow-dags/analytics@093f067] (hadoop-test): (no justification provided) (duration: 01m 12s)
* 16:36 otto@deploy1002: Started deploy [airflow-dags/analytics@093f067] (hadoop-test): (no justification provided)
* 16:15 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:11 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 14:38 moritzm: installing 5.10.70 kernels on bullseye systems (just the update, no coordinated reboot)
* 11:05 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2100.codfw.wmnet with OS buster
* 10:47 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host db2100.codfw.wmnet with OS buster
* 10:46 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:45 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 10:42 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 10:41 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 10:35 ema: A:cp re-enable puppet after successful testing of https://gerrit.wikimedia.org/r/c/operations/puppet/+/737424 on cp4027 [[phab:T293879|T293879]]
* 10:25 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 10:17 ema: A:cp disable-puppet to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/737424 on cp4027 [[phab:T293879|T293879]]
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 100%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17736 and previous config saved to /var/cache/conftool/dbconfig/20211112-084813-root.json
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 75%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17735 and previous config saved to /var/cache/conftool/dbconfig/20211112-083310-root.json
* 08:27 moritzm: imported openjdk-8 8u312-b07-1~deb11u1 to component/jdk8 for bullseye-wikimedia (rebuild of latest Java 8 security release for Bullseye)
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 50%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17734 and previous config saved to /var/cache/conftool/dbconfig/20211112-081806-root.json
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 40%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17733 and previous config saved to /var/cache/conftool/dbconfig/20211112-080302-root.json
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 25%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17732 and previous config saved to /var/cache/conftool/dbconfig/20211112-074759-root.json
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 20%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17731 and previous config saved to /var/cache/conftool/dbconfig/20211112-073255-root.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 10%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17730 and previous config saved to /var/cache/conftool/dbconfig/20211112-071752-root.json
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Add weight for db1104', diff saved to https://phabricator.wikimedia.org/P17729 and previous config saved to /var/cache/conftool/dbconfig/20211112-070236-marostegui.json
* 07:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 5%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P17728 and previous config saved to /var/cache/conftool/dbconfig/20211112-070141-root.json
* 00:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:15 tgr: UTC late deploys done
* 00:14 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:738284{{!}}Enable GrowthExperiments image recommendations on eswiki (T294878)]] (duration: 00m 56s)
== 2023-05-30 ==
* 23:38 zabe@deploy1002: Finished scap: Backport for [[gerrit:924564{{!}}Start reading from rev_comment_id in group1 wikis (T299954)]] (duration: 08m 00s)
* 23:31 zabe@deploy1002: zabe: Backport for [[gerrit:924564{{!}}Start reading from rev_comment_id in group1 wikis (T299954)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 23:30 zabe@deploy1002: Started scap: Backport for [[gerrit:924564{{!}}Start reading from rev_comment_id in group1 wikis (T299954)]]
* 22:22 ejegg: civicrm upgraded from {{Gerrit|415aa7e5}} to {{Gerrit|5905a403}}
* 21:56 samtar@deploy1002: Finished scap: Backport for [[gerrit:924570{{!}}linker: Check for null parser in Linker::makeThumbLink2 (T337794)]] (duration: 07m 48s)
* 21:50 samtar@deploy1002: jforrester and samtar: Backport for [[gerrit:924570{{!}}linker: Check for null parser in Linker::makeThumbLink2 (T337794)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 21:48 samtar@deploy1002: Started scap: Backport for [[gerrit:924570{{!}}linker: Check for null parser in Linker::makeThumbLink2 (T337794)]]
* 20:58 ladsgroup@deploy1002: ladsgroup: Backport for [[gerrit:924569{{!}}Add WANCache to ParserOutputPageProperties::finalize (T336698)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:57 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:924569{{!}}Add WANCache to ParserOutputPageProperties::finalize (T336698)]]
* 20:40 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:924568{{!}}Add WANCache to ParserOutputPageProperties::finalize (T336698)]] (duration: 09m 27s)
* 20:32 ladsgroup@deploy1002: ladsgroup: Backport for [[gerrit:924568{{!}}Add WANCache to ParserOutputPageProperties::finalize (T336698)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:30 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:924568{{!}}Add WANCache to ParserOutputPageProperties::finalize (T336698)]]
* 20:12 inflatador: bking@wdqs2009 depool wdqs2009 until it catches up with lag
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:924536{{!}}Turn on A/B Test Hebrew (T336969)]] (duration: 08m 46s)
* 20:03 samtar@deploy1002: ksarabia and samtar: Backport for [[gerrit:924536{{!}}Turn on A/B Test Hebrew (T336969)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:01 samtar@deploy1002: Started scap: Backport for [[gerrit:924536{{!}}Turn on A/B Test Hebrew (T336969)]]
* 19:48 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics@cd667c2]: Deploy Iceberg version of referrer_daily on analytics Airflow instance. [[phab:T335305|T335305]]. (duration: 00m 09s)
* 19:48 xcollazo@deploy1002: Started deploy [airflow-dags/analytics@cd667c2]: Deploy Iceberg version of referrer_daily on analytics Airflow instance. [[phab:T335305|T335305]].
* 19:36 bking@deploy1002: Finished deploy [wdqs/wdqs@dff41b7]: 0.3.124 (duration: 04m 02s)
* 19:32 bking@deploy1002: Started deploy [wdqs/wdqs@dff41b7]: 0.3.124
* 19:29 bking@deploy1002: Finished deploy [wdqs/wdqs@dff41b7]: 0.3.124 (duration: 00m 54s)
* 19:29 bking@deploy1002: Started deploy [wdqs/wdqs@dff41b7]: 0.3.124
* 19:29 bking@deploy1002: Finished deploy [wdqs/wdqs@dff41b7]: 0.3.124 (duration: 16m 36s)
* 19:24 ryankemper@puppetmaster1001: conftool action : set/weight=0:pooled=inactive; selector: name=wdqs2021.*
* 19:12 inflatador: [WDQS Deploy] Deploying version 0.3.124
* 19:11 bking@deploy1002: Started deploy [wdqs/wdqs@dff41b7]: 0.3.124
* 18:27 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.11 refs [[phab:T337525|T337525]]
* 17:45 mutante: re-enabling puppet on contint2001
* 16:20 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_growthexperiments-userImpactUpdateRecentlyEdited
* 16:19 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_growthexperiments-userImpactUpdateRecentlyRegistered
* 16:14 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:924053{{!}}[Growth] Enable user impact refresh on 10 more wikis (T336203)]] (duration: 07m 08s)
* 16:07 urbanecm@deploy1002: Started scap: Backport for [[gerrit:924053{{!}}[Growth] Enable user impact refresh on 10 more wikis (T336203)]]
* 16:00 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:00 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:58 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:58 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:57 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:56 otto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:56 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:55 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:54 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:54 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:54 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:53 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:51 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2002.codfw.wmnet with OS bullseye
* 15:51 otto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:51 otto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:49 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 15:15 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet with OS bullseye
* 15:15 aborrero@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
* 15:14 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - aborrero@cumin2002"
* 15:10 tgr_: UTC evening deploys done
* 15:08 tgr@deploy1002: Finished scap: Backport for [[gerrit:924160{{!}}ve.ui.MWGalleryDialog: Fix showing the search panel (T337638)]], [[gerrit:924456{{!}}Hide 'editnotice-notext' message in VE (and mobile apps) (T337633)]], [[gerrit:924458{{!}}ve.ui.MWGalleryDialog: Fix showing the search panel (T337638)]] (duration: 08m 08s)
* 15:05 bking@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 15:03 bking@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 15:02 tgr@deploy1002: tgr and matmarex: Backport for [[gerrit:924160{{!}}ve.ui.MWGalleryDialog: Fix showing the search panel (T337638)]], [[gerrit:924456{{!}}Hide 'editnotice-notext' message in VE (and mobile apps) (T337633)]], [[gerrit:924458{{!}}ve.ui.MWGalleryDialog: Fix showing the search panel (T337638)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 15:00 tgr@deploy1002: Started scap: Backport for [[gerrit:924160{{!}}ve.ui.MWGalleryDialog: Fix showing the search panel (T337638)]], [[gerrit:924456{{!}}Hide 'editnotice-notext' message in VE (and mobile apps) (T337633)]], [[gerrit:924458{{!}}ve.ui.MWGalleryDialog: Fix showing the search panel (T337638)]]
* 14:50 moritzm: installing texlive-bin security updates
* 14:49 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2002.codfw.wmnet with reason: host reimage
* 14:46 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2002.codfw.wmnet with reason: host reimage
* 14:36 tgr@deploy1002: Finished scap: Backport for [[gerrit:924159{{!}}Hide 'editnotice-notext' message in VE (and mobile apps) (T337633)]] (duration: 08m 01s)
* 14:29 tgr@deploy1002: matmarex and tgr: Backport for [[gerrit:924159{{!}}Hide 'editnotice-notext' message in VE (and mobile apps) (T337633)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 14:28 herron@cumin1001: START - Cookbook sre.hosts.reimage for host mwlog2002.codfw.wmnet with OS bullseye
* 14:27 tgr@deploy1002: Started scap: Backport for [[gerrit:924159{{!}}Hide 'editnotice-notext' message in VE (and mobile apps) (T337633)]]
* 14:16 herron@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2002.codfw.wmnet with OS bullseye
* 14:16 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetdb1003.eqiad.wmnet with OS bookworm
* 14:14 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 14:13 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 14:08 bking@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 14:06 bking@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 14:06 moritzm: installing libwebp security updates
* 14:06 tgr@deploy1002: Finished scap: Backport for [[gerrit:924158{{!}}editpage: Change the order of hooks slightly for FlaggedRevs (T337637)]] (duration: 08m 14s)
* 13:59 tgr@deploy1002: tgr and matmarex: Backport for [[gerrit:924158{{!}}editpage: Change the order of hooks slightly for FlaggedRevs (T337637)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:58 tgr@deploy1002: Started scap: Backport for [[gerrit:924158{{!}}editpage: Change the order of hooks slightly for FlaggedRevs (T337637)]]
* 13:57 tgr@deploy1002: Finished scap: Backport for [[gerrit:924488{{!}}prod: Remove $wgCampaignEventsEnableMultipleOrganizers (T334088)]] (duration: 16m 13s)
* 13:56 mvernon@cumin2002: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2009.codfw.wmnet
* 13:55 mvernon@cumin2002: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
* 13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2009.codfw.wmnet with OS bullseye
* 13:42 tgr@deploy1002: tgr and daimona: Backport for [[gerrit:924488{{!}}prod: Remove $wgCampaignEventsEnableMultipleOrganizers (T334088)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:40 tgr@deploy1002: Started scap: Backport for [[gerrit:924488{{!}}prod: Remove $wgCampaignEventsEnableMultipleOrganizers (T334088)]]
* 13:33 mlitn@deploy1002: Finished scap: Backport for [[gerrit:924454{{!}}Fix maxJobs default]], [[gerrit:924455{{!}}Fix maxJobs default]] (duration: 07m 39s)
* 13:27 mlitn@deploy1002: mlitn: Backport for [[gerrit:924454{{!}}Fix maxJobs default]], [[gerrit:924455{{!}}Fix maxJobs default]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:25 mlitn@deploy1002: Started scap: Backport for [[gerrit:924454{{!}}Fix maxJobs default]], [[gerrit:924455{{!}}Fix maxJobs default]]
* 13:20 tgr@deploy1002: Finished scap: Backport for [[gerrit:924079{{!}}GrowthExperiments: Re-add $wgGERestbaseUrl]] (duration: 09m 26s)
* 13:13 tgr@deploy1002: tgr: Backport for [[gerrit:924079{{!}}GrowthExperiments: Re-add $wgGERestbaseUrl]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:11 herron@cumin1001: START - Cookbook sre.hosts.reimage for host mwlog2002.codfw.wmnet with OS bullseye
* 13:11 tgr@deploy1002: Started scap: Backport for [[gerrit:924079{{!}}GrowthExperiments: Re-add $wgGERestbaseUrl]]
* 13:09 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2009.codfw.wmnet with reason: host reimage
* 13:09 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 13:09 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 13:08 bblack: lvs1018: restart pybal for wikireplicas monitoring removal
* 13:08 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 13:06 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 13:06 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2009.codfw.wmnet with reason: host reimage
* 13:06 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 13:04 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 13:03 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2005-dev.codfw.wmnet with reason: host reimage
* 13:00 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2005-dev.codfw.wmnet with reason: host reimage
* 12:51 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 12:51 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe2009.codfw.wmnet with OS bullseye
* 12:39 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 12:39 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 12:29 volans: disabling puppet where cadvisor is present
* 12:14 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol2005-dev.codfw.wmnet with OS bullseye
* 11:51 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
* 11:51 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:51 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:51 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for moved cloudcontrol2005-dev - cmooney@cumin1001"
* 11:50 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for moved cloudcontrol2005-dev - cmooney@cumin1001"
* 11:50 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 11:47 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 11:46 slyngshede@cumin1001: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
* 11:45 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on puppetboard2003.codfw.wmnet,puppetboard1003.eqiad.wmnet with reason: building_systems
* 11:45 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on puppetboard2003.codfw.wmnet,puppetboard1003.eqiad.wmnet with reason: building_systems
* 11:41 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 11:41 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 11:21 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 11:21 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 11:14 hashar@deploy1002: Finished deploy [gerrit/gerrit@6deabc9]: wm-checks-api: add support for DUCT - [[phab:T331651|T331651]] (duration: 00m 08s)
* 11:14 hashar@deploy1002: Started deploy [gerrit/gerrit@6deabc9]: wm-checks-api: add support for DUCT - [[phab:T331651|T331651]]
* 11:07 slyngshede@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
* 11:07 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
* 11:00 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:57 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:56 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 10:56 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 10:53 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 10:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:50 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:50 slyngshede@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 10:41 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:41 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:11 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetboard2003.codfw.wmnet with OS bookworm
* 10:11 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host puppetboard1003.eqiad.wmnet with OS bookworm
* 10:00 zabe@deploy1002: Finished scap: Backport for [[gerrit:924469{{!}}Start reading from rev_comment_id in group0 wikis (T299954)]] (duration: 08m 12s)
* 09:59 slyngshede@cumin1001: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
* 09:58 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 09:57 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
* 09:54 zabe@deploy1002: zabe: Backport for [[gerrit:924469{{!}}Start reading from rev_comment_id in group0 wikis (T299954)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 09:52 zabe@deploy1002: Started scap: Backport for [[gerrit:924469{{!}}Start reading from rev_comment_id in group0 wikis (T299954)]]
* 09:52 zabe@deploy1002: Finished scap: Backport for [[gerrit:923635{{!}}Check for null when using ::getCheckUserHelperFieldset (T337599)]] (duration: 09m 52s)
* 09:49 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetboard2003.codfw.wmnet with reason: host reimage
* 09:46 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetboard2003.codfw.wmnet with reason: host reimage
* 09:43 zabe@deploy1002: zabe: Backport for [[gerrit:923635{{!}}Check for null when using ::getCheckUserHelperFieldset (T337599)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 09:43 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 09:43 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 09:43 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:43 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host puppetdb1003.eqiad.wmnet with OS bookworm
* 09:42 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:42 zabe@deploy1002: Started scap: Backport for [[gerrit:923635{{!}}Check for null when using ::getCheckUserHelperFieldset (T337599)]]
* 09:40 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 09:40 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 09:37 zabe@deploy1002: Finished scap: Backport for [[gerrit:922492{{!}}Start reading from rev_comment_id in test wikis (T299954)]] (duration: 07m 48s)
* 09:34 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetboard1003.eqiad.wmnet with reason: host reimage
* 09:33 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=2) for new host testvm2006.codfw.wmnet
* 09:33 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 09:33 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 09:33 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:33 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:32 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:31 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetboard1003.eqiad.wmnet with reason: host reimage
* 09:30 zabe@deploy1002: zabe: Backport for [[gerrit:922492{{!}}Start reading from rev_comment_id in test wikis (T299954)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 09:30 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 09:30 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetdb2003.codfw.wmnet with OS bookworm
* 09:29 zabe@deploy1002: Started scap: Backport for [[gerrit:922492{{!}}Start reading from rev_comment_id in test wikis (T299954)]]
* 09:27 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:24 tgr@deploy1002: Finished scap: Backport for [[gerrit:924361{{!}}Improve handling of missing image recommendation]] (duration: 08m 57s)
* 09:22 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetboard2003.codfw.wmnet with OS bookworm
* 09:20 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetboard1003.eqiad.wmnet with OS bookworm
* 09:19 arturo: run aborrero@cumin1001:~ 2s 98 $ sudo cumin "P<nowiki>{</nowiki>R:Profile::Mariadb::Section = 's7'<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:wmcs::db::wikireplicas::mariadb_multiinstance<nowiki>}</nowiki>" "/usr/local/sbin/maintain-meta_p --all-databases --bootstrap"
* 09:17 tgr@deploy1002: tgr: Backport for [[gerrit:924361{{!}}Improve handling of missing image recommendation]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 09:15 tgr@deploy1002: Started scap: Backport for [[gerrit:924361{{!}}Improve handling of missing image recommendation]]
* 09:14 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 09:14 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 09:14 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:13 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 09:11 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 09:11 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 09:06 tgr@deploy1002: Finished scap: Backport for [[gerrit:923644{{!}}Section images: Do not treat unexpected kinds as production errors]] (duration: 14m 22s)
* 09:00 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2006.codfw.wmnet
* 09:00 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 09:00 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 09:00 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:00 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:59 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:54 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:53 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:53 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:53 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:53 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:53 tgr@deploy1002: tgr: Backport for [[gerrit:923644{{!}}Section images: Do not treat unexpected kinds as production errors]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 08:52 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:51 tgr@deploy1002: Started scap: Backport for [[gerrit:923644{{!}}Section images: Do not treat unexpected kinds as production errors]]
* 08:50 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:50 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 08:49 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2006.codfw.wmnet
* 08:49 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:49 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:49 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:49 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:48 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:44 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:44 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:44 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:44 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:44 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:43 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:41 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:41 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 08:39 tgr@deploy1002: Finished scap: Backport for [[gerrit:923643{{!}}Improve logging of invalid image recommendation kinds]] (duration: 10m 30s)
* 08:39 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=2) for new host testvm2006.codfw.wmnet
* 08:39 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:39 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:39 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:39 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:38 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:36 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:36 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:35 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:35 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:34 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:34 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:34 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:33 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:31 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:31 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 08:30 tgr@deploy1002: tgr: Backport for [[gerrit:923643{{!}}Improve logging of invalid image recommendation kinds]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 08:29 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2006.codfw.wmnet
* 08:29 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:29 tgr@deploy1002: Started scap: Backport for [[gerrit:923643{{!}}Improve logging of invalid image recommendation kinds]]
* 08:29 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:28 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:27 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:27 jayme: re-enable puppet on P:kubernetes::node for https://gerrit.wikimedia.org/r/c/operations/puppet/+/909687
* 08:25 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:20 jayme: disable puppet on P:kubernetes::node (apart from staging-codfw) for https://gerrit.wikimedia.org/r/c/operations/puppet/+/909687
* 08:15 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 08:15 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 08:15 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:14 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 08:12 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 08:12 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 08:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb2003.codfw.wmnet with reason: host reimage
* 08:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb2003.codfw.wmnet with reason: host reimage
* 08:08 tgr@deploy1002: Finished scap: Backport for [[gerrit:924356{{!}}Section images: Accept more recommendation types]] (duration: 07m 51s)
* 08:01 tgr@deploy1002: tgr: Backport for [[gerrit:924356{{!}}Section images: Accept more recommendation types]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 08:00 tgr@deploy1002: Started scap: Backport for [[gerrit:924356{{!}}Section images: Accept more recommendation types]]
* 07:56 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:924086{{!}}Revert "Rename wgPageContentLanguage to wgPageViewLanguage" partially (T337634)]] (duration: 09m 17s)
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host puppetdb2003.codfw.wmnet with OS bookworm
* 07:48 ladsgroup@deploy1002: func and ladsgroup: Backport for [[gerrit:924086{{!}}Revert "Rename wgPageContentLanguage to wgPageViewLanguage" partially (T337634)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 07:46 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:924086{{!}}Revert "Rename wgPageContentLanguage to wgPageViewLanguage" partially (T337634)]]
* 07:45 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2006.codfw.wmnet
* 07:45 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:45 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:45 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:45 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48633 and previous config saved to /var/cache/conftool/dbconfig/20230530-074445-root.json
* 07:44 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:42 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 07:41 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:41 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:41 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:41 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:40 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:38 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 07:38 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 07:31 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=2) for new host testvm2006.codfw.wmnet
* 07:31 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:31 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:31 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:31 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:30 moritzm: move LDAP permissions for hghani from cn=nda to cn=wmf [[phab:T322145|T322145]]
* 07:30 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48632 and previous config saved to /var/cache/conftool/dbconfig/20230530-072941-root.json
* 07:29 kartik@deploy1002: Finished scap: Backport for [[gerrit:924050{{!}}testwiki: Enable Section Translation for 9 Wikipedia (T337290)]] (duration: 09m 38s)
* 07:28 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 07:28 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:27 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:21 kartik@deploy1002: kartik: Backport for [[gerrit:924050{{!}}testwiki: Enable Section Translation for 9 Wikipedia (T337290)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 07:19 kartik@deploy1002: Started scap: Backport for [[gerrit:924050{{!}}testwiki: Enable Section Translation for 9 Wikipedia (T337290)]]
* 07:17 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:17 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:17 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:17 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:16 kartik@deploy1002: Finished scap: Backport for [[gerrit:923527{{!}}Undeploy Special:Contribute from unsupported skins (T337366)]] (duration: 11m 49s)
* 07:16 moritzm: update bookworm installer to rc4 [[phab:T330495|T330495]]
* 07:16 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48630 and previous config saved to /var/cache/conftool/dbconfig/20230530-071436-root.json
* 07:10 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 07:10 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 07:10 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2006.codfw.wmnet
* 07:10 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:10 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:10 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:10 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:09 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:07 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 07:07 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:07 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:07 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:07 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:06 kartik@deploy1002: kartik: Backport for [[gerrit:923527{{!}}Undeploy Special:Contribute from unsupported skins (T337366)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 07:06 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:04 kartik@deploy1002: Started scap: Backport for [[gerrit:923527{{!}}Undeploy Special:Contribute from unsupported skins (T337366)]]
* 07:04 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 07:03 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 07:02 slyngshede@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=2) for new host testvm2006.codfw.wmnet
* 07:02 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 07:02 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 07:02 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:02 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 07:01 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48629 and previous config saved to /var/cache/conftool/dbconfig/20230530-065932-root.json
* 06:58 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 06:58 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 06:57 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 06:51 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
* 06:51 slyngshede@cumin1001: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
* 06:51 slyngshede@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:51 slyngshede@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 06:50 slyngshede@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - slyngshede@cumin1001"
* 06:48 slyngshede@cumin1001: START - Cookbook sre.dns.netbox
* 06:48 slyngshede@cumin1001: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48628 and previous config saved to /var/cache/conftool/dbconfig/20230530-064427-root.json
* 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48625 and previous config saved to /var/cache/conftool/dbconfig/20230530-062922-root.json
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48624 and previous config saved to /var/cache/conftool/dbconfig/20230530-061417-root.json
* 05:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48623 and previous config saved to /var/cache/conftool/dbconfig/20230530-055913-root.json
* 05:43 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'configure' for AS: 62597
* 05:43 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 62597
* 05:41 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nray out of all services on: 1255 hosts
* 05:40 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Nray out of all services on: 1255 hosts
* 05:40 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nray out of all services on: 784 hosts
* 05:40 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Nray out of all services on: 784 hosts
* 05:28 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hxi-ctr out of all services on: 784 hosts
* 05:27 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Hxi-ctr out of all services on: 784 hosts
* 05:26 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hxi-ctr out of all services on: 1255 hosts
* 05:25 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Hxi-ctr out of all services on: 1255 hosts
* 05:22 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 62597
* 05:17 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 62597
* 04:28 kart_: Updated cxserver to 2023-05-29-112644-production ([[phab:T337657|T337657]])
* 04:28 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 04:27 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 04:24 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 04:24 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 04:21 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 04:20 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 03:54 mwpresync@deploy1002: Pruned MediaWiki: 1.41.0-wmf.9 (duration: 02m 10s)
* 03:52 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.41.0-wmf.11  refs [[phab:T337525|T337525]] (duration: 49m 54s)
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.41.0-wmf.11  refs [[phab:T337525|T337525]]


== 2021-11-11 ==
== 2023-05-29 ==
* 16:56 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 15:19 eoghan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: This is being worked on
* 16:30 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 15:19 eoghan@cumin1001: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: This is being worked on
* 16:28 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 14:18 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on stat1009.eqiad.wmnet with reason: Bringing stat1009 into service
* 16:28 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 14:18 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on stat1009.eqiad.wmnet with reason: Bringing stat1009 into service
* 16:26 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 13:57 vgutierrez@puppetmaster1001: conftool action : set/weight=10; selector: name=dbproxy.*,dc=eqiad
* 16:26 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 11:25 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:26 jynus@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1139.eqiad.wmnet with OS buster
* 11:24 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:15 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6001.drmrs.wmnet with OS buster
* 11:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48618 and previous config saved to /var/cache/conftool/dbconfig/20230529-112242-root.json
* 16:12 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 11:13 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:49 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS buster
* 11:13 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:44 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6001.drmrs.wmnet with OS buster
* 11:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48617 and previous config saved to /var/cache/conftool/dbconfig/20230529-110737-root.json
* 15:18 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS buster
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48616 and previous config saved to /var/cache/conftool/dbconfig/20230529-105233-root.json
* 15:16 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48615 and previous config saved to /var/cache/conftool/dbconfig/20230529-103728-root.json
* 14:59 moritzm: installing krb5 security updates on buster/bullseye (client-side libs/tools only, KDCs already fixed)
* 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48614 and previous config saved to /var/cache/conftool/dbconfig/20230529-102223-root.json
* 14:55 moritzm: installing PHP 7.0 security updates
* 10:07 vgutierrez: restarting pybal on lvs1018
* 14:52 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48612 and previous config saved to /var/cache/conftool/dbconfig/20230529-100719-root.json
* 14:50 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - btullis@cumin1001
* 10:05 oblivian@deploy1002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
* 14:46 moritzm: installing sqlalchemy security updates on stretch
* 10:05 oblivian@deploy1002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:05 oblivian@deploy1002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
* 14:41 moritzm: installing libxstream-java security updates
* 10:05 oblivian@deploy1002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
* 14:38 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - btullis@cumin1001
* 10:04 oblivian@deploy1002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
* 14:33 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 10:04 oblivian@deploy1002: helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync
* 14:32 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 10:03 oblivian@deploy1002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
* 14:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:03 oblivian@deploy1002: helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync
* 14:21 volans: uploaded python3-wmflib_1.0.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 10:03 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 14:15 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 10:02 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 14:12 jynus@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1139.eqiad.wmnet with OS buster
* 10:00 vgutierrez: restarting pybal on lvs1020
* 14:10 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2100.codfw.wmnet with OS buster
* 09:59 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 14:05 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 09:58 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 13:59 moritzm: installing bind9 security updates (only client-side-tools/libs)
* 09:56 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 13:48 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1139.eqiad.wmnet with OS buster
* 09:55 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 13:45 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host db2100.codfw.wmnet with OS buster
* 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48611 and previous config saved to /var/cache/conftool/dbconfig/20230529-095214-root.json
* 13:38 root@cumin1001: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
* 09:52 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 13:38 root@cumin1001: START - Cookbook sre.hosts.ipmi-password-reset
* 09:51 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:50 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:49 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 13:14 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:735367{{!}}Load Wikibase Client before other Wikibase extensions (T294224)]] (duration: 00m 55s)
* 09:45 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:45 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 13:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48610 and previous config saved to /var/cache/conftool/dbconfig/20230529-093709-root.json
* 13:01 Lucas_WMDE: UTC morning backport+config window formally over (I’ll do one more config change shortly)
* 09:31 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 13:00 kharlan@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:738213{{!}}GrowthExperiments: Add campaign pattern for control group (T295068)]] (duration: 00m 55s)
* 09:31 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 12:50 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: [[gerrit:737189{{!}}Don't need to keep all config in memory]] (resync, previous deploy for this file was missing `git rebase`) (duration: 00m 55s)
* 09:30 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 12:47 kharlan@deploy1002: Synchronized php-1.38.0-wmf.7/extensions/GrowthExperiments/includes/Specials/SpecialCreateAccountCampaign.php: Backport: [[gerrit:737960{{!}}CreateAccountCampaign: Show/hide new HTML based on query param (T295068) (2/2 SpecialCreateAccountCampaign.php)]] (duration: 00m 55s)
* 09:29 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 12:46 kharlan@deploy1002: Synchronized php-1.38.0-wmf.7/extensions/GrowthExperiments/includes/HomepageHooks.php: Backport: [[gerrit:737960{{!}}CreateAccountCampaign: Show/hide new HTML based on query param (T295068) (1/2 HomepageHooks.php)]] (duration: 00m 54s)
* 09:13 godog: start partial rollout of cadvisor to eqiad/codfw (~10%) [[phab:T108027|T108027]]
* 12:37 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1116.eqiad.wmnet with OS buster
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48609 and previous config saved to /var/cache/conftool/dbconfig/20230529-090216-root.json
* 12:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48608 and previous config saved to /var/cache/conftool/dbconfig/20230529-084711-root.json
* 12:30 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2097.codfw.wmnet with OS buster
* 08:45 godog: delete old raw blocks from thanos - [[phab:T337236|T337236]]
* 12:28 kharlan@deploy1002: Synchronized php-1.38.0-wmf.7/includes/specialpage/LoginSignupSpecialPage.php: Backport: [[gerrit:737961{{!}}LoginSignup: Add function for overriding benefits container (T295068)]] (duration: 00m 57s)
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48607 and previous config saved to /var/cache/conftool/dbconfig/20230529-083206-root.json
* 12:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48606 and previous config saved to /var/cache/conftool/dbconfig/20230529-081702-root.json
* 12:22 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48604 and previous config saved to /var/cache/conftool/dbconfig/20230529-080157-root.json
* 12:21 moritzm: imported openjdk-8 8u312-b07-1~deb10u1 to component/jdk8 for buster-wikimedia (rebuild of latest Java 8 security release for Buster)
* 07:57 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:56 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:15 awight@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: [[gerrit:737189{{!}}Don't need to keep all config in memory]] (duration: 00m 55s)
* 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48603 and previous config saved to /var/cache/conftool/dbconfig/20230529-074653-root.json
* 12:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48602 and previous config saved to /var/cache/conftool/dbconfig/20230529-073148-root.json
* 12:13 awight@deploy1002: Synchronized multiversion/MWConfigCacheGenerator.php: Config: [[gerrit:737192{{!}}Avoid error suppression]] (duration: 00m 55s)
* 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48601 and previous config saved to /var/cache/conftool/dbconfig/20230529-071643-root.json
* 12:10 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host db2097.codfw.wmnet with OS buster
* 05:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool sanitarium masters for s1, s2, s3, s5 [[phab:T337446|T337446]]', diff saved to https://phabricator.wikimedia.org/P48598 and previous config saved to /var/cache/conftool/dbconfig/20230529-051043-root.json
* 12:10 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host db1116.eqiad.wmnet with OS buster
* 12:08 awight@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: [[gerrit:737187{{!}}Anchor relative import]] (duration: 00m 56s)
* 11:32 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: working on network tests
* 11:31 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: working on network tests
* 11:28 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov1001.eqiad.wmnet with OS buster
* 11:04 jynus@cumin1001: START - Cookbook sre.hosts.reimage for host dbprov1001.eqiad.wmnet with OS buster
* 10:56 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov2001.codfw.wmnet with OS buster
* 10:37 moritzm: updated routinator in thirdparty/routinator for bullseye-wikimedia to 0.10.12 [[phab:T292503|T292503]]
* 10:24 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov2001.codfw.wmnet with OS buster
* 10:18 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3065.esams.wmnet with OS buster
* 10:15 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: working on network tests
* 10:15 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: working on network tests
* 10:15 vgutierrez: pool cp3065 running haproxy - [[phab:T290005|T290005]]
* 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions from s5 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17725 and previous config saved to /var/cache/conftool/dbconfig/20211111-092528-marostegui.json
* 09:13 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp3065.esams.wmnet with OS buster
* 09:10 vgutierrez: depool cp3065 to be reimaged as cache::upload_haproxy - [[phab:T290005|T290005]]
* 09:03 arturo: pull all packages for buster-wikimedia/thirdparty/kubeadm-k8s-1-21 ([[phab:T282942|T282942]])
* 08:17 marostegui: Upgrade db2078 [[phab:T288720|T288720]]
* 08:13 marostegui: Restart db1132 [[phab:T288720|T288720]]
* 06:56 elukey: `systemctl start prometheus-mysqld-exporter@analytics_meta` on db1108
* 06:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1104.eqiad.wmnet with OS buster
* 06:10 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1104.eqiad.wmnet with OS buster
* 06:06 marostegui: Stop replication on db1104 (old master) [[phab:T294321|T294321]]
* 06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (old master) [[phab:T294321|T294321]]', diff saved to https://phabricator.wikimedia.org/P17723 and previous config saved to /var/cache/conftool/dbconfig/20211111-060242-marostegui.json
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1109 to s8 primary and set section read-write [[phab:T294321|T294321]]', diff saved to https://phabricator.wikimedia.org/P17722 and previous config saved to /var/cache/conftool/dbconfig/20211111-060102-marostegui.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T294321|T294321]]', diff saved to https://phabricator.wikimedia.org/P17721 and previous config saved to /var/cache/conftool/dbconfig/20211111-060031-marostegui.json
* 06:00 marostegui: Starting s8 eqiad failover from db1104 to db1109 - [[phab:T294321|T294321]]
* 05:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 31 hosts with reason: Primary switchover s8 [[phab:T294321|T294321]]
* 05:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 31 hosts with reason: Primary switchover s8 [[phab:T294321|T294321]]
* 02:52 eileen: civicrm revision {{Gerrit|7e38867f}} -> {{Gerrit|817e514a}} (latest)
* 00:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:18 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Set wgForeignUploadTargets on officewiki [[phab:T295510|T295510]] (duration: 00m 56s)


== 2021-11-10 ==
== 2023-05-28 ==
* 23:46 ebernhardson: start test backup/restore of 1tb commonswiki from relforge to swift in eqiad
* 13:19 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
* 23:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript updateSpecialPages.php --wiki=foundationwiki --only=DoubleRedirects
* 13:17 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
* 23:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript updateSpecialPages.php --wiki=foundationwiki --only=BrokenRedirects
* 13:16 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
* 22:06 bblack: dns2002 - restart ntp.service to fix drmrs peering
* 13:16 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
* 22:01 bblack: dns1002 - restart ntp.service to fix drmrs peering
* 06:12 marostegui: Change innodb_fast_shutdown to 0 on db1154 before downgrading [[phab:T337446|T337446]]
* 21:56 bblack: dns2001 - restart ntp.service to fix drmrs peering
* 21:53 bblack: dns1001 - restart ntp.service to see if drmrs associations cleared up after dns changes, etc
* 21:24 bblack: asw1-b1[23]-drmrs: added ipv6 router-advertisement clauses, which work, but probably imperfectly :)
* 19:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns6001.wikimedia.org with OS buster
* 19:51 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns6002.wikimedia.org with OS buster
* 19:51 ottomata: altering <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.maps.tiles_change to increase to 6 partitions in kafka main-eqiad, main-codfw and jumbo-eqiad: https://phabricator.wikimedia.org/T293366#7497076
* 19:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:46 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:43 cjming: end of UTC evening backport & config window
* 19:42 cjming: end of UTC late backport & config window
* 19:41 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737814{{!}}Lower mobile web click tracking rate (T295432)]] (duration: 00m 55s)
* 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:35 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737814{{!}}Lower mobile web click tracking rate (T295432)]] (duration: 00m 57s)
* 19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:23 legoktm: uploaded php-pcov_1.0.6-4+wmf1~buster1_amd64.changes to apt.wm.o ([[phab:T243847|T243847]])
* 18:57 mutante: removing mediawiki font packages from parsoid hosts - [[phab:T294378|T294378]]
* 18:37 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host dns6002.wikimedia.org with OS buster
* 18:37 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host dns6001.wikimedia.org with OS buster
* 18:19 dancy@deploy1002: Finished scap: Config: [[gerrit:737976{{!}}Get rid of obsolete train-versions.json file]] (duration: 15m 57s)
* 18:09 bblack: drmrs - rebooting a bunch of hosts to bios for further settings, please ignore any accidental alerts (they do *look* like they're alert-disabled)
* 18:08 vgutierrez: restart haproxy on cp4026 and cp5006 to enable hitless reloads - [[phab:T290005|T290005]]
* 18:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:03 dancy@deploy1002: Started scap: Config: [[gerrit:737976{{!}}Get rid of obsolete train-versions.json file]]
* 17:10 bblack@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dns6001.wikimedia.org with OS buster
* 16:49 bblack@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dns6002.wikimedia.org with OS buster
* 16:47 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:44 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:32 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: [[phab:T295480|T295480]]: Move all cirrussearch traffic to codfw (duration: 00m 55s)
* 16:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:28 elukey: move atskafka to the new CA bundle - [[phab:T291905|T291905]]
* 16:26 elukey: move kafkatee instances (analytics-test,centralog) to the new CA bundle - [[phab:T291905|T291905]]
* 16:14 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host dns6002.wikimedia.org with OS buster
* 16:12 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host dns6001.wikimedia.org with OS buster
* 15:52 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: [[phab:T295480|T295480]]: Move all cirrussearch traffic to codfw (duration: 00m 56s)
* 14:09 legoktm: restarted mailman3/mailman3-web to pick up new DNS for m5-master
* 14:08 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 14:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:48 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 13:47 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 13:46 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 13:36 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:03 Lucas_WMDE: UTC morning backport+config window done
* 13:01 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737696{{!}}Enable the visual editor on the 2022 namespace on Wikimania wiki (T295267)]] (duration: 00m 55s)
* 12:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:56 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:53 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737695{{!}}Update $wgNamespacesToBeSearchedDefault for Wikimania 2022 (T295267)]] (duration: 00m 55s)
* 12:46 XioNoX: delete route6 object for 2a02:ec80::/32 (split in two /48s)
* 12:46 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@bea7fa6] (eqiad): Update kartotherian-package to {{Gerrit|006c027}} (duration: 01m 20s)
* 12:45 XioNoX: delete ROA for 2a02:ec80::/32
* 12:45 mbsantos@deploy1002: Started deploy [kartotherian/deploy@bea7fa6] (eqiad): Update kartotherian-package to {{Gerrit|006c027}}
* 12:43 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@bea7fa6] (codfw): Update kartotherian-package to {{Gerrit|006c027}} (duration: 01m 31s)
* 12:41 mbsantos@deploy1002: Started deploy [kartotherian/deploy@bea7fa6] (codfw): Update kartotherian-package to {{Gerrit|006c027}}
* 12:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:38 mbsantos@deploy1002: Finished deploy [tilerator/deploy@ba00d7a] (eqiad): Update tilerator-package to {{Gerrit|1221976}} (duration: 01m 15s)
* 12:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:36 mbsantos@deploy1002: Started deploy [tilerator/deploy@ba00d7a] (eqiad): Update tilerator-package to {{Gerrit|1221976}}
* 12:36 mbsantos@deploy1002: Finished deploy [tilerator/deploy@ba00d7a] (codfw): Update tilerator-package to {{Gerrit|1221976}} (duration: 02m 06s)
* 12:34 mbsantos@deploy1002: Started deploy [tilerator/deploy@ba00d7a] (codfw): Update tilerator-package to {{Gerrit|1221976}}
* 12:34 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:735394{{!}}Remove tmpUseRequestLanguagesForRdfOutput Wikibase setting (T285795)]] (2/2) (duration: 00m 56s)
* 12:32 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:735394{{!}}Remove tmpUseRequestLanguagesForRdfOutput Wikibase setting (T285795)]] (1/2) (duration: 00m 56s)
* 12:30 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:30 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:25 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes.php wikimaniawiki --fix # [[phab:T295267|T295267]] (0 to fix, 0 resolvable, 0 deleted, looks good)
* 12:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:20 urbanecm: Connect `Jbuatti (WMF)@foundationwiki` to SUL
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737082{{!}}create 2022 namespace for wikimaniawiki (T295267)]] (duration: 00m 56s)
* 12:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:07 urbanecm: wikiadmin@10.64.48.109(centralauth)> delete from globalnames where gn_name='DJemielniak (WMF)'; # to let OIT create that account globally, SULification of foundationwiki, [[phab:T205347|T205347]]
* 12:07 urbanecm: wikiadmin@10.64.48.109(centralauth)> delete from localnames where ln_name='DJemielniak (WMF)' and ln_wiki='foundationwiki'; # to let OIT create that account globally, SULification of foundationwiki, [[phab:T205347|T205347]]
* 12:07 urbanecm: wikiadmin@10.64.48.109(centralauth)> delete from localnames where ln_wiki='foundationwiki' and ln_name='AAnctil (WMF)'; # to let OIT create that account globally, SULification of foundationwiki, [[phab:T205347|T205347]]
* 12:06 urbanecm: wikiadmin@10.64.48.109(centralauth)> select * from localnames where ln_name='AAnctil (WMF)'; # to let OIT create that account globally, SULification of foundationwiki, [[phab:T205347|T205347]]
* 12:06 urbanecm: wikiadmin@10.64.48.109(centralauth)> delete from globalnames where gn_name='AAnctil (WMF)'; # to let OIT create that account globally, SULification of foundationwiki, [[phab:T205347|T205347]]
* 09:38 marostegui: Upgrade db1124, db1125, db1133 and pc2014 to mariadb 10.4.22
* 09:22 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti6004.drmrs.wmnet with OS buster
* 08:43 volans@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6004.drmrs.wmnet with OS buster
* 08:39 volans@cumin1001: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host ganeti6004.drmrs.wmnet
* 08:22 volans@cumin1001: START - Cookbook sre.hosts.dhcp for host ganeti6004.drmrs.wmnet
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1109 with weight 0 [[phab:T294321|T294321]]', diff saved to https://phabricator.wikimedia.org/P17715 and previous config saved to /var/cache/conftool/dbconfig/20211110-064120-root.json
* 04:15 tgr: [[phab:T283606|T283606]]: running foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --search-index
* 01:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:54 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737808{{!}}Scale down the foundation wiki logo (T295303)]] (duration: 00m 56s)
* 00:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:48 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737771{{!}}Add mobile logo and wordmark for metawiki (T295303)]] (duration: 00m 55s)
* 00:47 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/: Config: [[gerrit:737771{{!}}Add mobile logo and wordmark for metawiki (T295303)]] (duration: 00m 56s)
* 00:42 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737773{{!}}Add mobile wordmark for foundation-wiki (T295303)]] (duration: 00m 55s)
* 00:41 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikimedia-wordmark.svg: Config: [[gerrit:737773{{!}}Add mobile wordmark for foundation-wiki (T295303)]] (duration: 00m 56s)
* 00:39 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:29 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:737081{{!}}Add enwikibooks in wgImportSources to bnwikibooks (T295051)]] (duration: 00m 56s)
* 00:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2021-11-09 ==
== 2023-05-27 ==
* 20:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:40 Amir1: insert into templatelinks (tl_from, tl_from_namespace, tl_target_id) values (686, 0, 199); on db1154:3113 ([[phab:T337446|T337446]])
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:42 godog: silence systemd state alert flapping on stat1009 until monday
* 19:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:03 tzatziki: removing 1 file for legal compliance
* 19:55 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734422{{!}}Disable DPL on Wikinews where not in use (T287916)]] (duration: 00m 57s)
* 19:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:50 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734421{{!}}Disable DPL on Wikibooks where not in use (T287916)]] (duration: 00m 56s)
* 19:11 Reedy: echo "https://wikipedia.org/.well-known/assetlinks.json" {{!}} mwscript purgeList.php enwiki
* 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:59 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:45 mbsantos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:55 mutante: re-enabled puppet on mw* after deploying and testing gerrit:736595 on canary
* 17:37 mmandere@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns6001.wikimedia.org with OS buster
* 17:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:08 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host dns6001.wikimedia.org with OS buster
* 16:55 mmandere@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti6004.drmrs.wmnet with OS buster
* 16:50 mutante: snapshot* - disabling puppet - converting some crons
* 16:41 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6004.drmrs.wmnet with OS buster
* 16:38 mmandere@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti6004.drmrs.wmnet with OS buster
* 16:16 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 16:12 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 16:07 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6004.drmrs.wmnet with OS buster
* 16:07 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:49 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6004.drmrs.wmnet with OS buster
* 15:08 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6004.drmrs.wmnet with OS buster
* 14:52 bblack: rebooting ganeti6003
* 14:21 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 14:19 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti6003.drmrs.wmnet with OS buster
* 14:11 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001
* 14:08 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 13:51 vgutierrez: pool cp5006 (upload) running haproxy-tls - [[phab:T290005|T290005]]
* 13:50 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5006.eqsin.wmnet with OS buster
* 13:47 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 13:15 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6003.drmrs.wmnet with OS buster
* 13:09 mmandere@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti6003.drmrs.wmnet with OS buster
* 13:02 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6003.drmrs.wmnet with OS buster
* 12:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:22 Lucas_WMDE: UTC morning backport+config window done
* 12:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:735769{{!}}Remove unused `global` statement]] (duration: 00m 55s)
* 12:18 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 12:12 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6003.drmrs.wmnet with OS buster
* 12:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:07 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734717{{!}}Add language codes agq and mcn to wmgExtraLanguageNames (T288335, T293884)]] (duration: 00m 56s)
* 12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:57 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 11:48 volans@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Release v0.2.9 - volans@cumin2002
* 11:48 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp5006.eqsin.wmnet with OS buster
* 11:47 volans@cumin2002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Release v0.2.9 - volans@cumin2002
* 11:45 vgutierrez: depool cp5006 to be reimaged as cache::upload_haproxy - [[phab:T290005|T290005]]
* 11:40 volans@deploy1002: Finished deploy [homer/deploy@c570af3]: Homer release v0.2.9 (duration: 01m 29s)
* 11:39 volans@deploy1002: Started deploy [homer/deploy@c570af3]: Homer release v0.2.9
* 11:32 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6003.drmrs.wmnet with OS buster
* 10:22 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti6002.drmrs.wmnet with OS buster
* 09:31 vgutierrez: pool cp4026 - [[phab:T290005|T290005]]
* 09:03 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6002.drmrs.wmnet with OS buster
* 08:43 elukey: drop istio 1.6.* and kubeflow-kfserving-build images from the docker registry
* 07:23 elukey: `apt-get clean` on stat1006 to free some space (root partition full)
* 02:43 ejegg: updated fundraising CiviCRM from {{Gerrit|ac6f333d}} -> {{Gerrit|7e38867f}}
* 02:38 ejegg: updated payments-wiki {{Gerrit|73de4731}} -> {{Gerrit|49ad5962}}
* 02:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2021-11-08 ==
== 2023-05-26 ==
* 23:39 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:48 tzatziki: removing 2 files for legal compliance
* 23:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:50 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:50 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 20:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:47 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 20:06 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c09793f5a918d280df444ade10e28eca136f7508}}: kswiki: Adding wordmark and tagline to IS.php ([[phab:T294093|T294093]]) (duration: 00m 55s)
* 20:47 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:24 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 20:05 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/: {{Gerrit|5f7864f}}: {{Gerrit|54e7f74}}: kswiki: Adding wordmark and tagline files ([[phab:T294093|T294093]]) (duration: 00m 54s)
* 19:24 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 20:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:21 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 19:58 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e66bd53b54ee29423affcba768b7a3cf1a81714a}}: Enable TheWikipediaLibrary on meta & testwiki ([[phab:T288070|T288070]]) (duration: 00m 55s)
* 19:21 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 19:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:15 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 19:52 ottomata: an-coord1002: drop user 'admin'@'localhost'; start slave; to fix broken replication - [[phab:T284150|T284150]]
* 19:15 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:26 demon@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.10 refs [[phab:T330216|T330216]]
* 19:48 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1ca184b142f1f14b4c1537f6503e0ef893155453}}: Add a new "all assessments" option to MediaSearch assessments dropdown ([[phab:T285349|T285349]]) (duration: 00m 55s)
* 17:38 demon@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.10  refs [[phab:T330216|T330216]] (duration: 06m 10s)
* 19:46 sukhe: upload pdns-recursor 4.5.7-1wm1 to apt.wm.o (buster)
* 17:31 demon@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.10  refs [[phab:T330216|T330216]]
* 19:42 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.7/skins/MinervaNeue/resources/: {{Gerrit|8375e38ee4d57b4ff3d30be96473b927ac8e4ef0}}: Instrument mobile talk page clicks ([[phab:T294738|T294738]]) (duration: 00m 54s)
* 16:37 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host puppetboard2003.codfw.wmnet with OS bookworm
* 19:41 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.7/skins/MinervaNeue/includes/Skins/SkinMinerva.php: {{Gerrit|8375e38ee4d57b4ff3d30be96473b927ac8e4ef0}}: Instrument mobile talk page clicks ([[phab:T294738|T294738]]) (duration: 00m 54s)
* 16:36 jbond@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host puppetboard1003.eqiad.wmnet with OS bookworm
* 19:39 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.7/extensions/WikidataPageBanner/includes/WikidataPageBanner.php: {{Gerrit|2c74457b26e0d288a371fd76bcb91b697554f9fd}}: WikidataPageBanner should disable table of contents using public functions ([[phab:T295003|T295003]]) (duration: 00m 55s)
* 15:54 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:54 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2005-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
* 19:31 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.7/extensions/VisualEditor/modules/ve-mw/preinit/ve.init.mw.ArticleTargetSaver.js: {{Gerrit|9d7cde492a69dc4e18403f3dbacd2de27c3f05e0}}: ArticleTargetSaver: ve.init may be undefined ([[phab:T294981|T294981]]) (duration: 00m 55s)
* 15:52 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2005-dev.private.codfw.wikimedia.cloud - aborrero@cumin2002"
* 19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:50 aborrero@cumin2002: START - Cookbook sre.dns.netbox
* 19:22 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bf70a8bd3337bc24cc23b1f257f7eb99ec2607b8}}: Make reply tool available as opt-out on dewiki ([[phab:T294591|T294591]]) (duration: 00m 56s)
* 15:41 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetboard2003.codfw.wmnet with OS bookworm
* 19:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:40 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:40 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetboard1003.eqiad.wmnet with OS bookworm
* 17:51 vgutierrez: depool cp4026 - [[phab:T290005|T290005]]
* 15:38 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 17:39 vgutierrez: pool cp4026 - [[phab:T290005|T290005]]
* 15:36 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:31 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 15:34 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 17:27 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:34 elukey@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 17:27 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:31 nskaggs@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 16:59 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:30 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:59 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:08 nskaggs@cumin1001: START - Cookbook sre.wikireplicas.update-views
* 16:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:26 oblivian@puppetmaster1001: conftool action : set/weight=10; selector: cluster=videoscaler,dc=eqiad,name=parse.*
* 16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:25 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,dc=eqiad,name=parse.*
* 16:34 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:737425{{!}} Bumping portals to master (T128546)]] (duration: 00m 56s)
* 14:25 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,dc=eqiad,name="parse.*"
* 16:33 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:737425{{!}} Bumping portals to master (T128546)]] (duration: 00m 56s)
* 14:25 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,dc=eqiad,name="parse.*"
* 16:23 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:08 jbond@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host puppetboard1003.eqiad.wmnet
* 16:23 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:08 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM puppetboard1003.eqiad.wmnet - jbond@cumin1001"
* 16:18 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:06 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM puppetboard1003.eqiad.wmnet - jbond@cumin1001"
* 16:18 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:06 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetboard1003.eqiad.wmnet on all recursors
* 16:13 vgutierrez: depool cp4026 - [[phab:T290005|T290005]]
* 14:06 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache puppetboard1003.eqiad.wmnet on all recursors
* 16:08 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:06 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:08 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:06 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM puppetboard1003.eqiad.wmnet - jbond@cumin1001"
* 16:06 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:05 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM puppetboard1003.eqiad.wmnet - jbond@cumin1001"
* 16:06 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:03 jbond@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host puppetboard2003.codfw.wmnet
* 16:06 vgutierrez: pool cp4026 using haproxy as the TLS termination layer - [[phab:T290005|T290005]]
* 14:03 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM puppetboard2003.codfw.wmnet - jbond@cumin2002"
* 16:00 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:03 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM puppetboard2003.codfw.wmnet - jbond@cumin2002"
* 16:00 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:02 jbond@cumin1001: START - Cookbook sre.dns.netbox
* 15:51 XioNoX: remove ROA for 185.15.58.0/23
* 14:02 jbond@cumin1001: START - Cookbook sre.ganeti.makevm for new host puppetboard1003.eqiad.wmnet
* 15:50 XioNoX: create RIPE RPKI ROA for 2a02:ec80:600::/48 and 2a02:ec80:500::/48
* 14:02 jbond@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetboard2003.codfw.wmnet on all recursors
* 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:02 jbond@cumin2002: START - Cookbook sre.dns.wipe-cache puppetboard2003.codfw.wmnet on all recursors
* 15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:02 jbond@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:18 bblack: asw1-b13-drmrs: "delete forwarding-options dhcp-relay forward-only" to fix dhcp+installer issues in this rack.
* 14:02 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM puppetboard2003.codfw.wmnet - jbond@cumin2002"
* 15:12 ema: A:cp re-enable puppet after testing https://gerrit.wikimedia.org/r/c/operations/puppet/+/737385 on cp4021 [[phab:T293879|T293879]]
* 14:01 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM puppetboard2003.codfw.wmnet - jbond@cumin2002"
* 15:02 ema: merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/737385 with puppet disabled on A:cp [[phab:T293879|T293879]]
* 13:58 jbond@cumin2002: START - Cookbook sre.dns.netbox
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 13:58 jbond@cumin2002: START - Cookbook sre.ganeti.makevm for new host puppetboard2003.codfw.wmnet
* 13:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 13:58 jbond@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host puppetdb2003.codfw.wmnet
* 13:32 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4026.ulsfo.wmnet with OS buster
* 13:58 jbond@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 13:56 jbond@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 13:56 jbond@cumin2002: START - Cookbook sre.ganeti.makevm for new host puppetdb2003.codfw.wmnet
* 13:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:56 jbond@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host puppetdb1003.eqiad.wmnet
* 13:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:56 jbond@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:55 jbond@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host puppetdb2003.codfw.wmnet
* 12:43 Lucas_WMDE: UTC morning backport+config window done
* 13:55 jbond@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:52 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:28 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp4026.ulsfo.wmnet with OS buster
* 13:51 jbond@cumin1001: START - Cookbook sre.dns.netbox
* 12:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:46 jbond@cumin2002: START - Cookbook sre.dns.netbox
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:699692{{!}}Update autonyms in wmgExtraLanguageNames (T284870)]] (duration: 00m 56s)
* 13:46 jbond@cumin2002: START - Cookbook sre.ganeti.makevm for new host puppetdb2003.codfw.wmnet
* 12:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:45 jbond@cumin1001: START - Cookbook sre.dns.netbox
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust weights for s5 codfw replicas after removing special groups from them [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17708 and previous config saved to /var/cache/conftool/dbconfig/20211108-120203-marostegui.json
* 13:45 jbond@cumin1001: START - Cookbook sre.ganeti.makevm for new host puppetdb1003.eqiad.wmnet
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions logpager recentchanges recentchangeslinked watchlist from s5 codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17707 and previous config saved to /var/cache/conftool/dbconfig/20211108-115945-marostegui.json
* 13:13 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:41 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6002.drmrs.wmnet with OS buster
* 13:13 bblack@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add the new pybal IPs at edge-only sites - bblack@cumin1001"
* 11:32 vgutierrez@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4026.ulsfo.wmnet with OS buster
* 13:12 bblack@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add the new pybal IPs at edge-only sites - bblack@cumin1001"
* 11:01 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6002.drmrs.wmnet with OS buster
* 13:06 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 10:53 hnowlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 12:47 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1023.eqiad.wmnet with OS bullseye
* 10:53 hnowlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 12:43 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:49 hnowlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 12:43 bblack@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add rest of eqiad+codfw pybal IPs - bblack@cumin1001"
* 10:49 hnowlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 12:41 bblack@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add rest of eqiad+codfw pybal IPs - bblack@cumin1001"
* 10:49 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp4026.ulsfo.wmnet with OS buster
* 12:39 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 10:27 vgutierrez: depool cp4026 to be reimaged as a haproxy-tls test node - [[phab:T290005|T290005]]
* 12:21 hashar@deploy1002: Finished deploy [gerrit/gerrit@0932557]: wm-patch-demo: do not return runs when there are no wikis {{!}} [[phab:T332474|T332474]] (duration: 00m 08s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 12:21 hashar@deploy1002: Started deploy [gerrit/gerrit@0932557]: wm-patch-demo: do not return runs when there are no wikis {{!}} [[phab:T332474|T332474]]
* 10:17 Lucas_WMDE: Deployed patch for [[phab:T294693|T294693]]
* 11:50 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1023.eqiad.wmnet with OS bullseye
* 09:47 XioNoX: all core routers: add drmrs to prefix lists + confed
* 11:35 hashar@deploy1002: Finished deploy [gerrit/gerrit@c490ae6]: wm-patch-demo: link to other patches, use WARNING to prevent chipset collapsing {{!}} [[phab:T332474|T332474]] (duration: 00m 08s)
* 09:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 11:35 hashar@deploy1002: Started deploy [gerrit/gerrit@c490ae6]: wm-patch-demo: link to other patches, use WARNING to prevent chipset collapsing {{!}} [[phab:T332474|T332474]]
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 10:54 cmooney@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:23 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 10:54 cmooney@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:22 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 10:38 cmooney@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:22 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:27 cmooney@cumin1001: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 09:54 effie: pool parse1013-parse1016 to the jobrunner cluster  - [[phab:T329366|T329366]]
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 09:29 jbond: disable puppet fleet wide to deploy minor puppet change https://gerrit.wikimedia.org/r/c/operations/puppet/+/923353
* 09:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 09:28 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1016.eqiad.wmnet with OS buster
* 08:51 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:26 effie: parse1013-parse1016 have been depooled and removed from the parsoid-php service - [[phab:T329366|T329366]]
* 08:51 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:26 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1014.eqiad.wmnet with OS buster
* 08:24 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 09:24 jnuche@deploy1002: Installation of scap version "4.52.3" completed for 596 hosts
* 05:53 rzl: rebooted wikitech-static via rackspace web UI - [[phab:T295266|T295266]]
* 09:23 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1013.eqiad.wmnet with OS buster
* 09:23 jnuche@deploy1002: Installing scap version "4.52.3" for 596 hosts
* 09:13 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 09:13 elukey@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 09:08 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host parse1015.eqiad.wmnet with OS buster
* 08:59 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1016.eqiad.wmnet with reason: host reimage
* 08:56 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1014.eqiad.wmnet with reason: host reimage
* 08:54 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1013.eqiad.wmnet with reason: host reimage
* 08:54 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on parse1015.eqiad.wmnet with reason: host reimage
* 08:52 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1016.eqiad.wmnet with reason: host reimage
* 08:52 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1015.eqiad.wmnet with reason: host reimage
* 08:51 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1014.eqiad.wmnet with reason: host reimage
* 08:51 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1013.eqiad.wmnet with reason: host reimage
* 08:39 jiji@cumin1001: START - Cookbook sre.hosts.reimage for host parse1016.eqiad.wmnet with OS buster
* 08:39 jiji@cumin1001: START - Cookbook sre.hosts.reimage for host parse1015.eqiad.wmnet with OS buster
* 08:39 jiji@cumin1001: START - Cookbook sre.hosts.reimage for host parse1014.eqiad.wmnet with OS buster
* 08:39 jiji@cumin1001: START - Cookbook sre.hosts.reimage for host parse1013.eqiad.wmnet with OS buster
* 08:10 jiji@cumin1001: conftool action : set/pooled=inactive; selector: dc=eqiad,name=parse101[3-6].eqiad.wmnet
* 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48591 and previous config saved to /var/cache/conftool/dbconfig/20230526-075903-root.json
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48590 and previous config saved to /var/cache/conftool/dbconfig/20230526-075809-root.json
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48589 and previous config saved to /var/cache/conftool/dbconfig/20230526-074358-root.json
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48588 and previous config saved to /var/cache/conftool/dbconfig/20230526-074304-root.json
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48587 and previous config saved to /var/cache/conftool/dbconfig/20230526-072854-root.json
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48586 and previous config saved to /var/cache/conftool/dbconfig/20230526-072759-root.json
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48585 and previous config saved to /var/cache/conftool/dbconfig/20230526-071349-root.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48584 and previous config saved to /var/cache/conftool/dbconfig/20230526-071255-root.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48583 and previous config saved to /var/cache/conftool/dbconfig/20230526-065844-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48582 and previous config saved to /var/cache/conftool/dbconfig/20230526-065750-root.json
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48581 and previous config saved to /var/cache/conftool/dbconfig/20230526-064340-root.json
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48580 and previous config saved to /var/cache/conftool/dbconfig/20230526-064245-root.json
* 06:42 elukey: `apt-get clean` on stat1008 to clean up some space in the root partition
* 06:36 elukey: `truncate /var/log/kerberos/krb5kdc.log -s 10g` on krb1001 to keep the root partition from filling up
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48579 and previous config saved to /var/cache/conftool/dbconfig/20230526-062835-root.json
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48578 and previous config saved to /var/cache/conftool/dbconfig/20230526-062741-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1156 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48577 and previous config saved to /var/cache/conftool/dbconfig/20230526-061330-root.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48576 and previous config saved to /var/cache/conftool/dbconfig/20230526-061236-root.json
* 03:51 fab@deploy1002: Finished deploy [airflow-dags/research@77cf676]: (no justification provided) (duration: 00m 17s)
* 03:51 fab@deploy1002: Started deploy [airflow-dags/research@77cf676]: (no justification provided)


== 2021-11-06 ==
* 01:43 dduvall@deploy1002: Synchronized php-1.38.0-wmf.7/includes/parser/ParserOutput.php: Backport: [[gerrit:737079{{!}}Regression fix: do language conversion on ToC in ParserOutput::getText() (T295187)]] (duration: 00m 56s)
* 01:42 dduvall: emergency backport https://gerrit.wikimedia.org/r/c/mediawiki/core/+/737079 deployed and verified on mwdebug1002. syncing to all targets
* 01:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:33 dduvall: performing emergency backport deployment of https://gerrit.wikimedia.org/r/c/mediawiki/core/+/737079

== 2023-05-25 ==
* 22:14 zabe@deploy1002: Finished scap: Backport for [[gerrit:923283{{!}}Replace deprecated Hooks::runWithoutAbort (T335536)]], [[gerrit:923276{{!}}BannerRenderer: Make sure the language variant is valid (T337427)]] (duration: 09m 14s)
* 22:07 zabe@deploy1002: zabe and ladsgroup: Backport for [[gerrit:923283{{!}}Replace deprecated Hooks::runWithoutAbort (T335536)]], [[gerrit:923276{{!}}BannerRenderer: Make sure the language variant is valid (T337427)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 22:05 zabe@deploy1002: Started scap: Backport for [[gerrit:923283{{!}}Replace deprecated Hooks::runWithoutAbort (T335536)]], [[gerrit:923276{{!}}BannerRenderer: Make sure the language variant is valid (T337427)]]
* 21:26 htriedman@deploy1002: Finished deploy [airflow-dags/platform_eng@77cf676]: (no justification provided) (duration: 00m 08s)
* 21:25 htriedman@deploy1002: Started deploy [airflow-dags/platform_eng@77cf676]: (no justification provided)
* 20:47 TheresNoTime: close UTC late backport
* 20:47 samtar@deploy1002: Finished scap: Backport for [[gerrit:923282{{!}}Manual backport of OOUI change I63293edd62 (tab dialog fix) (T337515)]] (duration: 08m 34s)
* 20:40 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:923282{{!}}Manual backport of OOUI change I63293edd62 (tab dialog fix) (T337515)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:38 samtar@deploy1002: Started scap: Backport for [[gerrit:923282{{!}}Manual backport of OOUI change I63293edd62 (tab dialog fix) (T337515)]]
* 20:32 samtar@deploy1002: Finished scap: Backport for [[gerrit:923281{{!}}Use document feature classes to extract A/B test state (T335972)]] (duration: 10m 58s)
* 20:22 samtar@deploy1002: jdrewniak and samtar: Backport for [[gerrit:923281{{!}}Use document feature classes to extract A/B test state (T335972)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:21 samtar@deploy1002: Started scap: Backport for [[gerrit:923281{{!}}Use document feature classes to extract A/B test state (T335972)]]
* 20:13 samtar@deploy1002: Finished scap: Backport for [[gerrit:919838{{!}}[prod] Configure logging for the CampaignEvents channel (T337365)]] (duration: 08m 31s)
* 20:06 samtar@deploy1002: samtar and daimona: Backport for [[gerrit:919838{{!}}[prod] Configure logging for the CampaignEvents channel (T337365)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 20:05 samtar@deploy1002: Started scap: Backport for [[gerrit:919838{{!}}[prod] Configure logging for the CampaignEvents channel (T337365)]]
* 19:32 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:32 bblack@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add pybal-low-traffic.svc.codfw.wmnet - bblack@cumin1001"
* 19:31 bblack@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add pybal-low-traffic.svc.codfw.wmnet - bblack@cumin1001"
* 19:29 bblack@cumin1001: START - Cookbook sre.dns.netbox
* 19:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48575 and previous config saved to /var/cache/conftool/dbconfig/20230525-190946-root.json
* 19:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48574 and previous config saved to /var/cache/conftool/dbconfig/20230525-190859-root.json
* 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48573 and previous config saved to /var/cache/conftool/dbconfig/20230525-185441-root.json
* 18:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48572 and previous config saved to /var/cache/conftool/dbconfig/20230525-185354-root.json
* 18:43 htriedman@deploy1002: Finished deploy [airflow-dags/platform_eng@6b27584]: (no justification provided) (duration: 00m 19s)
* 18:43 htriedman@deploy1002: Started deploy [airflow-dags/platform_eng@6b27584]: (no justification provided)
* 18:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48571 and previous config saved to /var/cache/conftool/dbconfig/20230525-183937-root.json
* 18:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48570 and previous config saved to /var/cache/conftool/dbconfig/20230525-183849-root.json
* 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48568 and previous config saved to /var/cache/conftool/dbconfig/20230525-182432-root.json
* 18:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48567 and previous config saved to /var/cache/conftool/dbconfig/20230525-182345-root.json
* 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48566 and previous config saved to /var/cache/conftool/dbconfig/20230525-180927-root.json
* 18:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48565 and previous config saved to /var/cache/conftool/dbconfig/20230525-180840-root.json
* 17:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48564 and previous config saved to /var/cache/conftool/dbconfig/20230525-175423-root.json
* 17:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48563 and previous config saved to /var/cache/conftool/dbconfig/20230525-175335-root.json
* 17:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48562 and previous config saved to /var/cache/conftool/dbconfig/20230525-173918-root.json
* 17:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48561 and previous config saved to /var/cache/conftool/dbconfig/20230525-173831-root.json
* 17:27 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:27 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update DNS entires for migration IPs eqiad row E F switches. - cmooney@cumin1001"
* 17:26 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update DNS entires for migration IPs eqiad row E F switches. - cmooney@cumin1001"
* 17:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48559 and previous config saved to /var/cache/conftool/dbconfig/20230525-172413-root.json
* 17:23 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 17:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1158 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48558 and previous config saved to /var/cache/conftool/dbconfig/20230525-172326-root.json
* 17:15 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:14 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:14 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:14 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:13 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:12 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:09 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 17:08 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 17:07 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 17:06 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 17:05 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 17:03 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:39 topranks: adding outbound shaper config on eqsin to codfw transport cct ([[phab:T328313|T328313]])
* 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48557 and previous config saved to /var/cache/conftool/dbconfig/20230525-163657-ladsgroup.json
* 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P48556 and previous config saved to /var/cache/conftool/dbconfig/20230525-162151-ladsgroup.json
* 16:18 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 16:18 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 16:14 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 16:14 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 16:11 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e[1,3]-eqiad.mgmt,lsw1-f1-eqiad.mgmt with reason: Migrate lsw1-e3-eqiad uplinks to spine
* 16:11 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e[1,3]-eqiad.mgmt,lsw1-f1-eqiad.mgmt with reason: Migrate lsw1-e3-eqiad uplinks to spine
* 16:07 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gerrit2002.wikimedia.org with reason: maintenance
* 16:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on gerrit2002.wikimedia.org with reason: maintenance
* 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P48555 and previous config saved to /var/cache/conftool/dbconfig/20230525-160645-ladsgroup.json
* 16:02 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
* 15:57 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e2-eqiad.mgmt,lsw1-f1-eqiad.mgmt with reason: Migrate lsw1-e2-eqiad uplink from lsw1-f1 to ssw1-f1
* 15:56 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e2-eqiad.mgmt,lsw1-f1-eqiad.mgmt with reason: Migrate lsw1-e2-eqiad uplink from lsw1-f1 to ssw1-f1
* 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48553 and previous config saved to /var/cache/conftool/dbconfig/20230525-155139-ladsgroup.json
* 15:49 dancy@deploy1002: Finished deploy [integration/docroot@dac2b70]: Updated Scap URLs (duration: 00m 07s)
* 15:49 dancy@deploy1002: Started deploy [integration/docroot@dac2b70]: Updated Scap URLs
* 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T336886|T336886]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20230525-154927-ladsgroup.json
* 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 ([[phab:T336886|T336886]])', diff saved to  and previous config saved to /var/cache/conftool/dbconfig/20230525-154906-ladsgroup.json
* 15:44 dancy: dancy@deploy1002 Updated scap URLs on doc.wikimedia.org
* 15:43 dancy@deploy1002: Finished deploy [integration/docroot@78e6f40]: (no justification provided) (duration: 00m 10s)
* 15:43 dancy@deploy1002: Started deploy [integration/docroot@78e6f40]: (no justification provided)
* 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P48552 and previous config saved to /var/cache/conftool/dbconfig/20230525-153359-ladsgroup.json
* 15:33 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e[1-2]-eqiad.mgmt with reason: Migrate lsw1-e1-eqiad to cr1-eqiad link to ssw1-e1-eqiad
* 15:33 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e[1-2]-eqiad.mgmt with reason: Migrate lsw1-e1-eqiad to cr1-eqiad link to ssw1-e1-eqiad
* 15:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 15:30 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
* 15:28 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbproxy1022.eqiad.wmnet with OS bullseye
* 15:27 kartik@deploy1002: Finished scap: Backport for [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]] (duration: 07m 01s)
* 15:22 kartik@deploy1002: kartik: Backport for [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 15:21 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-eqiad,lsw1-f1-eqiad.mgmt with reason: Migrate lsw1-e1-eqiad to cr2-eqiad link to ssw1-e1-eqiad
* 15:20 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cr2-eqiad,lsw1-f1-eqiad.mgmt with reason: Migrate lsw1-e1-eqiad to cr2-eqiad link to ssw1-e1-eqiad
* 15:20 kartik@deploy1002: Started scap: Backport for [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]]
* 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P48551 and previous config saved to /var/cache/conftool/dbconfig/20230525-151853-ladsgroup.json
* 15:18 kartik@deploy1002: Finished scap: Backport for [[gerrit:923268{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]], [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]] (duration: 68m 07s)
* 15:14 dzahn@cumin1001: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
* 15:10 topranks: Migrating cr1-eqiad downlink to row E/F from lsw1-e1-eqiad et-0/0/48 to ssw1-e1-eqiad et-0/0/31
* 15:10 mutante: gerrit-replica.wikimedia.org - gerrit2002 - reimaging - scheduled maintenance
* 15:09 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: maintenance
* 15:08 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: maintenance
* 15:04 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-eqiad,lsw1-e1-eqiad.mgmt with reason: Migrate lsw1-e1-eqiad to cr1-eqiad link to ssw1-e1-eqiad
* 15:04 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cr1-eqiad,lsw1-e1-eqiad.mgmt with reason: Migrate lsw1-e1-eqiad to cr1-eqiad link to ssw1-e1-eqiad
* 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48550 and previous config saved to /var/cache/conftool/dbconfig/20230525-150347-ladsgroup.json
* 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48549 and previous config saved to /var/cache/conftool/dbconfig/20230525-145857-ladsgroup.json
* 14:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 14:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48548 and previous config saved to /var/cache/conftool/dbconfig/20230525-145836-ladsgroup.json
* 14:54 marostegui: Wikireplicas are lagging behind for the following sections: s1, s2, s5, s7 [[phab:T337446|T337446]]
* 14:54 aikochou@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P48547 and previous config saved to /var/cache/conftool/dbconfig/20230525-144330-ladsgroup.json
* 14:32 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1022.eqiad.wmnet with OS bullseye
* 14:29 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['dbproxy1026']
* 14:29 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbproxy1027']
* 14:28 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1027']
* 14:28 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1026']
* 14:28 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbproxy1025']
* 14:28 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbproxy1024']
* 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P48546 and previous config saved to /var/cache/conftool/dbconfig/20230525-142824-ladsgroup.json
* 14:28 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1025']
* 14:28 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1024']
* 14:28 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbproxy1023']
* 14:28 jclark@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbproxy1022']
* 14:27 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1022']
* 14:27 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1023']
* 14:27 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbproxy1023']
* 14:27 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbproxy1022']
* 14:27 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1022']
* 14:26 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1023']
* 14:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbproxy1022']
* 14:26 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1022']
* 14:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbproxy1022']
* 14:25 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1022']
* 14:25 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbproxy1026']
* 14:22 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=videoscaler
* 14:22 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=jobrunner
* 14:22 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1072']
* 14:22 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver
* 14:21 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:21 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=api_appserver,dc=eqiad
* 14:21 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=appserver,dc=eqiad
* 14:20 jclark@cumin1001: START - Cookbook sre.dns.netbox
* 14:14 bblack@cumin1001: conftool action : set/pooled=yes; selector: service=parsoid-php,dc=eqiad
* 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48545 and previous config saved to /var/cache/conftool/dbconfig/20230525-141318-ladsgroup.json
* 14:12 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:11 kartik@deploy1002: kartik: Backport for [[gerrit:923268{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]], [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 14:11 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1026.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:10 kartik@deploy1002: Started scap: Backport for [[gerrit:923268{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]], [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]]
* 14:09 volans@cumin1001: END (PASS) - Cookbook sre.puppetboard.restart-reboot (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>puppetboard2002.codfw.wmnet<nowiki>}</nowiki> and (A:puppetboard)
* 14:09 volans@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetboard.discovery.wmnet. on all recursors
* 14:08 volans@cumin1001: START - Cookbook sre.dns.wipe-cache puppetboard.discovery.wmnet. on all recursors
* 14:08 volans@cumin1001: START - Cookbook sre.puppetboard.restart-reboot rolling restart_daemons on P<nowiki>{</nowiki>puppetboard2002.codfw.wmnet<nowiki>}</nowiki> and (A:puppetboard)
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 ([[phab:T336886|T336886]])', diff saved to https://phabricator.wikimedia.org/P48544 and previous config saved to /var/cache/conftool/dbconfig/20230525-140822-ladsgroup.json
* 14:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 14:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 14:08 kartik@deploy1002: Finished scap: Backport for [[gerrit:923268{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]], [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]] (duration: 15m 56s)
* 13:53 kartik@deploy1002: kartik: Backport for [[gerrit:923268{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]], [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 13:52 kartik@deploy1002: Started scap: Backport for [[gerrit:923268{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]], [[gerrit:923269{{!}}Show Contribute menu item in main menu when Special:Contribute is enabled (T336838)]]
* 13:46 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:923252{{!}}Change maint script to do work via jobs]] (duration: 07m 42s)
* 13:44 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
* 13:44 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1026.mgmt.eqiad.wmnet with reboot policy FORCED
* 13:38 urbanecm@deploy1002: Started scap: Backport for [[gerrit:923252{{!}}Change maint script to do work via jobs]]
* 13:28 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:923273{{!}}Handle 'prefix' when 'action=edit', even if another extension overrides action (T337436)]], [[gerrit:923274{{!}}Handle 'prefix' when 'action=edit', even if another extension overrides action (T337436)]] (duration: 09m 06s)
* 13:24 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1026.mgmt.eqiad.wmnet with reboot policy FORCED
* 13:20 urbanecm@deploy1002: urbanecm and matmarex: Backport for [[gerrit:923273{{!}}Handle 'prefix' when 'action=edit', even if another extension overrides action (T337436)]], [[gerrit:923274{{!}}Handle 'prefix' when 'action=edit', even if another extension overrides action (T337436)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 13:19 urbanecm@deploy1002: Started scap: Backport for [[gerrit:923273{{!}}Handle 'prefix' when 'action=edit', even if another extension overrides action (T337436)]], [[gerrit:923274{{!}}Handle 'prefix' when 'action=edit', even if another extension overrides action (T337436)]]
* 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool sanitarium masters for s1, s5, s2, s7', diff saved to https://phabricator.wikimedia.org/P48538 and previous config saved to /var/cache/conftool/dbconfig/20230525-121012-root.json
* 11:56 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 11:56 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 11:54 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 11:54 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 11:52 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 11:51 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 11:49 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 11:49 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 11:43 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 11:43 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 11:40 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 11:40 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 11:39 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48537 and previous config saved to /var/cache/conftool/dbconfig/20230525-113914-root.json
* 11:38 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 11:38 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 11:38 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 11:31 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 11:31 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 11:30 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 11:30 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 11:28 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 11:27 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 11:26 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 11:26 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 11:25 cgoubert@deploy1002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
* 11:25 cgoubert@deploy1002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
* 11:25 cgoubert@deploy1002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
* 11:25 cgoubert@deploy1002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
* 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48536 and previous config saved to /var/cache/conftool/dbconfig/20230525-112409-root.json
* 11:22 cgoubert@deploy1002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
* 11:22 cgoubert@deploy1002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
* 11:21 cgoubert@deploy1002: helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync
* 11:20 cgoubert@deploy1002: helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync
* 11:15 jbond: update udplog on mwlog server
* 11:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48535 and previous config saved to /var/cache/conftool/dbconfig/20230525-110948-root.json
* 11:09 jbond: upload udplog_1.10_amd64.deb
* 11:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48534 and previous config saved to /var/cache/conftool/dbconfig/20230525-110905-root.json
* 11:05 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 11:04 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 11:03 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 11:03 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 10:54 klausman@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48533 and previous config saved to /var/cache/conftool/dbconfig/20230525-105443-root.json
* 10:54 klausman@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:54 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 10:54 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48532 and previous config saved to /var/cache/conftool/dbconfig/20230525-105400-root.json
* 10:53 klausman@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 klausman@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:49 klausman@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:49 klausman@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:48 klausman@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:41 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol2005-dev.wikimedia.org
* 10:41 aborrero@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:41 aborrero@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2005-dev.wikimedia.org decommissioned, removing all IPs except the asset tag one - aborrero@cumin2002"
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48531 and previous config saved to /var/cache/conftool/dbconfig/20230525-103939-root.json
* 10:39 aborrero@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol2005-dev.wikimedia.org decommissioned, removing all IPs except the asset tag one - aborrero@cumin2002"
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48530 and previous config saved to /var/cache/conftool/dbconfig/20230525-103855-root.json
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48529 and previous config saved to /var/cache/conftool/dbconfig/20230525-103445-root.json
* 10:32 aborrero@cumin2002: START - Cookbook sre.dns.netbox
* 10:24 aborrero@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol2005-dev.wikimedia.org
* 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48528 and previous config saved to /var/cache/conftool/dbconfig/20230525-102434-root.json
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48527 and previous config saved to /var/cache/conftool/dbconfig/20230525-102351-root.json
* 10:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48526 and previous config saved to /var/cache/conftool/dbconfig/20230525-101940-root.json
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48525 and previous config saved to /var/cache/conftool/dbconfig/20230525-100927-root.json
* 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48524 and previous config saved to /var/cache/conftool/dbconfig/20230525-100846-root.json
* 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48523 and previous config saved to /var/cache/conftool/dbconfig/20230525-100436-root.json
* 10:00 kart_: Updated cxserver to 2023-05-25-093623-production (config: language pairs transform fix + [[phab:T331201|T331201]])
* 09:57 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 09:56 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48522 and previous config saved to /var/cache/conftool/dbconfig/20230525-095423-root.json
* 09:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48521 and previous config saved to /var/cache/conftool/dbconfig/20230525-095341-root.json
* 09:51 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 09:51 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48520 and previous config saved to /var/cache/conftool/dbconfig/20230525-094931-root.json
* 09:48 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 09:48 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48519 and previous config saved to /var/cache/conftool/dbconfig/20230525-093918-root.json
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48518 and previous config saved to /var/cache/conftool/dbconfig/20230525-093426-root.json
* 09:32 apergos: running from dumpsdata1004 via ariel login screen session, as root, rsync with bwlimit 100000 to dumpsdata1006, copying all public xml dumps data
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'db2179 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48517 and previous config saved to /var/cache/conftool/dbconfig/20230525-092413-root.json
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48516 and previous config saved to /var/cache/conftool/dbconfig/20230525-091922-root.json
* 09:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2179', diff saved to https://phabricator.wikimedia.org/P48515 and previous config saved to /var/cache/conftool/dbconfig/20230525-091132-root.json
* 09:10 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1026.mgmt.eqiad.wmnet with reboot policy FORCED
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48514 and previous config saved to /var/cache/conftool/dbconfig/20230525-090417-root.json
* 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48513 and previous config saved to /var/cache/conftool/dbconfig/20230525-084912-root.json
* 08:32 elukey: revoke kafka_mirror_maker TLS cert (cergen based), remove old cergen certs from puppet private - [[phab:T337248|T337248]]
* 07:52 matthiasmullie: UTC morning backports done
* 07:51 mlitn@deploy1002: Finished scap: Backport for [[gerrit:922853{{!}}Change maint script to do work via jobs (T322872)]] (duration: 16m 12s)
* 07:37 mlitn@deploy1002: mlitn: Backport for [[gerrit:922853{{!}}Change maint script to do work via jobs (T322872)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 07:35 mlitn@deploy1002: Started scap: Backport for [[gerrit:922853{{!}}Change maint script to do work via jobs (T322872)]]
* 07:18 mlitn@deploy1002: Finished scap: Backport for [[gerrit:921561{{!}}[WikibaseMediaInfo] Add 'main subject of' property]] (duration: 14m 02s)
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1158', diff saved to https://phabricator.wikimedia.org/P48511 and previous config saved to /var/cache/conftool/dbconfig/20230525-071719-root.json
* 07:10 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 07:06 mlitn@deploy1002: mlitn: Backport for [[gerrit:921561{{!}}[WikibaseMediaInfo] Add 'main subject of' property]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 07:04 mlitn@deploy1002: Started scap: Backport for [[gerrit:921561{{!}}[WikibaseMediaInfo] Add 'main subject of' property]]
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1196', diff saved to https://phabricator.wikimedia.org/P48509 and previous config saved to /var/cache/conftool/dbconfig/20230525-064418-root.json
* 06:09 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1156', diff saved to https://phabricator.wikimedia.org/P48506 and previous config saved to /var/cache/conftool/dbconfig/20230525-055734-root.json
* 05:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 9 hosts with reason: [[phab:T337446|T337446]]
* 05:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 9 hosts with reason: [[phab:T337446|T337446]]
* 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1161', diff saved to https://phabricator.wikimedia.org/P48504 and previous config saved to /var/cache/conftool/dbconfig/20230525-055236-root.json
* 05:48 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 05:48 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 05:41 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 05:41 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:36 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:36 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2110', diff saved to https://phabricator.wikimedia.org/P48503 and previous config saved to /var/cache/conftool/dbconfig/20230525-051923-root.json
* 02:14 eileen: civicrm upgraded from {{Gerrit|b8cab6f6}} to {{Gerrit|415aa7e5}}


== 2021-11-05 ==
== 2023-05-24 ==
* 23:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:18 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:922921{{!}}[Growth] Deploy Personalized praise to pilot wikis with notifications (T334630)]] (duration: 09m 40s)
* 23:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:10 urbanecm@deploy1002: urbanecm: Backport for [[gerrit:922921{{!}}[Growth] Deploy Personalized praise to pilot wikis with notifications (T334630)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 23:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:08 urbanecm@deploy1002: Started scap: Backport for [[gerrit:922921{{!}}[Growth] Deploy Personalized praise to pilot wikis with notifications (T334630)]]
* 22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:55 samtar@deploy1002: Finished scap: Backport for [[gerrit:922855{{!}}ipInfo.hooks: Use wgRelevantUserName (T337373)]] (duration: 08m 15s)
* 22:48 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:48 samtar@deploy1002: samtar: Backport for [[gerrit:922855{{!}}ipInfo.hooks: Use wgRelevantUserName (T337373)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 22:45 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:47 samtar@deploy1002: Started scap: Backport for [[gerrit:922855{{!}}ipInfo.hooks: Use wgRelevantUserName (T337373)]]
* 22:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:922854{{!}}ipInfo.hooks: Use wgRelevantUserName (T337373)]] (duration: 08m 31s)
* 22:32 dduvall: re-rolling 1.38.0-wmf.7 to all wikis due to a better of two evil regressions UBN [[phab:T295187|T295187]] (refs [[phab:T293948|T293948]])
* 20:18 samtar@deploy1002: samtar: Backport for [[gerrit:922854{{!}}ipInfo.hooks: Use wgRelevantUserName (T337373)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 22:32 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]]
* 20:16 samtar@deploy1002: Started scap: Backport for [[gerrit:922854{{!}}ipInfo.hooks: Use wgRelevantUserName (T337373)]]
* 22:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:15 ayounsi@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
* 22:21 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0/group1 to 1.38.0-wmf.7  refs [[phab:T293948|T293948]]"
* 20:08 ayounsi@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
* 22:19 dduvall: rolling back 1.38.0-wmf.7 from group1 and group0 due to UBN [[phab:T295187|T295187]] (refs [[phab:T293948|T293948]])
* 19:49 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:17 dduvall@deploy1002: rebuilt and synchronized wikiversions files: Revert "all wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]]"
* 19:49 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbproxy1026.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:09 dduvall: rolling back 1.38.0-wmf.7 from all wikis due to UBN [[phab:T295187|T295187]] (refs [[phab:T293948|T293948]])
* 19:45 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1027.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:41 mutante: removing mediawiki font packages from labweb* (wikitech wiki)
* 19:45 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1026.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:35 XioNoX: cr2-codfw> request chassis fpc online slot 0 - [[phab:T294789|T294789]]
* 19:35 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1025.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:20 legoktm: upgrading scap to 4.0.3 everywhere ([[phab:T294966|T294966]])
* 19:35 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1024.mgmt.eqiad.wmnet with reboot policy FORCED
* 18:01 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti6001.drmrs.wmnet with OS buster
* 19:24 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 17:22 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6001.drmrs.wmnet with OS buster
* 19:24 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 16:52 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6001.drmrs.wmnet with OS buster
* 19:12 demon@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.9  refs [[phab:T330216|T330216]] (duration: 06m 00s)
* 16:30 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6001.drmrs.wmnet with OS buster
* 19:06 demon@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.9  refs [[phab:T330216|T330216]]
* 16:21 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 18:55 demon@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.10  refs [[phab:T330216|T330216]] (duration: 06m 00s)
* 16:01 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001
* 18:49 demon@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.10  refs [[phab:T330216|T330216]]
* 15:38 hnowlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 18:48 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1025.mgmt.eqiad.wmnet with reboot policy FORCED
* 15:38 hnowlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 18:48 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1024.mgmt.eqiad.wmnet with reboot policy FORCED
* 14:30 jayme: published docker-registry.discovery.wmnet/golang1.17:1.17-1
* 18:47 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1023.mgmt.eqiad.wmnet with reboot policy FORCED
* 13:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2001.codfw.wmnet
* 18:41 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1149.mgmt.eqiad.wmnet with reboot policy FORCED
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
* 18:32 jclark@cumin1001: START - Cookbook sre.hosts.provision for host an-worker1149.mgmt.eqiad.wmnet with reboot policy FORCED
* 12:50 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6001.drmrs.wmnet with OS buster
* 17:22 ejegg: civicrm upgraded from {{Gerrit|4251dfa1}} to {{Gerrit|b8cab6f6}}
* 12:22 moritzm: renamed Ganeti group of test cluster from "default" to "row_A" (following conventions in main DCs) [[phab:T286206|T286206]]
* 16:54 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@1603ecf]: Deploying [[phab:T336800|T336800]] on platform_eng Airflow instance (duration: 00m 09s)
* 12:10 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6001.drmrs.wmnet with OS buster
* 16:54 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@1603ecf]: Deploying [[phab:T336800|T336800]] on platform_eng Airflow instance
* 12:01 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2001.codfw.wmnet
* 16:05 elukey: move kafka mirror on kafka main brokers to PKI - [[phab:T337248|T337248]]
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
* 16:01 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:922852{{!}}Personalized praise: Add instrumentation (T325117)]], [[gerrit:922851{{!}}Personalized praise: Add instrumentation (T325117)]] (duration: 08m 33s)
* 11:09 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6001.drmrs.wmnet with OS buster
* 15:56 elukey: move kafka mirror on kafka jumbo brokers to PKI - [[phab:T337248|T337248]]
* 10:29 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6001.drmrs.wmnet with OS buster
* 15:54 urbanecm@deploy1002: urbanecm: Backport for [[gerrit:922852{{!}}Personalized praise: Add instrumentation (T325117)]], [[gerrit:922851{{!}}Personalized praise: Add instrumentation (T325117)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 09:53 ema: cp[4033-4036]: upgrade varnish to 6.0.8-1wm2 [[phab:T295120|T295120]]
* 15:52 urbanecm@deploy1002: Started scap: Backport for [[gerrit:922852{{!}}Personalized praise: Add instrumentation (T325117)]], [[gerrit:922851{{!}}Personalized praise: Add instrumentation (T325117)]]
* 09:43 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2002.codfw.wmnet
* 15:47 ejegg: payments-wiki upgraded from {{Gerrit|e02bc7c5}} to {{Gerrit|c2f9f8b5}}
* 09:39 mmandere@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti6001.drmrs.wmnet with OS buster
* 15:39 aqu@deploy1002: Finished deploy [analytics/refinery@24ff363] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@24ff363] (duration: 01m 35s)
* 09:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet
* 15:38 ejegg: standalone SmashPig upgraded from {{Gerrit|5460dbe2}} to {{Gerrit|db23b998}}
* 09:27 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2001.codfw.wmnet
* 15:37 aqu@deploy1002: Started deploy [analytics/refinery@24ff363] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@24ff363]
* 09:19 Amir1: Upgrade db1151 [[phab:T295026|T295026]]
* 15:37 aqu@deploy1002: Finished deploy [analytics/refinery@24ff363] (thin): Regular analytics weekly train THIN [analytics/refinery@24ff363] (duration: 00m 04s)
* 09:09 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
* 15:37 aqu@deploy1002: Started deploy [analytics/refinery@24ff363] (thin): Regular analytics weekly train THIN [analytics/refinery@24ff363]
* 09:01 ema: apt.wm.org: remove varnish 6.0.8-1wm1 from component main of buster-wikimedia, we use component/varnish6 instead
* 15:35 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1023.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:59 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti6001.drmrs.wmnet with OS buster
* 15:34 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1022.mgmt.eqiad.wmnet with reboot policy FORCED
* 08:52 moritzm: installing set kvm::machine_version for ganeti-test cluster to pc-i440fx-2.8 [[phab:T286206|T286206]]
* 15:32 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:46 Amir1: Upgrade db2142 [[phab:T295026|T295026]]
* 15:31 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:43 moritzm: installing reportbug bugfix updates from Bullseye 11.1 point release
* 15:31 aqu@deploy1002: Finished deploy [analytics/refinery@24ff363]: Regular analytics weekly train [analytics/refinery@24ff363] (duration: 06m 13s)
* 08:41 moritzm: installing tmux bugfix updates from Bullseye 11.1 point release
* 15:31 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 6 hosts with reason: Upgrade x2 masters [[phab:T295026|T295026]]
* 15:30 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 6 hosts with reason: Upgrade x2 masters [[phab:T295026|T295026]]
* 15:26 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 08:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Upgrade x2 masters [[phab:T295026|T295026]]
* 15:26 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 08:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Upgrade x2 masters [[phab:T295026|T295026]]
* 15:25 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 07:44 XioNoX: restart scs-a8-eqiad
* 15:25 aqu@deploy1002: Started deploy [analytics/refinery@24ff363]: Regular analytics weekly train [analytics/refinery@24ff363]
* 05:31 marostegui: Upgrade clouddb1016
* 15:24 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 05:31 marostegui: Upgrade clouddb1020
* 15:23 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 00:16 mutante: phab1001 - sudo systemctl start phabricator_clean_tmp_files.service  because Icinga alerted it had failed... worked fine
* 15:23 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 00:06 mutante: https://labtestwikitech.wikimedia.org - purging mediawiki font packages from backend server
* 15:22 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:22 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:21 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:18 aqu: analytics-refinery, about to deploy
* 15:09 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:30 volans@cumin2002: END (PASS) - Cookbook sre.puppetboard.restart-reboot (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>puppetboard2002.codfw.wmnet<nowiki>}</nowiki> and (A:puppetboard)
* 14:30 volans@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) puppetboard.discovery.wmnet. on all recursors
* 14:30 volans@cumin2002: START - Cookbook sre.dns.wipe-cache puppetboard.discovery.wmnet. on all recursors
* 14:29 volans@cumin2002: START - Cookbook sre.puppetboard.restart-reboot rolling restart_daemons on P<nowiki>{</nowiki>puppetboard2002.codfw.wmnet<nowiki>}</nowiki> and (A:puppetboard)
* 14:26 volans@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 14:26 volans@cumin2002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 14:19 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:922838{{!}}Enable DiscussionTools newtopictool on fiwiki (T317375)]] (duration: 12m 11s)
* 14:13 hashar@deploy1002: Finished deploy [gerrit/gerrit@2d719f3]: wm-patch-demo: initial implementation {{!}} [[phab:T332474|T332474]] (duration: 00m 07s)
* 14:13 hashar@deploy1002: Started deploy [gerrit/gerrit@2d719f3]: wm-patch-demo: initial implementation {{!}} [[phab:T332474|T332474]]
* 14:08 urbanecm@deploy1002: urbanecm and matmarex: Backport for [[gerrit:922838{{!}}Enable DiscussionTools newtopictool on fiwiki (T317375)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 14:06 urbanecm@deploy1002: Started scap: Backport for [[gerrit:922838{{!}}Enable DiscussionTools newtopictool on fiwiki (T317375)]]
* 14:06 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:922405{{!}}MultiPaneDialog: remove attribute hidden instead of class (T337256)]], [[gerrit:920238{{!}}Add maint script to opt out active users from the new topic tool (T317375)]], [[gerrit:920731{{!}}Define $maintClass in maintenance script for compatibility (T317375)]], [[gerrit:920733{{!}}NewTopicOptOutActiveUsers: Skip bot users etc. (T317375)]] (duration: 09m 21s)
* 13:58 urbanecm@deploy1002: matmarex and urbanecm and sgimeno: Backport for [[gerrit:922405{{!}}MultiPaneDialog: remove attribute hidden instead of class (T337256)]], [[gerrit:920238{{!}}Add maint script to opt out active users from the new topic tool (T317375)]], [[gerrit:920731{{!}}Define $maintClass in maintenance script for compatibility (T317375)]], [[gerrit:920733{{!}}NewTopicOptOutActiveUsers: Skip bot users etc. (T317375)]] synced t
* 13:56 urbanecm@deploy1002: Started scap: Backport for [[gerrit:922405{{!}}MultiPaneDialog: remove attribute hidden instead of class (T337256)]], [[gerrit:920238{{!}}Add maint script to opt out active users from the new topic tool (T317375)]], [[gerrit:920731{{!}}Define $maintClass in maintenance script for compatibility (T317375)]], [[gerrit:920733{{!}}NewTopicOptOutActiveUsers: Skip bot users etc. (T317375)]]
* 13:55 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:918500{{!}}[Growth] Add mediawiki.mentor_dashboard.interaction (T325117)]] (duration: 07m 06s)
* 13:48 urbanecm@deploy1002: Started scap: Backport for [[gerrit:918500{{!}}[Growth] Add mediawiki.mentor_dashboard.interaction (T325117)]]
* 13:36 samtar@deploy1002: Finished scap: Backport for [[gerrit:922810{{!}}Enable Kartographer Nearby on remaining wikis (T336834)]] (duration: 08m 04s)
* 13:29 samtar@deploy1002: samtar and wmde-fisch: Backport for [[gerrit:922810{{!}}Enable Kartographer Nearby on remaining wikis (T336834)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:28 samtar@deploy1002: Started scap: Backport for [[gerrit:922810{{!}}Enable Kartographer Nearby on remaining wikis (T336834)]]
* 13:26 samtar@deploy1002: Finished scap: Backport for [[gerrit:801792{{!}}[cirrus] Fix typo in config var]] (duration: 10m 15s)
* 13:17 samtar@deploy1002: samtar and dcausse: Backport for [[gerrit:801792{{!}}[cirrus] Fix typo in config var]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 13:16 samtar@deploy1002: Started scap: Backport for [[gerrit:801792{{!}}[cirrus] Fix typo in config var]]
* 13:14 jclark@cumin1001: START - Cookbook sre.hosts.provision for host dbproxy1022.mgmt.eqiad.wmnet with reboot policy FORCED
* 13:14 samtar@deploy1002: Finished scap: Backport for [[gerrit:920298{{!}}arclamp: switch redis server to arclamp1001 (T327277)]] (duration: 07m 53s)
* 13:07 samtar@deploy1002: herron and samtar: Backport for [[gerrit:920298{{!}}arclamp: switch redis server to arclamp1001 (T327277)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:07 xSavitar: tools.codesearch Deployed https://gerrit.wikimedia.org/r/c/labs/codesearch/+/909258 and also restarted tool instances, as the core search backend was dead.
* 13:06 samtar@deploy1002: Started scap: Backport for [[gerrit:920298{{!}}arclamp: switch redis server to arclamp1001 (T327277)]]
* 12:55 TheresNoTime: `[samtar@mwmaint1002 ~]$ mwscript findBadBlobs --wiki nowiki --revisions {{Gerrit|5227369}} --mark [[phab:T337392|T337392]]` [[phab:T337392|T337392]]
* 12:47 tgr_: running changeWikiConfig.php on Growth pilot wikis for [[phab:T337348|T337348]]
* 10:56 akosiaris@cumin1001: END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-codfw cluster: Reboot kafka nodes
* 09:42 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2448.codfw.wmnet
* 09:42 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for mw2448.codfw.wmnet
* 09:04 dcausse@deploy1002: Finished deploy [airflow-dags/search@c08e884]: search: build and use a smaller cirrus index dataset (duration: 00m 17s)
* 09:04 dcausse@deploy1002: Started deploy [airflow-dags/search@c08e884]: search: build and use a smaller cirrus index dataset
* 08:52 claime: repooling mw2248.codfw.wmnet - [[phab:T334429|T334429]]
* 08:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:51 akosiaris@cumin1001: START - Cookbook sre.kafka.reboot-workers for Kafka main-codfw cluster: Reboot kafka nodes
* 08:50 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
* 08:49 marostegui: Stop mariadb on db1154 (sanitarium) there will be lag on clouddb* hosts
* 08:36 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:921599{{!}}Migrate GrowthExperiments config to its own file (T308932)]] (duration: 07m 20s)
* 08:28 urbanecm@deploy1002: Started scap: Backport for [[gerrit:921599{{!}}Migrate GrowthExperiments config to its own file (T308932)]]
* 07:42 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
* 07:42 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
* 07:41 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
* 07:40 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
* 07:33 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:33 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:31 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:31 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:11 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:11 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:02 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 07:02 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 05:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 136106
* 05:14 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 136106
* 01:19 mutante: contint2001 - jenkins started again
* 01:10 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on contint2001.wikimedia.org with reason: maintenance
* 01:10 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on contint2001.wikimedia.org with reason: maintenance
* 00:45 mutante: short maintenance on main contint server (jenkins)
* 00:44 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on contint2001.wikimedia.org with reason: maintenance
* 00:44 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on contint2001.wikimedia.org with reason: maintenance
* 00:29 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on contint2001.wikimedia.org with reason: maintenance
* 00:29 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on contint2001.wikimedia.org with reason: maintenance
* 00:16 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on contint2001.wikimedia.org with reason: maintenance
* 00:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on contint2001.wikimedia.org with reason: maintenance
* 00:00 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on contint2002.wikimedia.org with reason: maintenance
* 00:00 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on contint2002.wikimedia.org with reason: maintenance
* 00:00 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on contint1002.wikimedia.org with reason: maintenance
* 00:00 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on contint1002.wikimedia.org with reason: maintenance


== 2021-11-04 ==
== 2023-05-23 ==
* 23:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:52 mutante: releases1002 - jenkins service running again, this is the active host behind releases-jenkins.wikimedia.org - maintenance for releases* done
* 23:51 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: XWD timeout testing [[phab:T293568|T293568]] (duration: 00m 54s)
* 23:44 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases1002.eqiad.wmnet with reason: maintenance
* 23:49 tstarling@deploy1002: Synchronized src/XWikimediaDebug.php: XWD timeout testing (duration: 00m 54s)
* 23:44 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on releases1002.eqiad.wmnet with reason: maintenance
* 23:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:41 mutante: releases1002 (releases.wikimedia.org) stopping jenkins for maintenance
* 23:44 cjming: end of UTC late backport & config window
* 23:30 mutante: contint*, releases* - maintenance - changing UID of jenkins user - jenkins will be stopped for a little bit, releases-jenkins is first though - [[phab:T324659|T324659]]
* 23:44 cjming@deploy1002: Synchronized wmf-config: Config
* 22:00 eileen: civicrm upgraded from {{Gerrit|11538e23}} to {{Gerrit|4251dfa1}}
* 21:26 ejegg: payments-wiki upgraded from {{Gerrit|a7567c6a}} to {{Gerrit|e02bc7c5}}
* 21:06 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 21:06 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d


== 2021-11-03 ==
== 2023-05-22 ==
* 23:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:29 eileen: civicrm upgraded from {{Gerrit|cc9593d0}} to {{Gerrit|7eae24d5}}
* 23:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:16 zabe@deploy1002: Finished scap: Backport for [[gerrit:921614{{!}}Enable VE on new wikis]] (duration: 06m 58s)
* 23:22 legoktm: reverted canaries back to scap 4.0.2
* 23:11 zabe@deploy1002: zabe: Backport for [[gerrit:921614{{!}}Enable VE on new wikis]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 23:20 legoktm: uploaded scap 4.0.3-1+really4.0.2 to apt.wm.o for buster/stretch
* 23:09 zabe@deploy1002: Started scap: Backport for [[gerrit:921614{{!}}Enable VE on new wikis]]
* 23:02 legoktm@deploy1002: Finished deploy [restbase/deploy@664a2f8]: (no justification provided) (duration: 00m 50s)
* 21:38 sbassett: Deployed security mitigations for [[phab:T333140|T333140]] and [[phab:T336027|T336027]]
* 23:01 legoktm@deploy1002: Started deploy [restbase/deploy@664a2f8]: (no justification provided)
* 20:55 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts labstore1004.eqiad.wmnet
* 22:48 ppchelko@deploy1002: Finished deploy [restbase/deploy@664a2f8]: Add new wikis [[phab:T292422|T292422]] [[phab:T294587|T294587]] [[phab:T294588|T294588]] (duration: 00m 10s)
* 20:55 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:48 ppchelko@deploy1002: Started deploy [restbase/deploy@664a2f8]: Add new wikis [[phab:T292422|T292422]] [[phab:T294587|T294587]] [[phab:T294588|T294588]]
* 20:54 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: labstore1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
* 22:47 legoktm: upgraded scap on A:restbase ([[phab:T294936|T294936]])
* 20:53 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: labstore1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
* 22:38 legoktm: upgrading scap on canaries ([[phab:T294966|T294966]])
* 20:51 andrew@cumin1001: START - Cookbook sre.dns.netbox
* 22:34 legoktm: upgraded apache2 on lists1001
* 20:45 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts labstore1004.eqiad.wmnet
* 22:32 legoktm: uploaded scap 4.0.3 to apt.wm.o for buster and stretch ([[phab:T294966|T294966]])
* 20:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts labstore1005.eqiad.wmnet
* 22:24 twentyafterfour: restarted php7.3-fpm on phab1001
* 20:44 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:24 twentyafterfour: restarting phabricator to apply updates.
* 20:44 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: labstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
* 22:12 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=wcqs2002.codfw.wmnet
* 20:43 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: labstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
* 22:12 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=wcqs2001.codfw.wmnet
* 20:40 andrew@cumin1001: START - Cookbook sre.dns.netbox
* 21:56 ryankemper: [[phab:T294961|T294961]] [WCQS] Forcing recheck of `PyBal IPVS diff check` and `PyBal backends health check`
* 20:33 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts labstore1005.eqiad.wmnet
* 21:53 ryankemper: [[phab:T294961|T294961]] [WCQS] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/736564 and successfully ran `ryankemper@cumin1001:~$ sudo cumin 'A:icinga or A:dns-auth' run-puppet-agent`
* 20:27 TheresNoTime: close UTC late backport window
* 21:47 ryankemper: [[phab:T294961|T294961]] [WCQS] DNS changes rolled out, proceeding to the `lvs_setup` step: https://gerrit.wikimedia.org/r/c/operations/puppet/+/736564
* 20:24 samtar@deploy1002: Finished scap: Backport for [[gerrit:921765{{!}}[kaawiki] Enable SandboxLink extension (T336648)]] (duration: 07m 47s)
* 21:45 ryankemper: [[phab:T294961|T294961]] [WCQS] Merged https://gerrit.wikimedia.org/r/c/operations/dns/+/736585, running `ryankemper@authdns1001:~$ sudo -i authdns-update`
* 20:17 samtar@deploy1002: samtar and superpes: Backport for [[gerrit:921765{{!}}[kaawiki] Enable SandboxLink extension (T336648)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 21:38 legoktm: upgrading/restarting apache2 on A:all-mw-eqiad
* 20:16 samtar@deploy1002: Started scap: Backport for [[gerrit:921765{{!}}[kaawiki] Enable SandboxLink extension (T336648)]]
* 21:26 legoktm: upgrading/restarting apache2 on A:all-mw-codfw
* 20:14 samtar@deploy1002: Finished scap: Backport for [[gerrit:921764{{!}}[ruwiki] Add 'abusefilter log/view private' flags to ArbCom (T336625)]] (duration: 08m 22s)
* 21:12 legoktm: upgrading PHP 7.2 on labweb, deployment-servers
* 20:11 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs[2010-2011].codfw.wmnet
* 21:00 legoktm: upgrading PHP 7.2 on A:snapshot
* 20:09 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs[2010-2011].codfw.wmnet
* 20:55 legoktm: upgrading PHP 7.2 on A:parsoid
* 20:08 samtar@deploy1002: superpes and samtar: Backport for [[gerrit:921764{{!}}[ruwiki] Add 'abusefilter log/view private' flags to ArbCom (T336625)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:06 samtar@deploy1002: Started scap: Backport for [[gerrit:921764{{!}}[ruwiki] Add 'abusefilter log/view private' flags to ArbCom (T336625)]]
* 20:04 eileen: civicrm revision changed from {{Gerrit|93caef68ef}} to {{Gerrit|ac6f333db6}}, config revision is {{Gerrit|d3bb9999e7}}
* 19:22 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 20:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:22 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:52 dduvall@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]] (duration: 01m 03s)
* 19:20 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:51 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]]
* 19:20 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:18 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:43 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=wcqs2003.codfw.wmnet
* 19:18 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:18 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:35 mutante: depooled wcqs2003 (pooled=inactive) because Icinga alerts that servers are down but pooled. not in production yet but issues ([[phab:T294961|T294961]])
* 19:18 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 19:33 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=wcqs2003.codfw.wmnet
* 17:04 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@5ee7a62]: (no justification provided) (duration: 00m 17s)
* 19:33 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=wcqs2003.codfw.wmnet
* 17:03 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@5ee7a62]: (no justification provided)
* 19:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:58 XioNoX: push mgmt_junos to all L2 switches
* 19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:35 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs2009.codfw.wmnet
* 19:26 mmandere: pool cp4035.ulsfo.wmnet - [[phab:T290694|T290694]]
* 16:35 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2009.codfw.wmnet
* 19:19 dduvall: 1.38.0-wmf.7 now on group0. no new errors. leaving ~ 30 minutes before promoting group1 ([[phab:T293948|T293948]])
* 15:57 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 19:18 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:56 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs2009.codfw.wmnet
* 19:15 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]]
* 15:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:26 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox
* 19:10 tgr: UTC evening deploys done
* 15:25 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox-canary
* 19:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:25 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox-canary
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "New debmonitor VMs - jmm@cumin2002 - [[phab:T241049|T241049]]"
* 18:59 razzi@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - razzi@cumin1001
* 15:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "New debmonitor VMs - jmm@cumin2002 - [[phab:T241049|T241049]]"
* 18:55 razzi@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - razzi@cumin1001
* 14:32 bking@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:51 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4035.ulsfo.wmnet with OS buster
* 14:31 bking@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 18:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:10 bking@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:10 bking@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 18:40 legoktm: re-enabling puppet on lists1001
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host debmonitor2003.codfw.wmnet with OS bookworm
* 18:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on debmonitor2003.codfw.wmnet with reason: host reimage
* 18:34 urbanecm: Purge https://en.wikipedia.org/.well-known/assetlinks.json, https://www.wikipedia.org/.well-known/assetlinks.json and https://wikipedia.org/.well-known/assetlinks.json ([[phab:T294776|T294776]])
* 12:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on debmonitor2003.codfw.wmnet with reason: host reimage
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host debmonitor2003.codfw.wmnet with OS bookworm
* 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host debmonitor1003.eqiad.wmnet with OS bookworm
* 18:24 volans: rebooting ganeti-test2002 with fixed /etc/network/interfaces
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on debmonitor1003.eqiad.wmnet with reason: host reimage
* 18:22 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'recommendation-api' for release 'production' .
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2124', diff saved to https://phabricator.wikimedia.org/P48456 and previous config saved to /var/cache/conftool/dbconfig/20230522-115936-root.json
* 18:22 urbanecm@deploy1002: Synchronized docroot/wikipedia.org/: {{Gerrit|2331d061b95ba3fc4de8844008fac93ce18f9063}}: Add Android site association file ([[phab:T294776|T294776]]) (duration: 01m 02s)
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on debmonitor1003.eqiad.wmnet with reason: host reimage
* 18:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host debmonitor1003.eqiad.wmnet with OS bookworm
* 18:18 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 10:17 topranks: Un-draining transport circuit from eqsin to codfw, moving traffic back to default path [[phab:T337220|T337220]]
* 18:17 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 18:15 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 10:06 hashar@deploy1002: Finished scap: Backport for [[gerrit:921558{{!}}Revert "[WikibaseMediaInfo] Add 'main subject of' property"]] (duration: 37m 00s)
* 18:15 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:710126{{!}}Clean up temporary variable wgMathUseRestBase (T274436)]] (duration: 01m 02s)
* 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host debmonitor2003.codfw.wmnet
* 18:15 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM debmonitor2003.codfw.wmnet - jmm@cumin2002"
* 18:15 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' .
* 10:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM debmonitor2003.codfw.wmnet - jmm@cumin2002"
* 18:13 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) debmonitor2003.codfw.wmnet on all recursors
* 18:12 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' .
* 10:04 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache debmonitor2003.codfw.wmnet on all recursors
* 18:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:09 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:710126{{!}}Clean up temporary variable wgMathUseRestBase (T274436)]] (duration: 01m 03s)
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM debmonitor2003.codfw.wmnet - jmm@cumin2002"
* 18:09 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM debmonitor2003.codfw.wmnet - jmm@cumin2002"
* 18:08 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 10:02 moritzm: installing updated usb.ids packages for Bullseye
* 18:08 Amir1: ran set session sql_log_bin=0; RENAME TABLE wb_changes_dispatch TO T294121_DROP_wb_changes_dispatch; on db1111 ([[phab:T294121|T294121]])
* 10:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 18:07 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 10:01 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host debmonitor2003.codfw.wmnet
* 18:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host debmonitor1003.eqiad.wmnet
* 18:06 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:736032{{!}}Remove hook set for incident response in 2020]] (duration: 01m 03s)
* 09:51 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM debmonitor1003.eqiad.wmnet - jmm@cumin2002"
* 18:04 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 09:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM debmonitor1003.eqiad.wmnet - jmm@cumin2002"
* 18:03 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) debmonitor1003.eqiad.wmnet on all recursors
* 18:02 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'sessionstore' for release 'staging' .
* 09:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache debmonitor1003.eqiad.wmnet on all recursors
* 17:50 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host cp4035.ulsfo.wmnet with OS buster
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:49 vgutierrez: update codfw cp instances to ATS 8.0.8-1wm5 - [[phab:T294897|T294897]]
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM debmonitor1003.eqiad.wmnet - jmm@cumin2002"
* 17:48 mmandere: depool cp4035.ulsfo.wmnet - [[phab:T290694|T290694]]
* 09:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM debmonitor1003.eqiad.wmnet - jmm@cumin2002"
* 17:47 topranks: adding BGP peering session to "Liquid Telecommunications" AS30844 on cr2-esams (AMS-IX)
* 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 17:46 legoktm: upgrading PHP 7.2 on A:all-mw-eqiad
* 09:43 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host debmonitor1003.eqiad.wmnet
* 17:33 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 09:39 hashar@deploy1002: hashar: Backport for [[gerrit:921558{{!}}Revert "[WikibaseMediaInfo] Add 'main subject of' property"]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 17:32 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 09:29 hashar@deploy1002: Started scap: Backport for [[gerrit:921558{{!}}Revert "[WikibaseMediaInfo] Add 'main subject of' property"]]
* 17:31 topranks: adding BGP peering session to "P Foundation" / AS399728 on cr2-eqiad [Equinix Ashburn IXP]
* 08:46 marostegui: Stop mysql on db2160 (haproxy irc alerts will be generated)
* 17:30 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 08:28 elukey: drain Arelion link between cr1-codfw and cr3-eqsin to mitigate packet loss eqiad <-> eqsin
* 17:24 legoktm: upgrading PHP 7.2 on A:all-mw-codfw
* 08:22 moritzm: installing systemd security updates
* 17:06 mmandere: pool cp4033.ulsfo.wmnet - [[phab:T290694|T290694]]
* 08:17 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48455 and previous config saved to /var/cache/conftool/dbconfig/20230522-081724-root.json
* 17:05 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48454 and previous config saved to /var/cache/conftool/dbconfig/20230522-080219-root.json
* 17:02 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 07:59 elukey: restart purged on cp5017 as test to clear out consumer group timeouts and rejoin events
* 17:01 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48453 and previous config saved to /var/cache/conftool/dbconfig/20230522-075613-root.json
* 16:59 razzi@deploy1002: Finished deploy [analytics/superset/deploy@5b8de4c]: Upgrade superset to 1.3.1 (duration: 00m 31s)
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48452 and previous config saved to /var/cache/conftool/dbconfig/20230522-074715-root.json
* 16:58 razzi@deploy1002: Started deploy [analytics/superset/deploy@5b8de4c]: Upgrade superset to 1.3.1
* 07:41 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48451 and previous config saved to /var/cache/conftool/dbconfig/20230522-074109-root.json
* 16:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:37 mvernon@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 16:52 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4033.ulsfo.wmnet with OS buster
* 07:32 mvernon@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 16:31 hnowlan: installing wikidiff2-1.13.0-1 to A:mw-jobrunner
* 07:32 mvernon@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 16:27 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48450 and previous config saved to /var/cache/conftool/dbconfig/20230522-073210-root.json
* 16:23 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 07:28 mvernon@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 16:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:26 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48449 and previous config saved to /var/cache/conftool/dbconfig/20230522-072604-root.json
* 16:17 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48448 and previous config saved to /var/cache/conftool/dbconfig/20230522-071705-root.json
* 16:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48447 and previous config saved to /var/cache/conftool/dbconfig/20230522-071333-root.json
* 16:04 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host cp4033.ulsfo.wmnet with OS buster
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48446 and previous config saved to /var/cache/conftool/dbconfig/20230522-071326-root.json
* 15:59 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48445 and previous config saved to /var/cache/conftool/dbconfig/20230522-071319-root.json
* 15:58 mmandere: depool cp4033.ulsfo.wmnet - [[phab:T290694|T290694]]
* 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48444 and previous config saved to /var/cache/conftool/dbconfig/20230522-071059-root.json
* 15:57 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48443 and previous config saved to /var/cache/conftool/dbconfig/20230522-070200-root.json
* 15:51 hnowlan: rolling restart-php7.2-fpm on A:mw-api-codfw to pick up wikidiff2 upgrade
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48442 and previous config saved to /var/cache/conftool/dbconfig/20230522-065828-root.json
* 15:47 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' .
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48441 and previous config saved to /var/cache/conftool/dbconfig/20230522-065822-root.json
* 15:22 ppchelko@deploy1002: Finished deploy [restbase/deploy@664a2f8]: Add new wikis [[phab:T292422|T292422]] [[phab:T294587|T294587]] [[phab:T294588|T294588]] (duration: 00m 36s)
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48440 and previous config saved to /var/cache/conftool/dbconfig/20230522-065815-root.json
* 15:22 ppchelko@deploy1002: Started deploy [restbase/deploy@664a2f8]: Add new wikis [[phab:T292422|T292422]] [[phab:T294587|T294587]] [[phab:T294588|T294588]]
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48439 and previous config saved to /var/cache/conftool/dbconfig/20230522-065555-root.json
* 15:21 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48438 and previous config saved to /var/cache/conftool/dbconfig/20230522-064656-root.json
* 15:21 ppchelko@deploy1002: Started deploy [restbase/deploy@664a2f8]: Add new wikis [[phab:T292422|T292422]] [[phab:T294587|T294587]] [[phab:T294588|T294588]]
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 [[phab:T337206|T337206]]', diff saved to https://phabricator.wikimedia.org/P48437 and previous config saved to /var/cache/conftool/dbconfig/20230522-064541-root.json
* 15:21 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts bast2002
* 15:11 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:10 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 06:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48436 and previous config saved to /var/cache/conftool/dbconfig/20230522-064323-root.json
* 15:09 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48435 and previous config saved to /var/cache/conftool/dbconfig/20230522-064317-root.json
* 15:08 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48434 and previous config saved to /var/cache/conftool/dbconfig/20230522-064310-root.json
* 15:06 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1121.eqiad.wmnet
* 15:06 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1121.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
* 14:54 elukey@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48433 and previous config saved to /var/cache/conftool/dbconfig/20230522-064050-root.json
* 14:40 moritzm: installing elfutils security updates on stretch
* 06:40 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1121.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
* 14:37 elukey@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 06:38 marostegui@cumin1001: START - Cookbook sre.dns.netbox
* 14:37 elukey@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 06:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast2002
* 14:33 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' .
* 06:33 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1121.eqiad.wmnet
* 14:32 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'es2023 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48432 and previous config saved to /var/cache/conftool/dbconfig/20230522-063151-root.json
* 14:31 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48431 and previous config saved to /var/cache/conftool/dbconfig/20230522-062818-root.json
* 14:31 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48430 and previous config saved to /var/cache/conftool/dbconfig/20230522-062812-root.json
* 14:30 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48429 and previous config saved to /var/cache/conftool/dbconfig/20230522-062805-root.json
* 14:21 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 3%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48428 and previous config saved to /var/cache/conftool/dbconfig/20230522-062545-root.json
* 14:20 hnowlan: rolling restart-php7.2-fpm on A:mw-eqiad and A:mw-api-eqiad
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Give weight to es2024', diff saved to https://phabricator.wikimedia.org/P48427 and previous config saved to /var/cache/conftool/dbconfig/20230522-061947-marostegui.json
* 14:17 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'apertium' for release 'staging' .
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2023 [[phab:T337204|T337204]]', diff saved to https://phabricator.wikimedia.org/P48426 and previous config saved to /var/cache/conftool/dbconfig/20230522-061925-root.json
* 14:16 hnowlan: deploying wikidiff2-1.13.0-1 to A:mw-eqiad and A:mw-api-eqiad
* 06:17 marostegui: Starting es5 codfw failover from es2023 to es2024 - [[phab:T337204|T337204]]
* 14:13 moritzm: installing remaining tiff security updates for buster
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 [[phab:T337204|T337204]]
* 14:10 moritzm: initialising ganeti-test01.svc.codfw.wmnet cluster on ganeti-test2001 [[phab:T286206|T286206]]
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Set es2024 with weight 0 [[phab:T337204|T337204]]', diff saved to https://phabricator.wikimedia.org/P48425 and previous config saved to /var/cache/conftool/dbconfig/20230522-061524-root.json
* 14:07 XioNoX: move cr2-codfw access switches link to working linecard - [[phab:T289241|T289241]]
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 [[phab:T337204|T337204]]
* 14:04 vgutierrez: update eqsin and ulsfo cp instances to ATS 8.0.8-1wm5 - [[phab:T294897|T294897]]
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48424 and previous config saved to /var/cache/conftool/dbconfig/20230522-061314-root.json
* 13:38 jelto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48423 and previous config saved to /var/cache/conftool/dbconfig/20230522-061307-root.json
* 13:34 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp403[3456].*,service=ats-be
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48422 and previous config saved to /var/cache/conftool/dbconfig/20230522-061300-root.json
* 13:34 bblack: cp403[3456] - depool ats-be service (upcoming re-reimage)
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48421 and previous config saved to /var/cache/conftool/dbconfig/20230522-061040-root.json
* 12:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es2021', diff saved to https://phabricator.wikimedia.org/P48420 and previous config saved to /var/cache/conftool/dbconfig/20230522-061033-marostegui.json
* 12:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48419 and previous config saved to /var/cache/conftool/dbconfig/20230522-055809-root.json
* 12:21 vgutierrez: update trafficserver on cp4027 to 8.0.8-1wm5 - [[phab:T294897|T294897]]
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48418 and previous config saved to /var/cache/conftool/dbconfig/20230522-055803-root.json
* 12:20 vgutierrez: update trafficserver on cp4021 to 8.0.8-1wm5 - [[phab:T294897|T294897]]
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48417 and previous config saved to /var/cache/conftool/dbconfig/20230522-055756-root.json
* 12:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'es2021 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48416 and previous config saved to /var/cache/conftool/dbconfig/20230522-055120-root.json
* 12:18 vgutierrez: upload trafficserver 8.0.8-1wm5 to apt.wm.org (buster) - [[phab:T294897|T294897]]
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48415 and previous config saved to /var/cache/conftool/dbconfig/20230522-054304-root.json
* 12:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48414 and previous config saved to /var/cache/conftool/dbconfig/20230522-054258-root.json
* 12:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9ca753bf4b7afea41c29225d4f32e3ba01bf7c30}}: Revert "Adjust AF config for ukwiki" ([[phab:T272330|T272330]]) (duration: 01m 03s)
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48413 and previous config saved to /var/cache/conftool/dbconfig/20230522-054251-root.json
* 12:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|667ef0b6e9e8d1d70061cc904ce49e7632300b75}}: foundationwiki: Increase AF throttle requirements (duration: 01m 13s)
* 05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2021 [[phab:T337203|T337203]]', diff saved to https://phabricator.wikimedia.org/P48412 and previous config saved to /var/cache/conftool/dbconfig/20230522-053705-marostegui.json
* 11:58 hnowlan: rolling restart-php7.2-fpm on A:mw-codfw and A:mw-api-codfw
* 05:35 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2020 to es4 codfw primary [[phab:T337203|T337203]]', diff saved to https://phabricator.wikimedia.org/P48411 and previous config saved to /var/cache/conftool/dbconfig/20230522-053554-marostegui.json
* 11:56 hnowlan: deploying wikidiff2-1.13.0-1 to A:mw-codfw and A:mw-api-codfw
* 05:34 marostegui: Starting es4 codfw failover from es2021 to es2020 - [[phab:T337203|T337203]]
* 11:37 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Set es2020 with weight 0 [[phab:T337203|T337203]]', diff saved to https://phabricator.wikimedia.org/P48410 and previous config saved to /var/cache/conftool/dbconfig/20230522-052938-root.json
* 11:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 [[phab:T337203|T337203]]
* 11:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7fdf3f5476d9d8ab45eb793090613e328a91bb7a}}: Wikisource: allow copy-uploads from Commons ([[phab:T294824|T294824]]) (duration: 01m 04s)
* 05:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 [[phab:T337203|T337203]]
* 11:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48409 and previous config saved to /var/cache/conftool/dbconfig/20230522-052800-root.json
* 09:23 XioNoX: re-enable eqiad Equinix IXP peerings - [[phab:T290877|T290877]]
* 05:27 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48408 and previous config saved to /var/cache/conftool/dbconfig/20230522-052753-root.json
* 08:55 XioNoX: Disable eqiad Equinix IXP peerings - [[phab:T290877|T290877]]
* 05:27 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48407 and previous config saved to /var/cache/conftool/dbconfig/20230522-052746-root.json
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager replicas from s6 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17660 and previous config saved to /var/cache/conftool/dbconfig/20211103-075801-marostegui.json
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1029, es1030, es1031 for kernel reboots', diff saved to https://phabricator.wikimedia.org/P48406 and previous config saved to /var/cache/conftool/dbconfig/20230522-051957-root.json
* 07:58 elukey@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Failover es1, es2 and es3 masters for kernel reboots', diff saved to https://phabricator.wikimedia.org/P48405 and previous config saved to /var/cache/conftool/dbconfig/20230522-051723-marostegui.json
* 07:57 elukey@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 07:50 marostegui: Drop oauth2_access_tokens oauth_accepted_consumer oauth_registered_consumer from foundationwiki [[phab:T294595|T294595]]
* 06:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS buster
* 06:39 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:35 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:35 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|34888b034e54ec35ca3b6745336fc0881e50c9b0}}: Growth IP research survey: Fix coverage ([[phab:T294568|T294568]]) (duration: 01m 04s)
* 06:13 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS buster
* 06:10 marostegui: Stop replication on db1163 [[phab:T290865|T290865]]
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1163 until it's reimaged to buster [[phab:T293964|T293964]]', diff saved to https://phabricator.wikimedia.org/P17659 and previous config saved to /var/cache/conftool/dbconfig/20211103-060644-root.json
* 06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1118 to s1 primary and set section read-write [[phab:T293964|T293964]]', diff saved to https://phabricator.wikimedia.org/P17658 and previous config saved to /var/cache/conftool/dbconfig/20211103-060201-root.json
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T293964|T293964]]', diff saved to https://phabricator.wikimedia.org/P17657 and previous config saved to /var/cache/conftool/dbconfig/20211103-060114-root.json
* 06:00 marostegui: Starting s1 eqiad failover from db1163 to db1118 - [[phab:T293964|T293964]]
* 05:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s1 [[phab:T293964|T293964]]
* 05:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 32 hosts with reason: Primary switchover s1 [[phab:T293964|T293964]]
* 02:22 milimetric@deploy1002: Finished deploy [analytics/refinery@cf6095c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@cf6095c] (duration: 05m 36s)
* 02:16 milimetric@deploy1002: Started deploy [analytics/refinery@cf6095c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@cf6095c]
* 02:16 milimetric@deploy1002: Finished deploy [analytics/refinery@cf6095c] (thin): Regular analytics weekly train THIN [analytics/refinery@cf6095c] (duration: 00m 07s)
* 02:16 milimetric@deploy1002: Started deploy [analytics/refinery@cf6095c] (thin): Regular analytics weekly train THIN [analytics/refinery@cf6095c]
* 02:15 milimetric@deploy1002: Finished deploy [analytics/refinery@cf6095c]: Regular analytics weekly train [analytics/refinery@cf6095c] (duration: 22m 30s)
* 01:53 milimetric@deploy1002: Started deploy [analytics/refinery@cf6095c]: Regular analytics weekly train [analytics/refinery@cf6095c]


== 2021-11-02 ==
== 2023-05-21 ==
* 23:47 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:45 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 23:46 tgr: UTC late deploys done
* 07:44 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 23:45 tgr@deploy1002: Synchronized wmf-config: Config: Use page id for GrowthExperiments image recommendations, except for testwiki ([[gerrit:736314{{!}}736314]] [[gerrit:736317{{!}}736317]]) ([[phab:T290949|T290949]] [[phab:T292154|T292154]]) (duration: 01m 03s)
* 07:43 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 23:44 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:42 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 23:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:41 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 23:34 tgr@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:735094{{!}}Use url-downloader proxy for GrowthExperiments (T290949)]] (duration: 01m 14s)
* 07:40 jelto@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 23:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:14 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-db1002.eqiad.wmnet with OS buster
* 21:50 robh@cumin1001: START - Cookbook sre.hosts.reimage for host an-db1002.eqiad.wmnet with OS buster
* 21:32 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-db1002.eqiad.wmnet with OS buster
* 21:03 robh@cumin1001: START - Cookbook sre.hosts.reimage for host an-db1002.eqiad.wmnet with OS buster
* 20:52 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-db1001.eqiad.wmnet with OS buster
* 20:28 robh@cumin1001: START - Cookbook sre.hosts.reimage for host an-db1001.eqiad.wmnet with OS buster
* 20:01 thcipriani: 1.38.0-wmf.7 on testwikis, leaving it there for today for US holiday ([[phab:T293948|T293948]])
* 19:58 thcipriani@deploy1002: Pruned MediaWiki: 1.38.0-wmf.5 (duration: 04m 08s)
* 19:53 thcipriani@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]] (duration: 50m 13s)
* 19:50 moritzm: imported ganeti 2.16.0-1~bpo9+1+wmf1 to component/ganeti216 for stretch-wikimedia (with additional cherrypicked patches for compat with KVM 3.1) [[phab:T284811|T284811]]
* 19:47 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:35 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-db1002.eqiad.wmnet
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:08 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-db1001.eqiad.wmnet with OS buster
* 19:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:02 thcipriani@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.7  refs [[phab:T293948|T293948]]
* 18:46 thcipriani: starting to stage train for 1.38.0-wmf.7 ([[phab:T293948|T293948]])
* 18:33 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts an-db1002.eqiad.wmnet
* 18:32 robh@cumin1001: START - Cookbook sre.hosts.reimage for host an-db1001.eqiad.wmnet with OS buster
* 18:23 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:18 robh@cumin1001: START - Cookbook sre.dns.netbox
* 18:15 robh@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-db1001.eqiad.wmnet
* 18:14 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:11 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:01 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:59 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/DiscussionTools/modules/dt-ve/dt.ui.UsernameCompletionAction.js: {{Gerrit|494af124b95e2eabff94fde79aed6b6f6f81feab}}: UsernameCompletion: Filter out users with indefinite sitewide blocks from API results ([[phab:T294783|T294783]]) (duration: 00m 55s)
* 17:58 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:57 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts an-db1001.eqiad.wmnet
* 17:48 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:45 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:44 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|339be07a35de1fa3846b845376695d68a9d743fd}}: foundationwiki: Set wgCentralAuthCookies to true ([[phab:T205347|T205347]]) (duration: 00m 54s)
* 17:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:33 moritzm: installing opencv security updates
* 17:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:24 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e3227703a662ecda744bb159f39b128ed289c76d}}: Revert "Revert "foundationwiki: Enable Translate extension"" ([[phab:T205349|T205349]]) (duration: 00m 55s)
* 17:22 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/includes/cache/LinkCache.php: {{Gerrit|1e78aeabd682537d8c284559e1356d15c62da810}}: LinkCache: Try invalidating cache before throwing ([[phab:T205349|T205349]]) (duration: 00m 56s)
* 17:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:38 mmandere: pool cp4036.ulsfo.wmnet - [[phab:T290694|T290694]]
* 16:30 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4036.ulsfo.wmnet with OS buster
* 15:41 mmandere: pool cp4034.ulsfo.wmnet - [[phab:T290694|T290694]]
* 15:38 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host cp4036.ulsfo.wmnet with OS buster
* 15:32 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4034.ulsfo.wmnet with OS buster
* 15:12 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:11 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:07 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 14:34 mmandere: pool cp4035.ulsfo.wmnet - [[phab:T290694|T290694]]
* 14:31 mmandere@cumin1001: START - Cookbook sre.hosts.reimage for host cp4034.ulsfo.wmnet with OS buster
* 14:24 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4035.ulsfo.wmnet with OS buster
* 14:19 hnowlan: roll-restart restart-php7.2-fpm on A:mw-app-canary and A:mw-api-canary
* 14:15 hnowlan: debdeploying wikidiff2-1.13.0-1 to A:mw-app-canary and A:mw-api-canary for [[phab:T285857|T285857]]
* 14:05 hashar@deploy1002: Finished deploy [integration/docroot@4e4d14a]: Add landing page for code metrics (duration: 00m 09s)
* 14:05 hashar@deploy1002: Started deploy [integration/docroot@4e4d14a]: Add landing page for code metrics
* 13:45 mmandere: pool cp4033.ulsfo.wmnet - [[phab:T290694|T290694]]
* 11:26 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1002.eqiad.wmnet
* 11:06 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1003.eqiad.wmnet
* 11:00 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1003.eqiad.wmnet
* 11:00 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1004.eqiad.wmnet
* 10:57 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1008.eqiad.wmnet
* 10:54 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1004.eqiad.wmnet
* 10:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2002-dev.codfw.wmnet
* 10:48 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw2002-dev.codfw.wmnet
* 10:48 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2001-dev.codfw.wmnet
* 10:46 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host stat1008.eqiad.wmnet
* 10:46 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1005.eqiad.wmnet
* 10:41 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw2001-dev.codfw.wmnet
* 10:40 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudgw2001-dev.codfw.wmnet
* 10:40 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw2001-dev.codfw.wmnet
* 10:36 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host stat1005.eqiad.wmnet
* 10:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:30 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dbff998f40e438556345185408e495f429440a1b}}: dewiki: Set wgGEHomepageDefaultVariant to control ([[phab:T294712|T294712]]) (duration: 00m 55s)
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1118 with weight 0 [[phab:T293964|T293964]]', diff saved to https://phabricator.wikimedia.org/P17652 and previous config saved to /var/cache/conftool/dbconfig/20211102-100348-root.json
* 09:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:40 legoktm: restarted apache2 on lists1001
* 09:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b2594347041ae61ef88661bc0d5aa57fc501540d}}: QuickSurveys: Show Growth IP editors survey to 0.1% of users ([[phab:T294568|T294568]]) (duration: 00m 57s)
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges replicas from s6 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17651 and previous config saved to /var/cache/conftool/dbconfig/20211102-090306-marostegui.json
* 08:29 moritzm: installing sdl2 security updates
* 07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked replicas from s6 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17650 and previous config saved to /var/cache/conftool/dbconfig/20211102-072320-marostegui.json
* 07:13 elukey: `apt-get purge dkms` (rc state) on stat100[5,8]
* 06:45 marostegui: Rename oauth2_access_tokens oauth_accepted_consumer oauth_registered_consumer tables on db1123 [[phab:T294595|T294595]]
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:56 cstone: civicrm revision changed from {{Gerrit|403be9ce05}} to {{Gerrit|93caef68ef}}
* 01:21 ejegg: updated SmashPig standalone deploy from {{Gerrit|dd3a81c7c2}} to {{Gerrit|be68299b92}}
* 01:18 ejegg: updated payments-wiki from {{Gerrit|5b9fdd0fe1}} to {{Gerrit|73de4731bd}}
* 00:45 mutante: upgraded php-fpm on cloudweb2001-dev - https://labtestwikitech.wikimedia.org/wiki/Main_Page
* 00:24 mutante: parsoid-canary (scandium, wtp1025, wtp1026, parse2001, parse2002) - upgrading php-fpm and php-* packages
* 00:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:07 mutante: scandium - installing package upgrades, incl. apache, php7.2- packages
* 00:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:02 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add event stream config for discussiontools ([[phab:T286076|T286076]]) (duration: 00m 55s)
* 00:00 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder for kswiki ([[phab:T294632|T294632]]) (duration: 00m 55s)
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2023-05-20 ==
* 18:25 effie: restart varnish cp3061
* 16:39 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=parse1018.eqiad.wmnet
* 15:17 hoo@deploy1002: Finished scap: Backport for [[gerrit:921549{{!}}Remove linkitem dependency on jquery.wikibase.wbtooltip (T337081)]] (duration: 08m 47s)
* 15:10 hoo@deploy1002: hoo: Backport for [[gerrit:921549{{!}}Remove linkitem dependency on jquery.wikibase.wbtooltip (T337081)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 15:08 hoo@deploy1002: Started scap: Backport for [[gerrit:921549{{!}}Remove linkitem dependency on jquery.wikibase.wbtooltip (T337081)]]
* 14:41 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=parse1018.eqiad.wmnet
* 09:08 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 volans@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Added records for the new private.codfw.wikimedia.cloud domain - volans@cumin1001"
* 09:07 volans@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Added records for the new private.codfw.wikimedia.cloud domain - volans@cumin1001"
* 09:00 volans@cumin1001: START - Cookbook sre.dns.netbox

== 2021-11-01 ==
* 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:30 urbanecm: Deploy a security patch for [[phab:T290808|T290808]]
* 21:28 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|8f5008d9a043c96cd1dba18bdb38e168b01d63d0}}: votewiki: Grant election admins securepoll-view-voter-pii ([[phab:T290808|T290808]]) (duration: 00m 55s)
* 20:59 mutante: mwmaint1002:/# systemctl start mediawiki_job_growthexperiments-purgeExpiredMentorStatus ([[phab:T280307|T280307]])
* 20:56 legoktm: upgrading PHP 7.2 on A:mw-canary servers
* 20:44 legoktm: upgrading PHP 7.2 on mwdebug* servers
* 20:34 mutante: mwmaint* - new timer/service mediawiki_job_growthexperiments-purgeExpiredMentorStatus created by puppet - [[phab:T280307|T280307]]
* 20:33 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 20:32 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 20:30 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 20:24 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 20:22 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 20:18 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 20:14 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 20:12 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 20:10 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 20:08 mutante: planet1002 - systemctl start update-en-planet  after merging config change btw. legoktm: it should be included in a sec
* 19:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:29 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cba805cb8aaa88d814bfff19b82e8f57ace4fafd}}: Prepare a QuickSurvey for Growth IP research ([[phab:T294568|T294568]]) (duration: 00m 55s)
* 19:26 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 19:23 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 19:19 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 18:49 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 18:37 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 18:26 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|fb433d67f738a2b7dd436e9298f716b14d66c155}}: Amend wordmark for the Meetei (Manipuri) Wikipedia ([[phab:T294189|T294189]]; 2/2) (duration: 00m 55s)
* 18:09 urbanecm: Purge https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-mni.svg ([[phab:T294189|T294189]])
* 18:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:08 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-mni.svg: {{Gerrit|fb433d67f738a2b7dd436e9298f716b14d66c155}}: Amend wordmark for the Meetei (Manipuri) Wikipedia ([[phab:T294189|T294189]]; 1/2) (duration: 00m 55s)
* 18:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:52 topranks: force-resetting FPC 0 on cr2-codfw as it appears hard down.
* 17:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:46 mutante: removing mediawiki font packages from the 8 canary API servers, in addition to 11 canary appservers [[phab:T294378|T294378]]
* 17:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:06 mutante: removing font packages from canary appservers ([[phab:T294378|T294378]], gerrit:735685)
* 16:53 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 16:53 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:52 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 15:52 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:50 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:49 moritzm: installing opencv security updates on stretch
* 15:28 moritzm: rolling restart of mw canaries to pick up tiff security updates
* 15:12 moritzm: installing tiff security updates
* 14:54 moritzm: uploaded PHP 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf3 to apt.wikimedia.org (buster-wikimedia/component/php72) [[phab:T294317|T294317]]
* 14:37 moritzm: updating PHP on mwdebug1001
* 13:31 moritzm: installing jbig2dec security updates
* 12:25 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1101.eqiad.wmnet
* 12:18 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1101.eqiad.wmnet
* 12:08 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1100.eqiad.wmnet
* 12:08 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/GrowthExperiments/includes/Mentorship/QuitMentorship.php: {{Gerrit|4671528977db15b4e287d50980a684223ab6f611}}: QuitMentorship: Pass a logger ([[phab:T294665|T294665]]; 2/2) (duration: 00m 55s)
* 12:07 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/GrowthExperiments/includes/Mentorship/QuitMentorshipFactory.php: {{Gerrit|4671528977db15b4e287d50980a684223ab6f611}}: QuitMentorship: Pass a logger ([[phab:T294665|T294665]]; 1/2) (duration: 00m 56s)
* 11:59 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1100.eqiad.wmnet
* 11:58 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1099.eqiad.wmnet
* 11:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1099.eqiad.wmnet
* 11:48 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1098.eqiad.wmnet
* 11:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:41 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1098.eqiad.wmnet
* 11:31 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1097.eqiad.wmnet
* 11:22 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1097.eqiad.wmnet
* 11:20 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1096.eqiad.wmnet
* 11:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:01 urbanecm: 11:01:21 Synchronized wmf-config/CommonSettings.php: {{Gerrit|b9aa3d21bfb16aaa9605e7abe311eb122009d6ed}}: Add edit-legal to editprotected grant (duration: 00m 54s)
* 11:00 urbanecm: 10:59:03 Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c236232bc48f4a61e98ffd2a93a23375bbb46287}}: foundationwiki: Disable direct account creation ([[phab:T205347|T205347]]) (duration: 00m 56s)
* 10:46 moritzm: installing libdatetime-timezone-perl updates (updates for latest tz changes)
* 10:17 urbanecm: Deploy a security patch for [[phab:T294686|T294686]]
* 09:03 dcausse: restarting blazegraph on wdqs2003 (jvm stuck for the last 22hours)
* 02:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:24 reedy@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 01m 49s)
* 02:22 reedy@deploy1002: Synchronized langlist: Add ami to langlist [[phab:T294717|T294717]] [[phab:T292414|T292414]] (duration: 00m 55s)


==Archives==
== 2023-05-19 ==
* 21:22 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:22 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for ssw link addresses in eqiad - cmooney@cumin1001"
* 21:21 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for ssw link addresses in eqiad - cmooney@cumin1001"
* 21:19 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 20:52 dzahn@cumin1001: conftool action : set/pooled=no; selector: cluster=jobrunner,name=mw1495.eqiad.wmnet
* 19:46 mutante: mw1469 - sudo pkill ffmpeg (per runbook)
* 19:45 dzahn@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,name=mw1469.eqiad.wmnet
* 19:45 mutante: depooled mw1469 from videoscaler, dedicating to just jobrunner
* 19:45 dzahn@cumin1001: conftool action : set/pooled=no; selector: cluster=videoscaler,name=mw1469.eqiad.wmnet
* 19:36 htriedman@deploy1002: Finished deploy [airflow-dags/platform_eng@b34c529]: (no justification provided) (duration: 00m 09s)
* 19:36 htriedman@deploy1002: Started deploy [airflow-dags/platform_eng@b34c529]: (no justification provided)
* 16:55 mutante: mw2448 - scap pull - [[phab:T2334429|T2334429]]
* 15:31 taavi@deploy1002: Finished scap: Backport for [[gerrit:921150{{!}}i18n: Add link to help page (T322717)]], [[gerrit:921326{{!}}Enable RealMe (T324535)]] (duration: 22m 02s)
* 15:21 taavi@deploy1002: legoktm and taavi: Backport for [[gerrit:921150{{!}}i18n: Add link to help page (T322717)]], [[gerrit:921326{{!}}Enable RealMe (T324535)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 15:09 taavi@deploy1002: Started scap: Backport for [[gerrit:921150{{!}}i18n: Add link to help page (T322717)]], [[gerrit:921326{{!}}Enable RealMe (T324535)]]
* 15:06 legoktm@deploy1002: Finished scap: Backport for [[gerrit:921252{{!}}Disable GWToolset from Commons (T270911)]] (duration: 09m 46s)
* 15:06 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:59 elukey@cumin1001: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:58 legoktm@deploy1002: legoktm: Backport for [[gerrit:921252{{!}}Disable GWToolset from Commons (T270911)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 14:57 legoktm@deploy1002: Started scap: Backport for [[gerrit:921252{{!}}Disable GWToolset from Commons (T270911)]]
* 14:40 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:36 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on stat1009.eqiad.wmnet with reason: Bringing stat1009 into service
* 14:36 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on stat1009.eqiad.wmnet with reason: Bringing stat1009 into service
* 14:35 sukhe: enable puppet on A:lvs, finished rolling out change
* 14:20 sukhe: disable puppet on A:lvs to roll out CR 910566
* 14:17 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wdqs1014.eqiad.wmnet with reason: firmware update
* 14:16 bking@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on wdqs1014.eqiad.wmnet with reason: firmware update
* 13:35 mforns@deploy1002: Finished deploy [airflow-dags/analytics_test@be05071]: (no justification provided) (duration: 00m 10s)
* 13:34 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs1020.eqiad.wmnet with reason: Move lvs1020 handoff port to row e/f from lsw1-f1 to ssw1-f1
* 13:34 mforns@deploy1002: Started deploy [airflow-dags/analytics_test@be05071]: (no justification provided)
* 13:34 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs1020.eqiad.wmnet with reason: Move lvs1020 handoff port to row e/f from lsw1-f1 to ssw1-f1
* 13:26 topranks: Adding vlan config for row e/f vlans on ssw1-f1-eqiad ([[phab:T322937|T322937]])
* 13:17 hashar@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.9  refs [[phab:T330215|T330215]]
* 12:19 elukey@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 11:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host testvm2004.codfw.wmnet with OS bullseye
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts bast2002
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast2002 decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2004.codfw.wmnet with reason: host reimage
* 10:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast2002 decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2004.codfw.wmnet with reason: host reimage
* 10:45 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 10:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:38 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast2002
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.reimage for host testvm2004.codfw.wmnet with OS bullseye
* 10:07 moritzm: installing ncurses security updates
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
* 09:49 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:49 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:48 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
* 09:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
* 09:31 jmm@cumin2002: START - Cookbook sre.ganeti.reimage for host testvm2002.codfw.wmnet with OS bullseye
* 09:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[2040-2043].codfw.wmnet
* 09:21 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:21 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[2040-2043].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
* 09:21 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:18 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[2040-2043].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
* 09:15 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:08 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:02 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 08:59 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-be[2040-2043].codfw.wmnet
* 08:58 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 08:52 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 08:45 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 08:41 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 08:38 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 08:38 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:34 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 08:31 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 08:27 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 08:18 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-cache2003.codfw.wmnet
* 08:14 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host netflow2003.codfw.wmnet with OS bookworm
* 08:11 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-cache2003.codfw.wmnet
* 08:10 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-cache2002.codfw.wmnet
* 08:09 moritzm: copy samplicator from bullseye-wikimedia to bookworm-wikimedia [[phab:T330884|T330884]]
* 08:03 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-cache2002.codfw.wmnet
* 07:58 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-cache2001.codfw.wmnet
* 07:52 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-cache2001.codfw.wmnet
* 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48397 and previous config saved to /var/cache/conftool/dbconfig/20230519-074256-root.json
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48396 and previous config saved to /var/cache/conftool/dbconfig/20230519-074044-root.json
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48395 and previous config saved to /var/cache/conftool/dbconfig/20230519-073959-root.json
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow2003.codfw.wmnet with reason: host reimage
* 07:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow2003.codfw.wmnet with reason: host reimage
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48394 and previous config saved to /var/cache/conftool/dbconfig/20230519-072751-root.json
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48393 and previous config saved to /var/cache/conftool/dbconfig/20230519-072539-root.json
* 07:24 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 75%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48392 and previous config saved to /var/cache/conftool/dbconfig/20230519-072454-root.json
* 07:21 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: prometheus4001.ulsfo.wmnet
* 07:21 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: prometheus4001.ulsfo.wmnet
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48391 and previous config saved to /var/cache/conftool/dbconfig/20230519-071247-root.json
* 07:11 moritzm: installing emacs security updates
* 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48390 and previous config saved to /var/cache/conftool/dbconfig/20230519-071034-root.json
* 07:09 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 50%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48389 and previous config saved to /var/cache/conftool/dbconfig/20230519-070949-root.json
* 06:59 jmm@cumin2002: START - Cookbook sre.ganeti.reimage for host netflow2003.codfw.wmnet with OS bookworm
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48388 and previous config saved to /var/cache/conftool/dbconfig/20230519-065742-root.json
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48387 and previous config saved to /var/cache/conftool/dbconfig/20230519-065530-root.json
* 06:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 25%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48386 and previous config saved to /var/cache/conftool/dbconfig/20230519-065445-root.json
* 06:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6002.wikimedia.org
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48385 and previous config saved to /var/cache/conftool/dbconfig/20230519-064237-root.json
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6002.wikimedia.org
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48384 and previous config saved to /var/cache/conftool/dbconfig/20230519-064025-root.json
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 10%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48383 and previous config saved to /var/cache/conftool/dbconfig/20230519-063940-root.json
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48382 and previous config saved to /var/cache/conftool/dbconfig/20230519-062733-root.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48381 and previous config saved to /var/cache/conftool/dbconfig/20230519-062520-root.json
* 06:24 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 5%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48380 and previous config saved to /var/cache/conftool/dbconfig/20230519-062435-root.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48379 and previous config saved to /var/cache/conftool/dbconfig/20230519-061228-root.json
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48378 and previous config saved to /var/cache/conftool/dbconfig/20230519-061016-root.json
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 2%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48377 and previous config saved to /var/cache/conftool/dbconfig/20230519-060931-root.json
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'es2027 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48376 and previous config saved to /var/cache/conftool/dbconfig/20230519-055723-root.json
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48375 and previous config saved to /var/cache/conftool/dbconfig/20230519-055511-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 1%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P48374 and previous config saved to /var/cache/conftool/dbconfig/20230519-055426-root.json
* 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2027', diff saved to https://phabricator.wikimedia.org/P48373 and previous config saved to /var/cache/conftool/dbconfig/20230519-054952-root.json
* 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2034 to es3 master', diff saved to https://phabricator.wikimedia.org/P48372 and previous config saved to /var/cache/conftool/dbconfig/20230519-054923-marostegui.json
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2031', diff saved to https://phabricator.wikimedia.org/P48371 and previous config saved to /var/cache/conftool/dbconfig/20230519-054758-root.json
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2033 to es2 master', diff saved to https://phabricator.wikimedia.org/P48370 and previous config saved to /var/cache/conftool/dbconfig/20230519-054737-marostegui.json
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2030', diff saved to https://phabricator.wikimedia.org/P48369 and previous config saved to /var/cache/conftool/dbconfig/20230519-054503-root.json
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2032 to es1 master', diff saved to https://phabricator.wikimedia.org/P48368 and previous config saved to /var/cache/conftool/dbconfig/20230519-054403-marostegui.json
* 05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1121 from dbctl [[phab:T336725|T336725]]', diff saved to https://phabricator.wikimedia.org/P48367 and previous config saved to /var/cache/conftool/dbconfig/20230519-053719-marostegui.json
 
== 2023-05-18 ==
* 23:26 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.9  refs [[phab:T330215|T330215]]
* 22:59 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad plugin upgrade - bking@cumin1001 - [[phab:T332355|T332355]]
* 22:21 mutante: contint2001 - moving files owned by zuul to new UID/GID - in progress
* 22:20 mutante: short down-time for zuul-merger on contint2001
* 21:47 mutante: maintenance for zuul (CI) on contint servers
* 21:31 brennen@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.9  refs [[phab:T330215|T330215]]
* 21:13 brennen@deploy1002: Finished scap: Backport for [[gerrit:920744{{!}}cache: Do not throw on empty set in LinkBatch::constructSet (T336964)]] (duration: 09m 38s)
* 21:05 brennen@deploy1002: brennen: Backport for [[gerrit:920744{{!}}cache: Do not throw on empty set in LinkBatch::constructSet (T336964)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 21:03 brennen@deploy1002: Started scap: Backport for [[gerrit:920744{{!}}cache: Do not throw on empty set in LinkBatch::constructSet (T336964)]]
* 21:01 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:920743{{!}}Silently ignore istype-depicts image suggestion type (T336962)]] (duration: 08m 09s)
* 20:54 urbanecm@deploy1002: urbanecm: Backport for [[gerrit:920743{{!}}Silently ignore istype-depicts image suggestion type (T336962)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:53 urbanecm@deploy1002: Started scap: Backport for [[gerrit:920743{{!}}Silently ignore istype-depicts image suggestion type (T336962)]]
* 20:36 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad plugin upgrade - bking@cumin1001 - [[phab:T332355|T332355]]
* 20:33 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade - bking@cumin1001 - [[phab:T332355|T332355]]
* 20:16 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:921059{{!}}Reverts hewiki A/B test (T335309)]] (duration: 10m 25s)
* 20:07 urbanecm@deploy1002: ksarabia and urbanecm: Backport for [[gerrit:921059{{!}}Reverts hewiki A/B test (T335309)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:06 urbanecm@deploy1002: Started scap: Backport for [[gerrit:921059{{!}}Reverts hewiki A/B test (T335309)]]
* 18:57 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@502ddae]: [[phab:T333001|T333001]] (duration: 00m 35s)
* 18:56 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@502ddae]: [[phab:T333001|T333001]]
* 18:55 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (2 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin1001 - [[phab:T332355|T332355]]
* 18:50 brennen@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.8  refs [[phab:T330215|T330215]]
* 18:33 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts gitlab-runner1003.eqiad.wmnet
* 18:31 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1 irb int dns - cmooney@cumin1001"
* 18:30 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1 irb int dns - cmooney@cumin1001"
* 18:27 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 18:20 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:20 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1 irb int dns - cmooney@cumin1001"
* 18:19 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add ssw1 irb int dns - cmooney@cumin1001"
* 18:18 brennen@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.9  refs [[phab:T330215|T330215]]
* 18:11 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade - bking@cumin1001 - [[phab:T332355|T332355]]
* 18:09 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (2 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin1001 - [[phab:T332355|T332355]]
* 18:07 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - bking@cumin1001 - [[phab:T274204|T274204]]
* 18:04 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 17:59 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - bking@cumin1001 - [[phab:T274204|T274204]]
* 17:38 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:37 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:36 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:35 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:29 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:29 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:27 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:26 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:26 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:26 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:26 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:26 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:55 XioNoX: push new pfw policies - [[phab:T336896|T336896]]
* 16:21 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 16:21 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 16:10 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1002.eqiad.wmnet with OS bullseye
* 15:58 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 15:58 otto@deploy1002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 15:57 inflatador: bking@cumin1001 starting rolling restart of wcqs for java updates [[phab:T334470|T334470]]
* 15:53 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
* 15:50 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
* 15:47 mforns@deploy1002: Finished deploy [airflow-dags/analytics_test@6e3358d]: (no justification provided) (duration: 00m 10s)
* 15:47 mforns@deploy1002: Started deploy [airflow-dags/analytics_test@6e3358d]: (no justification provided)
* 15:37 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 15:37 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bullseye
* 15:31 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 15:29 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 15:25 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 15:23 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 15:20 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-etcd2003.codfw.wmnet
* 15:19 elukey@cumin1001: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-staging-worker
* 15:18 otto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 15:18 otto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 15:17 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:16 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-etcd2003.codfw.wmnet
* 15:15 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 15:13 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-etcd2002.codfw.wmnet
* 15:09 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-etcd2002.codfw.wmnet
* 15:08 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-etcd2001.codfw.wmnet
* 15:04 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-etcd2001.codfw.wmnet
* 15:03 stevemunene@deploy1002: Finished deploy [airflow-dags/analytics_product@6e3358d]: (no justification provided) (duration: 00m 06s)
* 15:02 stevemunene@deploy1002: Started deploy [airflow-dags/analytics_product@6e3358d]: (no justification provided)
* 14:59 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:59 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:56 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:38 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts gitlab-runner1003.eqiad.wmnet
* 14:34 elukey@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 14:31 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 14:31 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 14:30 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:30 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:01 elukey@cumin1001: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-serve-worker-codfw
* 13:59 elukey@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 13:52 elukey@cumin1001: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-staging-worker
* 13:50 elukey@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 13:49 elukey@cumin1001: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-staging-worker
* 13:47 elukey@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 13:18 TheresNoTime: closing backport window
* 13:14 samtar@deploy1002: Finished scap: Backport for [[gerrit:919023{{!}}InitialiseSettings: Set wgWatchersMaxAge=30days (T336250)]] (duration: 08m 45s)
* 13:07 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 13:07 samtar@deploy1002: samtar and s-mukuti: Backport for [[gerrit:919023{{!}}InitialiseSettings: Set wgWatchersMaxAge=30days (T336250)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 13:06 samtar@deploy1002: Started scap: Backport for [[gerrit:919023{{!}}InitialiseSettings: Set wgWatchersMaxAge=30days (T336250)]]
* 13:02 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 12:59 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: Revert Enable First Input Delay events. This is causing validation errors as well as breakages in the hadoop ingestion pipeline - [[phab:T332012|T332012]] (duration: 06m 19s)
* 12:57 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 12:56 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 12:54 elukey@cumin1001: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-staging-worker
* 12:51 elukey@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 12:51 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 12:51 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 12:46 otto@deploy1002: Synchronized wmf-config/ext-EventLogging.php: Revert Enable First Input Delay events. This is causing validation errors as well as breakages in the hadoop ingestion pipeline - [[phab:T332012|T332012]] (duration: 07m 00s)
* 12:46 elukey: clean up old jupyterhub.service references (crash looping) on stat* nodes that had it
* 12:44 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 12:44 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 12:41 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 12:35 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 12:35 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 12:35 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:34 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:28 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 12:24 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 12:24 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 12:20 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 12:19 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 12:17 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
* 12:16 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 12:15 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 12:12 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
* 12:11 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 12:06 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 12:02 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 11:56 topranks: reconfiguring DHCP relay function on eqiad core routers ([[phab:T320508|T320508]])
* 11:55 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd1001.eqiad.wmnet
* 11:51 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-etcd1001.eqiad.wmnet
* 11:36 kart_: MinT: Update to 2023-05-18-060931-production and Set CT2_INTRA_THREADS to 0 ([[phab:T336483|T336483]])
* 11:34 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 11:28 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 11:23 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 11:20 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 11:11 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 11:09 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 11:07 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-cache1003.eqiad.wmnet
* 11:00 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-cache1003.eqiad.wmnet
* 10:57 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-cache1002.eqiad.wmnet
* 10:50 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-cache1002.eqiad.wmnet
* 10:32 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-cache1001.eqiad.wmnet
* 10:30 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-worker1110.eqiad.wmnet with reason: Troubleshooting failed disk
* 10:29 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on an-worker1110.eqiad.wmnet with reason: Troubleshooting failed disk
* 10:25 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-cache1001.eqiad.wmnet
* 10:24 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-cache1001.eqiad.wmnet
* 10:24 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-cache1001.eqiad.wmnet
* 10:06 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 10:05 elukey@deploy1002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 08:30 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:29 akosiaris: upgrade docker-registry to 2.8.2 on all registry hosts
* 08:28 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:27 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:26 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=registry2003.codfw.wmnet
* 08:24 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:24 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:19 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:19 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:00 akosiaris: upgrade registry on registry2003 to 2.8.2
* 07:59 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=registry2003.codfw.wmnet
* 07:25 apergos: UTC morning backport and config training window done
* 07:15 kartik@deploy1002: Finished scap: Backport for [[gerrit:920577{{!}}Enable the new Special:Contribute page entry point for desktop on selected wikis (T327868)]] (duration: 09m 18s)
* 07:07 kartik@deploy1002: kartik: Backport for [[gerrit:920577{{!}}Enable the new Special:Contribute page entry point for desktop on selected wikis (T327868)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 07:06 kartik@deploy1002: Started scap: Backport for [[gerrit:920577{{!}}Enable the new Special:Contribute page entry point for desktop on selected wikis (T327868)]]
* 06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2134,2160].codfw.wmnet,db[1159,1217].eqiad.wmnet with reason: maintenance
* 06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2134,2160].codfw.wmnet,db[1159,1217].eqiad.wmnet with reason: maintenance
* 06:07 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1122 from dbctl [[phab:T336833|T336833]]', diff saved to https://phabricator.wikimedia.org/P48362 and previous config saved to /var/cache/conftool/dbconfig/20230518-060734-marostegui.json
* 04:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2096,2101,2115,2131].codfw.wmnet with reason: maintenance
* 04:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2096,2101,2115,2131].codfw.wmnet with reason: maintenance
 
== 2023-05-17 ==
* 22:30 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:30 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove new openstack.codfw1dev.wikimediacloud.org name server A records. - cmooney@cumin1001"
* 22:29 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove new openstack.codfw1dev.wikimediacloud.org name server A records. - cmooney@cumin1001"
* 22:26 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 22:15 krinkle@deploy1002: Synchronized wmf-config/: [[phab:T332012|T332012]] (duration: 06m 51s)
* 21:44 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 21:26 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
* 21:26 bking@cumin1001: START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
* 21:01 zabe: mwscript extensions/Translate/scripts/moveTranslatableBundle.php --wiki metawiki "Public policy" "Global Advocacy" "Zabe" --reason "per request [[:phab:T333842{{!}}T333842]]"
* 20:59 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 20:32 urbanecm: UTC late B&C window done
* 20:29 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:920784{{!}}GrowthExperiments: amend wrong wiki prefix for jbowiki (T308134)]], [[gerrit:920732{{!}}NewTopicOptOutActiveUsers: Skip bot users etc. (T317375)]], [[gerrit:920386{{!}}Enable zebra ab test in hewiki (T335972)]] (duration: 11m 36s)
* 20:19 urbanecm@deploy1002: urbanecm and matmarex and ksarabia and sgimeno: Backport for [[gerrit:920784{{!}}GrowthExperiments: amend wrong wiki prefix for jbowiki (T308134)]], [[gerrit:920732{{!}}NewTopicOptOutActiveUsers: Skip bot users etc. (T317375)]], [[gerrit:920386{{!}}Enable zebra ab test in hewiki (T335972)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.
* 20:17 urbanecm@deploy1002: Started scap: Backport for [[gerrit:920784{{!}}GrowthExperiments: amend wrong wiki prefix for jbowiki (T308134)]], [[gerrit:920732{{!}}NewTopicOptOutActiveUsers: Skip bot users etc. (T317375)]], [[gerrit:920386{{!}}Enable zebra ab test in hewiki (T335972)]]
* 20:15 urbanecm@deploy1002: Finished scap: Backport for [[gerrit:920722{{!}}GrowthExperiments: enable add link frontend in 9th round wikis (T308134)]] (duration: 12m 06s)
* 20:13 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs2012.codfw.wmnet
* 20:12 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs2012.codfw.wmnet
* 20:07 bking@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts wdqs2012.codfw.wmnet
* 20:04 urbanecm@deploy1002: sgimeno and urbanecm: Backport for [[gerrit:920722{{!}}GrowthExperiments: enable add link frontend in 9th round wikis (T308134)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:03 urbanecm@deploy1002: Started scap: Backport for [[gerrit:920722{{!}}GrowthExperiments: enable add link frontend in 9th round wikis (T308134)]]
* 19:55 otto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 19:54 otto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 19:54 bking@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts wdqs2012.codfw.wmnet
* 19:50 ejegg: payments-wiki upgraded from {{Gerrit|8988a598}} to {{Gerrit|a7567c6a}}
* 19:41 inflatador: bking@wdqs2012 depooling to attempt firmware update [[phab:T331297|T331297]]
* 19:01 Amir1: Removing db1112 from zarcillo [[phab:T336332|T336332]]
* 18:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1112.eqiad.wmnet
* 18:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1112.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ladsgroup@cumin1001"
* 18:58 ladsgroup@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1112.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ladsgroup@cumin1001"
* 18:48 ladsgroup@cumin1001: START - Cookbook sre.dns.netbox
* 18:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1112.eqiad.wmnet
* 18:34 brennen@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.9  refs [[phab:T330215|T330215]] (duration: 06m 22s)
* 18:27 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.9  refs [[phab:T330215|T330215]]
* 18:11 otto@deploy1002: Finished deploy [analytics/refinery@fb22795]: Deploy for ProduceCanaryEvents fix - [analytics/refinery@fb22795] (duration: 09m 14s)
* 18:03 brennen: train 1.41.0-wmf.9 ([[phab:T330215|T330215]]): no current blockers, rolling to group1 as backup-backup conductor
* 18:02 otto@deploy1002: Started deploy [analytics/refinery@fb22795]: Deploy for ProduceCanaryEvents fix - [analytics/refinery@fb22795]
* 17:44 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 17:44 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 17:44 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 17:44 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 17:43 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: sync
* 17:43 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: sync
* 17:19 brett: Maglev LVS scheduler rollout finished in esams - [[phab:T263797|T263797]]
* 16:58 Guest4300: Running `foreachwiki extensions/TimedMediaHandler/maintenance/requeueTranscodes.php --video --mime=video/mpeg --missing --error --stalled --throttle` on mwmaint1002 for [[phab:T244570|T244570]]
* 16:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox
* 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1032 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48356 and previous config saved to /var/cache/conftool/dbconfig/20230517-162444-ladsgroup.json
* 16:21 ayounsi@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox
* 16:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2032 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48355 and previous config saved to /var/cache/conftool/dbconfig/20230517-161929-ladsgroup.json
* 16:18 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 16:17 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 16:14 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 16:13 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1032', diff saved to https://phabricator.wikimedia.org/P48354 and previous config saved to /var/cache/conftool/dbconfig/20230517-160937-ladsgroup.json
* 16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2032', diff saved to https://phabricator.wikimedia.org/P48353 and previous config saved to /var/cache/conftool/dbconfig/20230517-160423-ladsgroup.json
* 16:00 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 16:00 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
* 15:57 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:56 jelto@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1032', diff saved to https://phabricator.wikimedia.org/P48352 and previous config saved to /var/cache/conftool/dbconfig/20230517-155431-ladsgroup.json
* 15:52 brett: Rolling out maglev LVS scheduler in esams - [[phab:T263797|T263797]]
* 15:52 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:50 jelto@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2032', diff saved to https://phabricator.wikimedia.org/P48351 and previous config saved to /var/cache/conftool/dbconfig/20230517-154916-ladsgroup.json
* 15:46 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1032 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48350 and previous config saved to /var/cache/conftool/dbconfig/20230517-153925-ladsgroup.json
* 15:38 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2032 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48349 and previous config saved to /var/cache/conftool/dbconfig/20230517-153410-ladsgroup.json
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1032 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48348 and previous config saved to /var/cache/conftool/dbconfig/20230517-153042-ladsgroup.json
* 15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1032.eqiad.wmnet with reason: Maintenance
* 15:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1032.eqiad.wmnet with reason: Maintenance
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es2032 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48347 and previous config saved to /var/cache/conftool/dbconfig/20230517-153010-ladsgroup.json
* 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1027 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48346 and previous config saved to /var/cache/conftool/dbconfig/20230517-153004-ladsgroup.json
* 15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2032.codfw.wmnet with reason: Maintenance
* 15:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2032.codfw.wmnet with reason: Maintenance
* 15:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2028 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48345 and previous config saved to /var/cache/conftool/dbconfig/20230517-152945-ladsgroup.json
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2002.wikimedia.org
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2002.wikimedia.org
* 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1002.wikimedia.org
* 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1027', diff saved to https://phabricator.wikimedia.org/P48344 and previous config saved to /var/cache/conftool/dbconfig/20230517-151458-ladsgroup.json
* 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1002.wikimedia.org
* 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2028', diff saved to https://phabricator.wikimedia.org/P48343 and previous config saved to /var/cache/conftool/dbconfig/20230517-151438-ladsgroup.json
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet
* 15:07 aikochou@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 15:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet
* 14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1027', diff saved to https://phabricator.wikimedia.org/P48342 and previous config saved to /var/cache/conftool/dbconfig/20230517-145952-ladsgroup.json
* 14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2028', diff saved to https://phabricator.wikimedia.org/P48341 and previous config saved to /var/cache/conftool/dbconfig/20230517-145932-ladsgroup.json
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs101[6-9]*<nowiki>}</nowiki> and A:aqs
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1027 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48340 and previous config saved to /var/cache/conftool/dbconfig/20230517-144446-ladsgroup.json
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2028 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48339 and previous config saved to /var/cache/conftool/dbconfig/20230517-144425-ladsgroup.json
* 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es2028 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48338 and previous config saved to /var/cache/conftool/dbconfig/20230517-144025-ladsgroup.json
* 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: Maintenance
* 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: Maintenance
* 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1027 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48337 and previous config saved to /var/cache/conftool/dbconfig/20230517-143949-ladsgroup.json
* 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1027.eqiad.wmnet with reason: Maintenance
* 14:39 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - EventBus: produce to mediawiki.page_change.v1 stream - [[phab:T336817|T336817]] (duration: 06m 20s)
* 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1027.eqiad.wmnet with reason: Maintenance
* 14:38 btullis@cumin1001: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker
* 14:36 moritzm: installing jackson-databind security updates
* 14:34 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@ad1cc7c]: deploying hotfix for [[phab:T336800|T336800]] (duration: 00m 09s)
* 14:34 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@ad1cc7c]: deploying hotfix for [[phab:T336800|T336800]]
* 14:33 ottomata: EventBus: produce to mediawiki.page_change.v1 stream - [[phab:T336817|T336817]]
* 14:30 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 14:30 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 14:28 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 14:28 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 14:27 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 14:27 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 14:27 ottomata: rolling restart of eventgate-main to pick up new mediawiki.page_change.v1 stream config - [[phab:T336817|T336817]]
* 14:17 elukey: run authdns-update for new ml-serve/ores discovery endpoints - [[phab:T336726|T336726]]
* 14:15 jmm@cumin2002: START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on P<nowiki>{</nowiki>aqs101[6-9]*<nowiki>}</nowiki> and A:aqs
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs101[2-5]*<nowiki>}</nowiki> and A:aqs
* 14:14 otto@deploy1002: Synchronized wmf-config/ext-EventStreamConfig.php: wgEventStreams - Declare mediawiki.page_change.v1 stream - [[phab:T336817|T336817]] (duration: 07m 30s)
* 14:10 bking@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 14:09 bking@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 14:09 bking@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 14:08 bking@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 14:07 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1101.eqiad.wmnet
* 13:59 taavi@deploy1002: Finished scap: Backport for [[gerrit:920582{{!}}Define $maintClass in maintenance script for compatibility (T317375)]] (duration: 07m 24s)
* 13:59 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1101.eqiad.wmnet
* 13:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1100.eqiad.wmnet
* 13:54 taavi@deploy1002: matmarex and taavi: Backport for [[gerrit:920582{{!}}Define $maintClass in maintenance script for compatibility (T317375)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:52 taavi@deploy1002: Started scap: Backport for [[gerrit:920582{{!}}Define $maintClass in maintenance script for compatibility (T317375)]]
* 13:50 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1100.eqiad.wmnet
* 13:50 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1099.eqiad.wmnet
* 13:47 taavi@deploy1002: Finished scap: Backport for [[gerrit:920244{{!}}dblists: Close akwiki (T336675)]] (duration: 08m 11s)
* 13:42 jmm@cumin2002: START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on P<nowiki>{</nowiki>aqs101[2-5]*<nowiki>}</nowiki> and A:aqs
* 13:42 jmm@cumin2002: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs102[0-1]*<nowiki>}</nowiki> and A:aqs
* 13:41 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1099.eqiad.wmnet
* 13:41 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1098.eqiad.wmnet
* 13:40 taavi@deploy1002: taavi and maurelio: Backport for [[gerrit:920244{{!}}dblists: Close akwiki (T336675)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 13:38 taavi@deploy1002: Started scap: Backport for [[gerrit:920244{{!}}dblists: Close akwiki (T336675)]]
* 13:38 taavi@deploy1002: Finished scap: Backport for [[gerrit:920396{{!}}plwiki: Show language selector in main page header (T336707)]] (duration: 07m 39s)
* 13:33 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1098.eqiad.wmnet
* 13:33 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1097.eqiad.wmnet
* 13:32 taavi@deploy1002: stang and taavi: Backport for [[gerrit:920396{{!}}plwiki: Show language selector in main page header (T336707)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 13:30 taavi@deploy1002: Started scap: Backport for [[gerrit:920396{{!}}plwiki: Show language selector in main page header (T336707)]]
* 13:29 taavi@deploy1002: Finished scap: Backport for [[gerrit:920296{{!}}Enable wmgWikibaseTmpWbsubscribersSensibleOutput on wikidata (T336760)]], [[gerrit:920306{{!}}Enable wmgWikibaseTmpEnableLabelsInApiSummaries on Wikidata (T335099)]] (duration: 09m 15s)
* 13:25 jmm@cumin2002: START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on P<nowiki>{</nowiki>aqs102[0-1]*<nowiki>}</nowiki> and A:aqs
* 13:25 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1097.eqiad.wmnet
* 13:25 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1096.eqiad.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs1011*<nowiki>}</nowiki> and A:aqs
* 13:24 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:23 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:22 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:22 taavi@deploy1002: gtzatchkova and taavi: Backport for [[gerrit:920296{{!}}Enable wmgWikibaseTmpWbsubscribersSensibleOutput on wikidata (T336760)]], [[gerrit:920306{{!}}Enable wmgWikibaseTmpEnableLabelsInApiSummaries on Wikidata (T335099)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:22 btullis@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker
* 13:20 taavi@deploy1002: Started scap: Backport for [[gerrit:920296{{!}}Enable wmgWikibaseTmpWbsubscribersSensibleOutput on wikidata (T336760)]], [[gerrit:920306{{!}}Enable wmgWikibaseTmpEnableLabelsInApiSummaries on Wikidata (T335099)]]
* 13:20 klausman@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:19 klausman@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:18 daniel@deploy1002: Finished scap: Backport for [[gerrit:920230{{!}}Revert "Revert "Add getMultiHttpClient function to make HTTP requests to Mathoid."" (T335347)]], [[gerrit:920231{{!}}Use MultiHttpClient instead of VirtualRESTService. (T335347)]] (duration: 11m 52s)
* 13:17 jmm@cumin2002: START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on P<nowiki>{</nowiki>aqs1011*<nowiki>}</nowiki> and A:aqs
* 13:16 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1096.eqiad.wmnet
* 13:08 jmm@cumin2002: END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling reboot on A:aqs-canary
* 13:07 daniel@deploy1002: daniel: Backport for [[gerrit:920230{{!}}Revert "Revert "Add getMultiHttpClient function to make HTTP requests to Mathoid."" (T335347)]], [[gerrit:920231{{!}}Use MultiHttpClient instead of VirtualRESTService. (T335347)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:06 daniel@deploy1002: Started scap: Backport for [[gerrit:920230{{!}}Revert "Revert "Add getMultiHttpClient function to make HTTP requests to Mathoid."" (T335347)]], [[gerrit:920231{{!}}Use MultiHttpClient instead of VirtualRESTService. (T335347)]]
* 13:03 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-coord1004.eqiad.wmnet
* 13:00 jmm@cumin2002: START - Cookbook sre.aqs.roll-restart-reboot rolling reboot on A:aqs-canary
* 12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1034 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48335 and previous config saved to /var/cache/conftool/dbconfig/20230517-125952-ladsgroup.json
* 12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1033 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48334 and previous config saved to /var/cache/conftool/dbconfig/20230517-125824-ladsgroup.json
* 12:56 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-coord1004.eqiad.wmnet
* 12:56 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-coord1003.eqiad.wmnet
* 12:54 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:54 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS records following puppetdb bulk import - cmooney@cumin1001"
* 12:52 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS records following puppetdb bulk import - cmooney@cumin1001"
* 12:50 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 12:49 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-coord1003.eqiad.wmnet
* 12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1034', diff saved to https://phabricator.wikimedia.org/P48333 and previous config saved to /var/cache/conftool/dbconfig/20230517-124446-ladsgroup.json
* 12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1033', diff saved to https://phabricator.wikimedia.org/P48332 and previous config saved to /var/cache/conftool/dbconfig/20230517-124318-ladsgroup.json
* 12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1034', diff saved to https://phabricator.wikimedia.org/P48331 and previous config saved to /var/cache/conftool/dbconfig/20230517-122940-ladsgroup.json
* 12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1033', diff saved to https://phabricator.wikimedia.org/P48330 and previous config saved to /var/cache/conftool/dbconfig/20230517-122812-ladsgroup.json
* 12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1034 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48329 and previous config saved to /var/cache/conftool/dbconfig/20230517-121434-ladsgroup.json
* 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1033 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48328 and previous config saved to /var/cache/conftool/dbconfig/20230517-121306-ladsgroup.json
* 12:12 cmooney@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox
* 12:11 cmooney@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox
* 12:06 topranks: Merging CR822439 and beginning bulk puppetdb -> netbox import to update host interfaces
* 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1034 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48327 and previous config saved to /var/cache/conftool/dbconfig/20230517-115943-ladsgroup.json
* 11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1034.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1034.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1028 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48326 and previous config saved to /var/cache/conftool/dbconfig/20230517-115908-ladsgroup.json
* 11:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1033 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48325 and previous config saved to /var/cache/conftool/dbconfig/20230517-115612-ladsgroup.json
* 11:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 11:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 11:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1026 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48324 and previous config saved to /var/cache/conftool/dbconfig/20230517-115538-ladsgroup.json
* 11:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2034 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48323 and previous config saved to /var/cache/conftool/dbconfig/20230517-115303-ladsgroup.json
* 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1028', diff saved to https://phabricator.wikimedia.org/P48322 and previous config saved to /var/cache/conftool/dbconfig/20230517-114402-ladsgroup.json
* 11:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1026', diff saved to https://phabricator.wikimedia.org/P48321 and previous config saved to /var/cache/conftool/dbconfig/20230517-114032-ladsgroup.json
* 11:38 kart_: Update MinT to 2023-05-17-052844-production: Set CT2_USE_EXPERIMENTAL_PACKED_GEMM for better performance
* 11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2034', diff saved to https://phabricator.wikimedia.org/P48320 and previous config saved to /var/cache/conftool/dbconfig/20230517-113757-ladsgroup.json
* 11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2033 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48319 and previous config saved to /var/cache/conftool/dbconfig/20230517-113531-ladsgroup.json
* 11:33 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1028', diff saved to https://phabricator.wikimedia.org/P48318 and previous config saved to /var/cache/conftool/dbconfig/20230517-112856-ladsgroup.json
* 11:28 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 11:26 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 11:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1026', diff saved to https://phabricator.wikimedia.org/P48317 and previous config saved to /var/cache/conftool/dbconfig/20230517-112526-ladsgroup.json
* 11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2034', diff saved to https://phabricator.wikimedia.org/P48316 and previous config saved to /var/cache/conftool/dbconfig/20230517-112251-ladsgroup.json
* 11:22 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 11:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2033', diff saved to https://phabricator.wikimedia.org/P48315 and previous config saved to /var/cache/conftool/dbconfig/20230517-112024-ladsgroup.json
* 11:15 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 11:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1028 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48314 and previous config saved to /var/cache/conftool/dbconfig/20230517-111350-ladsgroup.json
* 11:13 kartik@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 11:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1026 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48313 and previous config saved to /var/cache/conftool/dbconfig/20230517-111020-ladsgroup.json
* 11:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2034 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48312 and previous config saved to /var/cache/conftool/dbconfig/20230517-110745-ladsgroup.json
* 11:07 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 11:06 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 11:05 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es2033', diff saved to https://phabricator.wikimedia.org/P48311 and previous config saved to /var/cache/conftool/dbconfig/20230517-110518-ladsgroup.json
* 11:05 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 11:04 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 11:04 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es2034 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48310 and previous config saved to /var/cache/conftool/dbconfig/20230517-110251-ladsgroup.json
* 11:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2034.codfw.wmnet with reason: Maintenance
* 11:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2034.codfw.wmnet with reason: Maintenance
* 11:02 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 11:01 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 11:01 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 11:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1026 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48309 and previous config saved to /var/cache/conftool/dbconfig/20230517-110130-ladsgroup.json
* 11:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1026.eqiad.wmnet with reason: Maintenance
* 11:01 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 11:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1026.eqiad.wmnet with reason: Maintenance
* 11:00 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 11:00 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1028 ([[phab:T335845|T335845]])', diff saved to https://phabricator.wikimedia.org/P48308 and previous config saved to /var/cache/conftool/dbconfig/20230517-105957-ladsgroup.json
* 10:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1028.eqiad.wmnet with reason: Maintenance
* 10:59 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply