You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Difference between revisions of "Server Admin Log"

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(eileen: process-control config revision is dd541a25dc)
imported>Stashbot
(mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .)
 
(370 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== 2020-08-28 ==
== 2021-10-22 ==
* 00:06 eileen: process-control config revision is {{Gerrit|dd541a25dc}}
* 23:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:57 bblack: re-pooling eqiad in DNS
* 20:54 legoktm: <XioNoX> I disabled the interface on cr1, going to re-enabled the active on on cr2
* 20:48 legoktm: bblack has temporarily depooled eqiad https://gerrit.wikimedia.org/r/733043
* 20:41 XioNoX: disable sessions to equinix eqiad IXP
* 19:17 urbanecm: Start server-side upload of 1 video file ([[phab:T294134|T294134]])
* 15:06 jbond: upload puppetboard_3.1.0-1_all.deb to ullseye-wikimedia
* 13:42 ema: deployment-cache-upload06: restart varnish-frontend, package got upgraded to 6.0.8 [[phab:T294116|T294116]]
* 13:30 jbond: upload python3-pypuppetdb_2.4.0-1_all.deb to bullseye
* 10:46 jbond: upload cas_6.4.2-1+wmf10u1
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 10:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 09:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294029|T294029]]
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2025.codfw.wmnet with OS buster
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 08:27 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:23 ema: cp3062: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 08:00 ema: deployment-cache-text06: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17580 and previous config saved to /var/cache/conftool/dbconfig/20211022-055403-root.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17579 and previous config saved to /var/cache/conftool/dbconfig/20211022-053900-root.json
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17578 and previous config saved to /var/cache/conftool/dbconfig/20211022-052356-root.json
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17577 and previous config saved to /var/cache/conftool/dbconfig/20211022-050852-root.json
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17576 and previous config saved to /var/cache/conftool/dbconfig/20211022-045349-root.json
* 04:46 marostegui_: Deploy schema change on s8 codfw - [[phab:T291719|T291719]]
* 04:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17575 and previous config saved to /var/cache/conftool/dbconfig/20211022-043845-root.json
* 02:59 ejegg: updated payments-wiki from {{Gerrit|088a8cda1e}} to {{Gerrit|6e810fb401}}


== 2020-08-27 ==
== 2021-10-21 ==
* 23:53 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:51 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 23:38 jforrester@deploy1002: Synchronized w/fatal-error.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:48 eileen: civicrm revision changed from {{Gerrit|a942537984}} to {{Gerrit|3d501e71d9}}, config revision is {{Gerrit|dd541a25dc}}
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:54 eileen: civicrm revision changed from {{Gerrit|481ab742db}} to {{Gerrit|a942537984}}, config revision is {{Gerrit|e2ab4d7c1f}}
* 23:37 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 22:28 tzatziki: removing one file for legal compliance
* 23:36 jforrester@deploy1002: Synchronized multiversion/: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 22:18 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 23:34 jforrester@deploy1002: Synchronized docroot/noc/conf/index.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 22:18 volans: uploaded spicerack_0.0.40-1_amd64.deb to apt.wikimedia.org buster-wikimedia
* 23:33 jforrester@deploy1002: Synchronized wmf-config: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 22:17 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 23:32 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 21:59 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 23:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:57 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 23:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:46 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 23:25 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:730946{{!}}CommonSettings: Drop legacy CentralAuth config flag, never read (T277932)]] (duration: 00m 55s)
* 21:43 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 23:18 thcipriani@deploy1002: Synchronized tests/multiversion/StaticSettingsTest.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 21:43 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 23:15 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 21:41 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:34 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:33 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 22:42 mutante: [[phab:T294038|T294038]] [krb1001:~] $ sudo manage_principals.py create effeietsanders ... Principal successfully created.  . .Successfully sent email
* 21:32 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 21:44 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 47s)
* 21:29 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 21:41 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 21:28 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 20:54 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 13s)
* 21:25 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:53 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 21:25 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 20:53 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 35s)
* 21:23 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:52 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 21:23 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 21:22 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 21:22 dzahn@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 21:20 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 21:20 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:17 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:17 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 19:42 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Update $wgTimelineFonts for new path to unifont in Shellbox container ([[phab:T293050|T293050]]) (duration: 00m 55s)
* 21:17 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:38 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 21:17 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 19:35 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 21:16 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:31 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 21:15 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 19:23 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 21:14 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:10 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs (duration: 00m 23s)
* 21:13 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 19:09 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs
* 21:13 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:07 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: (no justification provided) (duration: 00m 08s)
* 21:10 dzahn@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)
* 19:07 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: (no justification provided)
* 21:07 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:53 urbanecm: Deploy security patch for [[phab:T285116|T285116]] (wmf.4, wmf.5)
* 20:50 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:53 mutante: dumpsdata1003 - sudo systemctl reset-failed to clear Icinga alert about failed cleanup_tmpdumps.service
* 20:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:55 mutante: that's a key for https://www.worldcat.org/whatis/default.jsp btw for those wondering
* 20:50 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=api_appserver,name=mw221[0-4].codfw.wmnet
* 17:53 mutante: citoid - replaced "wskey" for worldcat in private repo as requested on [[phab:T294010|T294010]] (is in 4 places, 3 for deployment_server/k8s and one remnant for scb)
* 20:49 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:53 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 20:49 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:52 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 20:49 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=api_appserver,name=mw220[0-9].codfw.wmnet
* 17:50 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 20:48 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:48 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:48 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=api_appserver,name=mw214[0-7].codfw.wmnet
* 16:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 20:48 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:12 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 20:48 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:07 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 20:47 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=api_appserver,name=mw213[0-9].codfw.wmnet
* 16:06 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (2/3) (duration: 00m 54s)
* 20:43 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams: Streams for testing MEP-based analytics instruments - [[phab:T259714|T259714]] (duration: 00m 55s)
* 16:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:04 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 19:58 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 19:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:58 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:01 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 19:57 marxarelli: 1.36.0-wmf.6 promoted to all wikis ([[phab:T257974|T257974]]). new errors appear to be related to [[phab:T261345|T261345]] but are known since 1.36.0-wmf.5
* 15:59 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (2/3) (duration: 00m 55s)
* 19:57 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=appserver,name=mw21[8-9][0-9]*.codfw.wmnet
* 15:58 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (1/3) (duration: 00m 57s)
* 19:41 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.6
* 15:43 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 11s)
* 15:21 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:20 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating apiportalwiki ([[phab:T246945|T246945]]) (duration: 01m 03s)
* 15:14 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 19:19 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating apiportalwiki ([[phab:T246945|T246945]]) (duration: 01m 03s)
* 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:16 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating apiportalwiki ([[phab:T246945|T246945]])
* 15:13 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 54s)
* 19:15 urbanecm@deploy1001: Synchronized dblists: Creating apiportalwiki ([[phab:T246945|T246945]]) (duration: 01m 03s)
* 15:12 Lucas_WMDE: my next message accidentally says 1/3 again but it’s 2/3, sorry
* 19:14 urbanecm@deploy1001: Synchronized multiversion/MWMultiVersion.php: Creating apiportalwiki ([[phab:T246945|T246945]]) (duration: 01m 03s)
* 15:11 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 19:13 urbanecm@deploy1001: Synchronized wmf-config/db-codfw.php: Creating apiportalwiki ([[phab:T246945|T246945]]) (duration: 01m 03s)
* 15:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:11 urbanecm@deploy1001: Synchronized wmf-config/db-eqiad.php: Creating apiportalwiki ([[phab:T246945|T246945]]) (duration: 01m 03s)
* 14:56 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 18:54 mforns@deploy1001: Finished deploy [analytics/refinery@e85191b] (thin): Regular analytics weekly train THIN [analytics/refinery@e85191bb80c13781b41031340ea318b5f854b6a9] (duration: 00m 08s)
* 14:42 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/Wikibase.default.php: Backport: [[gerrit:732666{{!}}Enable dispatching via jobs by default (T291828)]] (duration: 00m 55s)
* 18:54 mforns@deploy1001: Started deploy [analytics/refinery@e85191b] (thin): Regular analytics weekly train THIN [analytics/refinery@e85191bb80c13781b41031340ea318b5f854b6a9]
* 14:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:53 mforns@deploy1001: Finished deploy [analytics/refinery@e85191b]: Regular analytics weekly train [analytics/refinery@e85191bb80c13781b41031340ea318b5f854b6a9] (duration: 10m 01s)
* 14:39 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client/: Backport: [[gerrit:732674{{!}}Fix ExternalUserNames service wiring for local database]] (duration: 00m 57s)
* 18:43 mforns@deploy1001: Started deploy [analytics/refinery@e85191b]: Regular analytics weekly train [analytics/refinery@e85191bb80c13781b41031340ea318b5f854b6a9]
* 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:43 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Assign all homepage users to variant A (duration: 01m 03s)
* 14:33 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 18:31 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable GrowthExperiments on ruwiki ([[phab:T257490|T257490]]) (duration: 01m 03s)
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:17 dzahn@cumin1001: conftool action : set/weight=1; selector: dc=codfw,cluster=jobrunner,name=mw2250.codfw.wmnet,service=canary
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:17 dzahn@cumin1001: conftool action : set/weight=1; selector: dc=codfw,cluster=jobrunner,name=mw2249.codfw.wmnet,service=canary
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:16 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:16 cdanis@cumin1001: START - Cookbook sre.network.cf
* 13:56 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:14 dzahn@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=jobrunner,name=mw1318.eqiad.wmnet
* 13:55 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:07 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw229[1-9].codfw.wmnet,cluster=api_appserver
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:06 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw2290.codfw.wmnet,cluster=api_appserver
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:05 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw22[6-8][0-9].codfw.wmnet,cluster=api_appserver
* 13:34 volans: uploaded spicerack_1.0.6 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 18:03 Urbanecm: Creating jawikivoyage is done ([[phab:T260320|T260320]])
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:02 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 01m 59s)
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:02 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw225[0-9].codfw.wmnet,cluster=api_appserver
* 13:04 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 18:00 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating jawikivoyage ([[phab:T260320|T260320]]) (duration: 01m 02s)
* 12:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 17:59 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw224[4-5].codfw.wmnet,service=canary
* 12:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 17:59 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw224[4-5].codfw.wmnet
* 12:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 17:59 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating jawikivoyage ([[phab:T260320|T260320]]) (duration: 01m 03s)
* 12:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 17:58 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating jawikivoyage ([[phab:T260320|T260320]])
* 12:48 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 17:57 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw222[0-3].codfw.wmnet
* 12:48 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 17:56 urbanecm@deploy1001: Synchronized dblists: Creating jawikivoyage ([[phab:T260320|T260320]]) (duration: 00m 58s)
* 12:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 17:56 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw221[5-9].codfw.wmnet,service=canary
* 12:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 17:55 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw221[5-9].codfw.wmnet
* 12:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 17:55 urbanecm@deploy1001: Synchronized wmf-config/db-codfw.php: Creating jawikivoyage ([[phab:T260320|T260320]]) (duration: 01m 03s)
* 12:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 17:54 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw221[0-4].codfw.wmnet
* 11:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 17:54 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw221[0-4].codfw.wmnet
* 11:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 17:54 urbanecm@deploy1001: Synchronized wmf-config/db-eqiad.php: Creating jawikivoyage ([[phab:T260320|T260320]]) (duration: 01m 07s)
* 11:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 17:52 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw220[1-9].codfw.wmnet
* 11:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 17:52 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw220[1-9].codfw.wmnet
* 11:13 Lucas_WMDE: UTC morning backport+config window done
* 17:51 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2200.codfw.wmnet
* 11:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294008|T294008]]
* 17:50 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw2200.codfw.wmnet
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:48 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw214[0-7].codfw.wmnet
* 11:07 jgiannelos@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730848{{!}}Configure event stream for map tiles state change (T289771)]] (duration: 01m 04s)
* 17:47 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw213[5-9].codfw.wmnet
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:46 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw214[0-7].codfw.wmnet
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 17:45 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw213[5-9].codfw.wmnet
* 10:48 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 17:39 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw23[0-7][0-9].codfw.wmnet
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 17:31 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw227[0-7].codfw.wmnet,service=canary
* 10:47 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 17:30 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw227[0-7].codfw.wmnet
* 10:14 jbond: mergeing refactor of P:base Gerrit:714975
* 17:29 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 09:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 cdanis@cumin1001: START - Cookbook sre.network.cf
* 09:49 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 17:18 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:56 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 01m 03s)
* 17:17 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw226[8-9].codfw.wmnet
* 08:33 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 17:13 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw225[4-8].codfw.wmnet
* 08:26 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 17:12 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 08:25 ema: cp3062: revert vsl_space experiment [[phab:T293879|T293879]]
* 17:11 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw224[0-2].codfw.wmnet
* 08:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1004.eqiad.wmnet with OS bullseye
* 17:04 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw223[2-9].codfw.wmnet
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17563 and previous config saved to /var/cache/conftool/dbconfig/20211021-080330-root.json
* 17:01 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw2231.codfw.wmnet
* 07:56 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite1004.eqiad.wmnet with OS bullseye
* 16:59 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw2230.codfw.wmnet
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17562 and previous config saved to /var/cache/conftool/dbconfig/20211021-074826-root.json
* 16:54 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw222[4-9].codfw.wmnet
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17561 and previous config saved to /var/cache/conftool/dbconfig/20211021-073323-root.json
* 16:49 mutante: re-weighted appservers and api appservers in eqiad - hardware type G = weight 25, all other types = weight 30 ([[phab:T261159|T261159]])
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17560 and previous config saved to /var/cache/conftool/dbconfig/20211021-071819-root.json
* 16:48 mutante: depooling mw2187 - mw2199 - old codfw appservers of type A to be decom'ed, previously weight 10 ([[phab:T260654|T260654]])
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17559 and previous config saved to /var/cache/conftool/dbconfig/20211021-070315-root.json
* 16:47 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw219[0-9].codfw.wmnet
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17558 and previous config saved to /var/cache/conftool/dbconfig/20211021-064812-root.json
* 16:41 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw218[7-9].codfw.wmnet
* 06:35 elukey: `systemctl reload nginx` on cloudelastic100[5,6] to pick up the new TLS certificate and clear alerts - [[phab:T293826|T293826]]
* 16:35 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw1297.eqiad.wmnet
* 04:47 marostegui: Deploy schema change on s5 codfw - [[phab:T291719|T291719]]
* 16:23 hnowlan@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 04:37 marostegui: Deploy schema change on s6 codfw - [[phab:T291719|T291719]]
* 16:21 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw127[0-5].eqiad.wmnet
* 04:04 legoktm: restarted apache on lists1001 so it only uses new TLS cert ([[phab:T293826|T293826]])
* 16:19 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw126[1-5].eqiad.wmnet,service=canary
* 03:29 eileen: civicrm revision changed from {{Gerrit|e889831012}} to {{Gerrit|733a8fceda}}, config revision is {{Gerrit|eed79486d5}}
* 16:14 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw126[1-9].eqiad.wmnet
* 00:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:12 elukey: remove some old/stale terms from analytics-in4 on cr1/cr2-eqiad (ref: https://gerrit.wikimedia.org/r/c/operations/homer/public/+/622746, https://gerrit.wikimedia.org/r/c/operations/homer/public/+/622744)
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:09 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw127[6-9].eqiad.wmnet,service=canary
* 16:08 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw127[6-9].eqiad.wmnet
* 16:06 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw1290.eqiad.wmnet
* 16:05 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw128[0-9].eqiad.wmnet
* 15:52 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw1290.eqiad.wmnet
* 15:51 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw128[0-9].eqiad.wmnet
* 15:43 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw127[7-9].eqiad.wmnet,service=canary
* 15:43 dzahn@cumin1001: conftool action : set/weight=1; selector: name=mw1276.eqiad.wmnet,service=canary
* 15:41 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw127[6-9].eqiad.wmnet
* 15:39 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw1297.eqiad.wmnet
* 15:38 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw1269.eqiad.wmnet
* 15:38 dzahn@cumin1001: conftool action : set/weight=30; selector: name=mw1267.eqiad.wmnet
* 14:48 moritzm: installing Java security updates on aqs, hadoop and kafka-jumbo
* 14:44 moritzm: restarting tomcat on idp-test* hosts to pick up Java update
* 14:42 elukey: add eventgate-related terms to analytics-in4 filter on cr1/cr2-eqiad (ref https://gerrit.wikimedia.org/r/c/operations/homer/public/+/622705)
* 14:37 moritzm: imported openjdk 8u265-b01-1~deb10u1 to buster-wikimedia (forward port of latest Java 8 security update)
* 14:31 papaul: replacing msw-c5,c6,c7 and fmsw-c8
* 13:58 kormat: disabling GTID on pc2007 (pc1), pc2008 (pc2), pc2009 (pc3) [[phab:T243373|T243373]]
* 13:56 kormat: disabling GTID on db2096 (x1), es2021 (es4), es2023 (es5) [[phab:T243373|T243373]]
* 13:54 hnowlan@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 13:53 kormat: disabling GTID on db2129 (s6), db2118 (s7), db2079 (s8) [[phab:T243373|T243373]]
* 13:52 kormat: disabling GTID on db2123 (s5) [[phab:T243373|T243373]]
* 13:52 kormat: disabling GTID on db2090 (s4) [[phab:T243373|T243373]]
* 13:51 hnowlan@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 13:51 kormat: disabling GTID on db2105 (s3) [[phab:T243373|T243373]]
* 13:50 kormat: disabling GTID on db2107 (s2) [[phab:T243373|T243373]]
* 13:50 hnowlan@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 13:29 elukey: restart jvm daemons on analytics1042, aqs1004, kafka-jumbo1001 to pick up new openjdk upgrades (canaries)
* 13:18 kormat: enabling replication from db2107 to db1122 (s2) [[phab:T243373|T243373]]
* 13:14 kormat: enabling replication from db2096 to db1103 (x1) [[phab:T243373|T243373]]
* 13:10 jynus: restart db2097
* 13:07 jbond42: deploy python3.4 security update to kraz
* 13:03 jbond42: deploy python3.4 security update to canaries on jessie
* 13:01 kormat: enabling replication from db2118 to db1086 (s7) [[phab:T243373|T243373]]
* 12:52 jynus: restart db1140
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s8 weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12402 and previous config saved to /var/cache/conftool/dbconfig/20200827-124338-marostegui.json
* 12:35 jynus: restart db1139
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s7  weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12401 and previous config saved to /var/cache/conftool/dbconfig/20200827-123028-marostegui.json
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s7  weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12400 and previous config saved to /var/cache/conftool/dbconfig/20200827-123003-marostegui.json
* 12:24 marostegui: Fix password format for in db2129 (s6 codfw master) [[phab:T243373|T243373]]
* 12:14 kormat: enabling replication from db2129 to db1093 (s6) [[phab:T243373|T243373]]
* 12:13 jynus: restart db1095
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s6  weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12399 and previous config saved to /var/cache/conftool/dbconfig/20200827-120816-marostegui.json
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s5 codfw weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12398 and previous config saved to /var/cache/conftool/dbconfig/20200827-120211-marostegui.json
* 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s5 eqiad weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12397 and previous config saved to /var/cache/conftool/dbconfig/20200827-115934-marostegui.json
* 11:56 Urbanecm: Lift range blocks exceeding wgBlockCIDRLimit via custom script from F32197596 (ruwiki, ruwikiquote; [[phab:T243980|T243980]])
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s4 codfw weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12396 and previous config saved to /var/cache/conftool/dbconfig/20200827-115110-marostegui.json
* 11:49 moritzm: uploaded python3.4 3.4.2-1+deb8u7+wmf1 for jessie-wikimedia [[phab:T259102|T259102]]
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust s1 codfw weights [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12395 and previous config saved to /var/cache/conftool/dbconfig/20200827-114509-marostegui.json
* 11:22 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust db2126 weight [[phab:T243373|T243373]]', diff saved to https://phabricator.wikimedia.org/P12394 and previous config saved to /var/cache/conftool/dbconfig/20200827-112213-marostegui.json
* 11:12 Urbanecm: EU B&C done
* 11:09 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|34994d39f92b23934929c66f3e15aa332683e746}}: Add $wgTranslateMessageNamespaces[] = NS_MEDIAWIKI; for commonswiki ([[phab:T131300|T131300]]) (duration: 01m 03s)
* 10:57 mvolz@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:56 godog: bounce grafana to apply new settings
* 10:51 kormat: enabling replication from db2123 to db1100 (s5) [[phab:T243373|T243373]]
* 10:48 mvolz@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:30 kormat: enabling replication from es2023 to es1024 (es5) [[phab:T243373|T243373]]
* 10:28 mvolz@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 10:23 kormat: enabling replication from es2021 to es1021 (es4) [[phab:T243373|T243373]]
* 10:19 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:19 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:03 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:03 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 09:54 moritzm: installing Java security updates on IDP* hosts
* 09:51 hnowlan@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 09:47 hnowlan@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 09:45 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 09:44 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 09:44 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 09:43 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 09:43 elukey: decommissioning vms schema[12]00[12] (replaced previously by schema[12]00[34] buster vms)
* 09:42 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 09:41 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 09:40 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 09:39 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 09:20 kormat: enabling replication from db2105 to db1123 (s3) [[phab:T243373|T243373]]
* 09:15 kormat: enabling replication from db2079 to db1109 (s8) [[phab:T243373|T243373]]
* 09:07 kormat: enabling replication from db2090 to db1081 (s4) [[phab:T243373|T243373]]
* 08:53 kormat: enabling replication from pc2009 to pc1009 (pc3) [[phab:T243373|T243373]]
* 08:44 kormat: enabling replication from pc2008 to pc1008 (pc2) [[phab:T243373|T243373]]
* 08:13 marostegui: Enable replication codfw -> eqiad on pc1 [[phab:T243373|T243373]]
* 08:01 gehel: manual cleanup of stale wdqs deploy crontab on wdqs1009
* 07:35 marostegui: Move pc2010 under pc2007 [[phab:T243373|T243373]]
* 07:16 moritzm: installing ghostscript security updates on stretch
* 06:50 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 06:49 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:46 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 06:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 06:34 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:31 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1079,db1082', diff saved to https://phabricator.wikimedia.org/P12392 and previous config saved to /var/cache/conftool/dbconfig/20200827-060652-marostegui.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079,db1082', diff saved to https://phabricator.wikimedia.org/P12391 and previous config saved to /var/cache/conftool/dbconfig/20200827-055815-marostegui.json
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079,db1082', diff saved to https://phabricator.wikimedia.org/P12390 and previous config saved to /var/cache/conftool/dbconfig/20200827-055522-marostegui.json
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P12389 and previous config saved to /var/cache/conftool/dbconfig/20200827-055126-marostegui.json
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P12388 and previous config saved to /var/cache/conftool/dbconfig/20200827-055104-marostegui.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P12387 and previous config saved to /var/cache/conftool/dbconfig/20200827-054259-marostegui.json
* 05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1074 db1085 db1078', diff saved to https://phabricator.wikimedia.org/P12386 and previous config saved to /var/cache/conftool/dbconfig/20200827-054114-marostegui.json
* 05:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082', diff saved to https://phabricator.wikimedia.org/P12385 and previous config saved to /var/cache/conftool/dbconfig/20200827-053814-marostegui.json
* 05:35 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1134 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12384 and previous config saved to /var/cache/conftool/dbconfig/20200827-053558-marostegui.json
* 05:35 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1074, db1085, db1079', diff saved to https://phabricator.wikimedia.org/P12383 and previous config saved to /var/cache/conftool/dbconfig/20200827-053509-marostegui.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1074, db1085, db1079', diff saved to https://phabricator.wikimedia.org/P12382 and previous config saved to /var/cache/conftool/dbconfig/20200827-053100-marostegui.json
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P12381 and previous config saved to /var/cache/conftool/dbconfig/20200827-052925-marostegui.json
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P12380 and previous config saved to /var/cache/conftool/dbconfig/20200827-052818-marostegui.json
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1074', diff saved to https://phabricator.wikimedia.org/P12379 and previous config saved to /var/cache/conftool/dbconfig/20200827-052413-marostegui.json
* 05:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P12378 and previous config saved to /var/cache/conftool/dbconfig/20200827-051609-marostegui.json
* 05:15 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12377 and previous config saved to /var/cache/conftool/dbconfig/20200827-051546-marostegui.json
* 05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12376 and previous config saved to /var/cache/conftool/dbconfig/20200827-050754-marostegui.json
* 05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085', diff saved to https://phabricator.wikimedia.org/P12375 and previous config saved to /var/cache/conftool/dbconfig/20200827-050727-marostegui.json
* 04:53 marostegui: Stop db1074 and db2107 in sync to fix drifts on s2 change_tag - [[phab:T260042|T260042]]
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074', diff saved to https://phabricator.wikimedia.org/P12374 and previous config saved to /var/cache/conftool/dbconfig/20200827-045329-marostegui.json
* 04:04 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=cloudelastic1006.wikimedia.org
* 04:03 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=cloudelastic1005.wikimedia.org
* 04:01 ryankemper@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cloudelastic1005.wikimedia.org
* 02:03 mutante: shutting down install3001,install4001,install5001 VMs (no OS yet, but please also don't delete, debugging in progress, shutting them down until I continue on [[phab:T254157|T254157]])


== 2020-08-26 ==
== 2021-10-20 ==
* 23:35 eileen: civicrm revision changed from {{Gerrit|d2e80f7522}} to {{Gerrit|481ab742db}}, config revision is {{Gerrit|e2ab4d7c1f}}
* 23:56 thcipriani@deploy1002: Finished scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]] (duration: 11m 41s)
* 23:00 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 23:44 thcipriani@deploy1002: Started scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]]
* 22:36 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:30 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:26 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:20 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 23:29 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: fawiki require login for creation of pages in the draft namespace [[phab:T291018|T291018]] (duration: 01m 02s)
* 19:51 XioNoX: standardize pfw3-eqiad
* 23:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:33 marxarelli: 1.36.0-wmf.6 promoted to group1 ([[phab:T257974|T257974]]). logs show no new errors
* 23:27 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: fawiki require login to edit main namespace [[phab:T291018|T291018]] (duration: 01m 04s)
* 19:24 dduvall@deploy1001: Synchronized php: group1 wikis to 1.36.0-wmf.6 (duration: 01m 03s)
* 22:13 dancy@deploy1002: Synchronized README: testing (4/4) (duration: 02m 52s)
* 19:23 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.6
* 22:00 dancy@deploy1002: Synchronized README: testing (3/4) (duration: 02m 57s)
* 18:21 Urbanecm: Morning B&C done
* 21:54 dancy@deploy1002: Synchronized README: testing (2) (duration: 01m 02s)
* 18:20 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|945b97cff8b8a1e4bb43b613fc93b099f74945f7}}: Added import sources for mlwiktionary ([[phab:T260716|T260716]]) (duration: 01m 05s)
* 21:52 dancy@deploy1002: Synchronized README: (no justification provided) (duration: 01m 03s)
* 18:12 Urbanecm: Purge Thai and Greek taglines, URLs are at P12372  ([[phab:T258552|T258552]])
* 21:50 dancy: Testing a series of one-file scap sync-file runs
* 18:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|40092898d8c70191324e844d2c222469b954e9ef}}: Update Thai and Greek taglines ([[phab:T258552|T258552]]) (duration: 01m 03s)
* 21:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:09 urbanecm@deploy1001: Synchronized static/images/mobile/copyright/: {{Gerrit|40092898d8c70191324e844d2c222469b954e9ef}}: Update Thai and Greek taglines ([[phab:T258552|T258552]]) (duration: 01m 05s)
* 21:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:08 herron: upgraded eqiad elk v7 cluster from 7.8.0 to 7.9.0 [[phab:T234854|T234854]]
* 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:02 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b9cf996a38d82fdd67e600a5a951e88423957e8d}}: Promote Growth features out of darkmode on several wikis  ([[phab:T291826|T291826]], [[phab:T255037|T255037]], [[phab:T287878|T287878]]) (duration: 01m 04s)
* 18:02 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:41 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable client side error logging on hewiki ([[phab:T255585|T255585]]) (duration: 01m 04s)
* 20:38 eileen: civicrm revision changed from {{Gerrit|9b5e0d015b}} to {{Gerrit|e889831012}}, config revision is {{Gerrit|eed79486d5}}
* 17:14 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Documentation-only change; sync for line sanity (duration: 01m 04s)
* 20:25 legoktm: uploaded php7.4 on buster to apt.wm.o ([[phab:T293449|T293449]])
* 17:12 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T254349|T254349]] Set wgVisualEditorEnableBetaFeature true on wikis that need it (duration: 01m 03s)
* 19:24 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluste config from old and new locations (duration: 00m 46s)
* 15:59 oblivian@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 19:24 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluste config from old and new locations
* 15:53 oblivian@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 19:09 mutante: disabling puppet on mw* for a minute to deploy a change
* 15:41 oblivian@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:11 oblivian@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134 for MCR change', diff saved to https://phabricator.wikimedia.org/P12371 and previous config saved to /var/cache/conftool/dbconfig/20200826-145612-marostegui.json
* 18:31 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1091 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12370 and previous config saved to /var/cache/conftool/dbconfig/20200826-145531-marostegui.json
* 18:30 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1091 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12369 and previous config saved to /var/cache/conftool/dbconfig/20200826-144750-marostegui.json
* 18:24 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:45 elukey@puppetmaster1001: conftool action : set/pooled=inactive:weight=0; selector: name=schema1002.eqiad.wmnet
* 17:28 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org -  [[phab:T293810|T293810]]
* 14:45 elukey@puppetmaster1001: conftool action : set/pooled=inactive:weight=0; selector: name=schema1001.eqiad.wmnet
* 17:27 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org
* 14:45 elukey@puppetmaster1001: conftool action : set/pooled=inactive:weight=0; selector: name=schema2001.codfw.wmnet
* 17:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:45 elukey@puppetmaster1001: conftool action : set/pooled=inactive:weight=0; selector: name=schema2002.codfw.wmnet
* 17:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1091 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12368 and previous config saved to /var/cache/conftool/dbconfig/20200826-143623-marostegui.json
* 17:01 razzi@deploy1002: Finished deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f] (duration: 23m 42s)
* 14:34 elukey@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=schema2003.codfw.wmnet
* 17:00 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client: Update deprecated calls to ParserOutput in ShortDescHandler - [[phab:T293860|T293860]] (duration: 01m 03s)
* 14:34 elukey@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=schema2004.codfw.wmnet
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 elukey@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=schema1004.eqiad.wmnet
* 16:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 elukey@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=schema1003.eqiad.wmnet
* 16:53 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/LiquidThreads/pages/LqtDiscussionPager.php: Remove deprecated usage of setProperty - [[phab:T293895|T293895]] (duration: 01m 03s)
* 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1091 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12367 and previous config saved to /var/cache/conftool/dbconfig/20200826-142746-marostegui.json
* 16:49 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GeoCrumbs: Replace use of deprecated ParserOutput:getProperty() - [[phab:T293894|T293894]] (duration: 01m 09s)
* 14:25 jgleeson: updated civicrm from {{Gerrit|0f195c6cca}} to {{Gerrit|d2e80f7522}}
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:21 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:20 marostegui: Upgrade mysql on db1091 after MCR changes
* 16:37 razzi@deploy1002: Started deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f]
* 14:13 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 16:36 razzi: deploy refinery change for https://phabricator.wikimedia.org/T287084
* 13:37 kormat@cumin1001: dbctl commit (dc=all): 'Repooling db1110 @ 100% [[phab:T261276|T261276]]', diff saved to https://phabricator.wikimedia.org/P12366 and previous config saved to /var/cache/conftool/dbconfig/20200826-133753-kormat.json
* 16:13 jbond: upload cas_6.4.2-1_amd64.deb
* 13:18 duesen: daniel@mwmaint1002:/srv/mediawiki/php-1.36.0-wmf.5$ mwscript maintenance/findBadBlobs.php dewiki --mark [[phab:T205936|T205936]] --revisions - < ~/T205936-dewiki-20050512070000.ids  # marking known bad revisions for [[phab:T205936|T205936]]
* 15:42 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:17 kormat@cumin1001: dbctl commit (dc=all): 'Repooling db1110 @ 75% [[phab:T261276|T261276]]', diff saved to https://phabricator.wikimedia.org/P12365 and previous config saved to /var/cache/conftool/dbconfig/20200826-131732-kormat.json
* 15:39 volans@cumin2002: START - Cookbook sre.dns.netbox
* 13:16 duesen: daniel@mwmaint1002:/srv/mediawiki/php-1.36.0-wmf.5$ mwscript maintenance/findBadBlobs.php oswiki --mark [[phab:T205936|T205936]] --revisions - < ~/T205936-oswiki-20090309200000.ids # marking known bad revisions for [[phab:T205936|T205936]]
* 14:57 moritzm: installing modsecurity-crs security updates on Buster
* 13:07 kormat@cumin1001: dbctl commit (dc=all): 'Repooling db1110 @ 50% [[phab:T261276|T261276]]', diff saved to https://phabricator.wikimedia.org/P12364 and previous config saved to /var/cache/conftool/dbconfig/20200826-130735-kormat.json
* 14:48 moritzm: installing xmlgraphics-commons security updates on Buster
* 13:06 vgutierrez: serve a synthetic warn page to DHE-RSA-AES128-SHA users - [[phab:T258405|T258405]]
* 14:46 moritzm: installing irssi security updates on Buster
* 12:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling db1110 @ 30% [[phab:T261276|T261276]]', diff saved to https://phabricator.wikimedia.org/P12363 and previous config saved to /var/cache/conftool/dbconfig/20200826-124700-kormat.json
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling db1110 @ 20% [[phab:T261276|T261276]]', diff saved to https://phabricator.wikimedia.org/P12362 and previous config saved to /var/cache/conftool/dbconfig/20200826-122059-kormat.json
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 12:12 godog: upgrade nagios-nrpe-server to 2.15-2 on jessie hosts - [[phab:T261198|T261198]]
* 14:35 moritzm: installing commons-io security updates on Buster
* 11:58 kormat@cumin1001: dbctl commit (dc=all): 'Start repooling db1110 [[phab:T261276|T261276]]', diff saved to https://phabricator.wikimedia.org/P12361 and previous config saved to /var/cache/conftool/dbconfig/20200826-115850-kormat.json
* 14:27 ema: cp3062: test higher vsl_space values [[phab:T293879|T293879]]
* 11:56 mlitn@deploy1001: Synchronized php-1.36.0-wmf.6/extensions/WikibaseMediaInfo: MediaSearchQueryBuilder should support keyword only queries (duration: 01m 00s)
* 14:27 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 11:55 mlitn@deploy1001: Synchronized php-1.36.0-wmf.5/extensions/WikibaseMediaInfo: MediaSearchQueryBuilder should support keyword only queries (duration: 01m 08s)
* 14:12 moritzm: installing ruby2.3 security updates
* 11:53 kart_: Finished manual run of ContentTranslation/scripts/purge-unpublished-drafts.php script on mwmaint1002  ([[phab:T261189|T261189]])
* 13:40 moritzm: installing apache2 security updates on buster
* 11:39 kart_: Started manual run of ContentTranslation/scripts/purge-unpublished-drafts.php script on mwmaint1002  ([[phab:T261189|T261189]])
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:29 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:599344{{!}}Enable propagateChangeVisibility for testwikidata]], part 2 (duration: 01m 03s)
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:26 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:599344{{!}}Enable propagateChangeVisibility for testwikidata]], part 1 (duration: 01m 19s)
* 13:21 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 01m 02s)
* 10:30 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:20 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 10:28 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 10:14 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 10:14 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=ats-tls
* 09:18 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=varnish-fe
* 09:14 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 12:51 ema: cp3062: bump vsl_space from 80M (default) to 512M [[phab:T293879|T293879]] - varnish restart needed
* 09:00 XioNoX: re-enable IPv6 BGP to Init7 in knams
* 12:37 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 replication broken', diff saved to https://phabricator.wikimedia.org/P12360 and previous config saved to /var/cache/conftool/dbconfig/20200826-084044-marostegui.json
* 12:36 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 08:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:14 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1091 for MCR change', diff saved to https://phabricator.wikimedia.org/P12358 and previous config saved to /var/cache/conftool/dbconfig/20200826-054557-marostegui.json
* 12:02 urbanecm@deploy1002: Finished scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]]) (duration: 25m 19s)
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1114, db1135 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12357 and previous config saved to /var/cache/conftool/dbconfig/20200826-054409-marostegui.json
* 11:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1114, db1135 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12356 and previous config saved to /var/cache/conftool/dbconfig/20200826-053345-marostegui.json
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1114, db1135 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12355 and previous config saved to /var/cache/conftool/dbconfig/20200826-052355-marostegui.json
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2007.codfw.wmnet
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1114, db1135 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12354 and previous config saved to /var/cache/conftool/dbconfig/20200826-050849-marostegui.json
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2007.codfw.wmnet
* 05:03 marostegui: Update db1135 and db1114 after MCR changes
* 11:37 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:37 urbanecm@deploy1002: Started scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]])
* 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2005.codfw.wmnet
* 11:21 moritzm: installing ffmpeg security updates
* 11:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e520fc57411bb19123766192cd636396ea6fc59d}}: GrowthExperiments: Add campaign pattern for enwiki ([[phab:T293699|T293699]]) (duration: 01m 22s)
* 11:11 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2005.codfw.wmnet
* 10:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 10:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 09:04 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS buster
* 07:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:49 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS buster
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 (s1) for reimage [[phab:T290865|T290865]]', diff saved to https://phabricator.wikimedia.org/P17552 and previous config saved to /var/cache/conftool/dbconfig/20211020-064529-marostegui.json
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS buster
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 (s1) after upgrade', diff saved to https://phabricator.wikimedia.org/P17551 and previous config saved to /var/cache/conftool/dbconfig/20211020-063926-marostegui.json
* 06:35 marostegui: Upgrade db1106
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17550 and previous config saved to /var/cache/conftool/dbconfig/20211020-063431-marostegui.json
* 06:31 dcausse: restarting blazegraph on wdqs1012
* 06:28 elukey: reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage
* 06:21 marostegui: Depool clouddb1013 for upgrade
* 06:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS buster
* 06:12 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17549 and previous config saved to /var/cache/conftool/dbconfig/20211020-061202-marostegui.json
* 06:06 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:05 XioNoX: put transport link between ulsfo and eqsin in service - [[phab:T273308|T273308]]
* 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS buster
* 05:26 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 04:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:40 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable $wgLocalHTTPProxy on group0 wikis ([[phab:T288848|T288848]]) (duration: 01m 05s)
* 01:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 tgr: west coast evening deploys done


== 2020-08-25 ==
== 2021-10-19 ==
* 21:51 mutante: xhgui1001/xhgui2001 - Unpacking xhgui (0.12.0-2-wmf1) over (0.9.0-1-wmf1) ([[phab:T260397|T260397]])
* 23:59 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732103{{!}}Reorder some wikis at wgExtraNamespaces and wmgVisualEditorAvailableNamespaces (T293846)]] (duration: 01m 02s)
* 21:50 mutante: xhgui1001 - Unpacking xhgui (0.12.0-2-wmf1) over (0.9.0-1-wmf1) ...
* 23:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:46 mutante: importing xhgui 0.12.0-2-wmf1 to buster-wikimedia APT repo ([[phab:T260397|T260397]])
* 23:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:40 dcausse@deploy1001: Finished deploy [wikimedia/discovery/analytics@125cb6d]: test: Add wikidata ttl import (duration: 00m 54s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732053{{!}}ruwikiversity: Add 'portal' and 'faculty' namespaces (T293545)]] (duration: 01m 03s)
* 19:39 dcausse@deploy1001: Started deploy [wikimedia/discovery/analytics@125cb6d]: test: Add wikidata ttl import
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:15 marxarelli: 1.36.0-wmf.6 promoted to group0 ([[phab:T257974|T257974]]). no new errors
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:09 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.6
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:710565{{!}}Set the project namespace and sitename for Javanese Wikipedia and Wiktionary (T287437)]] (duration: 01m 02s)
* 19:05 moritzm: installing Java security updates on cloudelastic* hosts
* 23:23 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731953{{!}}Create Portal and Portal talk namespace for shiwiki (T288909)]] (duration: 01m 03s)
* 19:02 moritzm: installing Java security updates on elastic* hosts
* 23:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-reload
* 23:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:58 dduvall@deploy1001: Finished scap: testwikis wikis to 1.36.0-wmf.6 (duration: 41m 58s)
* 23:13 tgr@deploy1002: Synchronized static: Config: [[gerrit:731231{{!}}Repair the size of the logo of Kashmiri Wikipedia (T293342)]] (duration: 02m 14s)
* 17:30 dcausse@deploy1001: Finished deploy [wikimedia/discovery/analytics@bc2f7f1]: test: Add wikidata ttl import (duration: 01m 52s)
* 21:34 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete {{!}} fixed Icinga alert:  RECOVERY - Check for large files in client bucket on mwmaint1002 is OK: OK: [[phab:T165885|T165885]]
* 17:28 dcausse@deploy1001: Started deploy [wikimedia/discovery/analytics@bc2f7f1]: test: Add wikidata ttl import
* 21:32 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete
* 17:17 dduvall@deploy1001: Started scap: testwikis wikis to 1.36.0-wmf.6
* 20:56 ejegg: updated payments-wiki from {{Gerrit|0f48acea49}} to {{Gerrit|30e596903d}}
* 17:08 dduvall@deploy1001: Pruned MediaWiki: 1.36.0-wmf.4 (duration: 01m 40s)
* 19:03 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 17:01 dduvall@deploy1001: Pruned MediaWiki: 1.36.0-wmf.3 (duration: 19m 12s)
* 18:46 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/MediaSearch/: {{Gerrit|a84a675}}: {{Gerrit|3231578}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 17:01 herron: imported logstash, elasticsearch, and kibana 7.9.0 -oss packages into buster-wikimedia thirdparty/elastic79
* 18:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/: {{Gerrit|694580a}}: {{Gerrit|c02e301}}: MediaSearch backports([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 16:42 dcausse@deploy1001: Finished deploy [wikimedia/discovery/analytics@89b4f74]: test: Add wikidata ttl import (duration: 00m 49s)
* 18:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:41 dcausse@deploy1001: Started deploy [wikimedia/discovery/analytics@89b4f74]: test: Add wikidata ttl import
* 18:30 foks: deleting 1 more email with deleteUserEmail.php
* 16:21 shdubsh: restart logstash on logstash1007 -- gc duration outlier
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1476a2d93}}: {{Gerrit|dd8393c1a0}}: foundationwiki: Restrict sensitive namespaces to editor group ([[phab:T205350|T205350]]) (duration: 01m 03s)
* 16:08 dcausse@deploy1001: Finished deploy [wikimedia/discovery/analytics@ae6dd8d]: test: Add wikidata ttl import (duration: 00m 54s)
* 18:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:07 dcausse@deploy1001: Started deploy [wikimedia/discovery/analytics@ae6dd8d]: test: Add wikidata ttl import
* 18:12 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9a2893c7190e615a247674dbf7f87348bf43b91c}}: Enable topic subscriptions as a beta feature on all remaining projects ([[phab:T287802|T287802]]) (duration: 01m 04s)
* 16:00 gehel: repool wdqs1005 - catched up on lag
* 18:00 legoktm@deploy1002: Synchronized wmf-config/: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (2/2) (duration: 01m 06s)
* 15:47 elukey: restart mariadb@analytics_meta on db1108 to apply a replication filter (exclude superset_staging database from replication)
* 17:59 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (1/2) (duration: 01m 05s)
* 15:44 jgleeson: fundraising-tools updated from {{Gerrit|dcad0bfe75}} to {{Gerrit|3fe3a23114}}
* 17:57 foks: removing six email addresses on request (with deleteUserEmail.php)
* 15:41 dcausse@deploy1001: Finished deploy [wikimedia/discovery/analytics@cbf2f9d]: Add wikidata ttl import (duration: 01m 38s)
* 17:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 15:39 dcausse@deploy1001: Started deploy [wikimedia/discovery/analytics@cbf2f9d]: Add wikidata ttl import
* 17:25 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 15:22 liw: testing upcoming Scap release on beta
* 17:11 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 14:56 moritzm: installing rake security updates on stretch
* 17:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 14:56 moritzm: installing take security updates on stretch
* 16:48 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:36 oblivian@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 16:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:32 volker-e@deploy1001: Finished deploy [design/style-guide@e3fda83]: Deploy design/style-guide:  (duration: 00m 05s)
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:32 volker-e@deploy1001: Started deploy [design/style-guide@e3fda83]: Deploy design/style-guide:
* 16:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 14:26 XioNoX: disable IPv6 BGP to Init7 in knams
* 16:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 14:10 andrew@deploy1001: Finished deploy [horizon/deploy@7a3221d]: add hostname checking --bug [[phab:T207538|T207538]] (duration: 03m 50s)
* 16:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 14:06 andrew@deploy1001: Started deploy [horizon/deploy@7a3221d]: add hostname checking --bug [[phab:T207538|T207538]]
* 16:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114 for MCR change', diff saved to https://phabricator.wikimedia.org/P12347 and previous config saved to /var/cache/conftool/dbconfig/20200825-135248-marostegui.json
* 16:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'fully repool db1111 MCR changes', diff saved to https://phabricator.wikimedia.org/P12346 and previous config saved to /var/cache/conftool/dbconfig/20200825-134736-marostegui.json
* 16:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1111 MCR changes', diff saved to https://phabricator.wikimedia.org/P12345 and previous config saved to /var/cache/conftool/dbconfig/20200825-133734-marostegui.json
* 16:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1111 MCR changes', diff saved to https://phabricator.wikimedia.org/P12344 and previous config saved to /var/cache/conftool/dbconfig/20200825-132027-marostegui.json
* 16:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 13:17 moritzm: installing firejail security updates on remaining mw* servers in eqiad
* 15:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 12:56 godog: upgrade nagios-nrpe-server on scb2* and mwlog* - [[phab:T261198|T261198]]
* 15:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1111 MCR changes', diff saved to https://phabricator.wikimedia.org/P12343 and previous config saved to /var/cache/conftool/dbconfig/20200825-125108-marostegui.json
* 15:40 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - remove now redundant stream setting - [[phab:T277193|T277193]] (duration: 01m 04s)
* 12:45 marostegui: Update MySQL on db1111 after MCR change
* 15:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 12:39 marostegui: alter table sites on s6, directly on the primary master [[phab:T260476|T260476]]
* 15:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 12:39 godog: test nagios-nrpe-server with dh 2048 on scb2001 - [[phab:T261198|T261198]]
* 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 12:35 moritzm: imported ceph packages from stretch-backports to component/ceph [[phab:T256877|T256877]]
* 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 12:10 moritzm: installing ruby-json security updates
* 15:32 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 MCR change', diff saved to https://phabricator.wikimedia.org/P12341 and previous config saved to /var/cache/conftool/dbconfig/20200825-120708-marostegui.json
* 15:32 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1118 MCR changes', diff saved to https://phabricator.wikimedia.org/P12340 and previous config saved to /var/cache/conftool/dbconfig/20200825-120211-marostegui.json
* 15:30 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 11:59 jayme@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 15:28 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118 MCR changes', diff saved to https://phabricator.wikimedia.org/P12339 and previous config saved to /var/cache/conftool/dbconfig/20200825-114938-marostegui.json
* 15:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 11:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118 MCR changes', diff saved to https://phabricator.wikimedia.org/P12338 and previous config saved to /var/cache/conftool/dbconfig/20200825-113758-marostegui.json
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:36 hnowlan@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 11:32 hnowlan@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 14:34 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118 MCR changes', diff saved to https://phabricator.wikimedia.org/P12337 and previous config saved to /var/cache/conftool/dbconfig/20200825-112859-marostegui.json
* 14:29 jbond: disable puppet on lvs, cp, authdns, mc, mw-be and wcqs to while i merge G:662699
* 11:25 marostegui: Upgrade mysql on db1118 after MCR change
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:16 Urbanecm: EU B&C done
* 14:11 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 45m 13s)
* 11:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d869e308492ee72cb3d1998b15409aa44a4af9c7}}: Enable ContentTranslation as a default tool in Assamese and Burmese WPs ([[phab:T258503|T258503]]; [[phab:T258505|T258505]]) (duration: 01m 00s)
* 13:52 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 10:59 moritzm: installing remaining libx11 security updates
* 13:45 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:37 arturo: import all binary packages from tesseract-ocr-lang into stretch-wikimedia/component/tesseract-410-bpo ([[phab:T247422|T247422]])
* 13:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:32 hnowlan@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:28 oblivian@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 13:26 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 10:28 oblivian@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' .
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17547 and previous config saved to /var/cache/conftool/dbconfig/20211019-131927-root.json
* 10:23 hnowlan@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17546 and previous config saved to /var/cache/conftool/dbconfig/20211019-131651-root.json
* 10:23 moritzm: removed fermium.wikimedia.org from debmonitor
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17545 and previous config saved to /var/cache/conftool/dbconfig/20211019-130424-root.json
* 09:45 marostegui: Create missing table cx_notification_log on x1 wikishared [[phab:T261190|T261190]]
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17544 and previous config saved to /var/cache/conftool/dbconfig/20211019-130147-root.json
* 08:50 XioNoX: re-activate eqord peering/transit - [[phab:T259593|T259593]]
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17543 and previous config saved to /var/cache/conftool/dbconfig/20211019-124920-root.json
* 08:19 XioNoX: reconfigure eqord to be AS65020 - [[phab:T259593|T259593]]
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17542 and previous config saved to /var/cache/conftool/dbconfig/20211019-124644-root.json
* 08:18 XioNoX: deactivate eqord peering/transit - [[phab:T259593|T259593]]
* 12:40 moritzm: installing aftpd security updates
* 07:22 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1)
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17541 and previous config saved to /var/cache/conftool/dbconfig/20211019-123416-root.json
* 07:13 marostegui: Upgrade MySQL on dbstore1004
* 12:34 marostegui: Upgrade dbstore1003
* 07:09 dcausse: depooling wdqs1005 (high lag)
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17540 and previous config saved to /var/cache/conftool/dbconfig/20211019-123140-root.json
* 07:04 dcausse: restartint blazegraph on wdqs1005 ([[phab:T242453|T242453]])
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17539 and previous config saved to /var/cache/conftool/dbconfig/20211019-121913-root.json
* 06:20 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17538 and previous config saved to /var/cache/conftool/dbconfig/20211019-121636-root.json
* 05:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1111, db1118 for MCR change', diff saved to https://phabricator.wikimedia.org/P12336 and previous config saved to /var/cache/conftool/dbconfig/20200825-053856-marostegui.json
* 12:12 XioNoX: push anycast tuning to all Lumen and NTT transit links - [[phab:T288843|T288843]]
* 05:38 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1084,db1092 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12335 and previous config saved to /var/cache/conftool/dbconfig/20200825-053801-marostegui.json
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1167 (s8) after upgrade', diff saved to https://phabricator.wikimedia.org/P17537 and previous config saved to /var/cache/conftool/dbconfig/20211019-120918-marostegui.json
* 05:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084,db1092 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12334 and previous config saved to /var/cache/conftool/dbconfig/20200825-052602-marostegui.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17536 and previous config saved to /var/cache/conftool/dbconfig/20211019-120458-marostegui.json
* 05:21 moritzm: installing Java security updates on relforge*
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17535 and previous config saved to /var/cache/conftool/dbconfig/20211019-120409-root.json
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084,db1092 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12333 and previous config saved to /var/cache/conftool/dbconfig/20200825-051327-marostegui.json
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17534 and previous config saved to /var/cache/conftool/dbconfig/20211019-120348-root.json
* 05:11 marostegui: Remove revisions triggers from db2094:3311 [[phab:T238966|T238966]]
* 12:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/WikibaseMediaInfo/: {{Gerrit|ec0125770775c1a1a54c3b592d86d287fd9e3ad6}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 55s)
* 05:10 marostegui: Deploy MCR schema change on s1 codfw, this will create lag on s1 codfw - [[phab:T238966|T238966]]
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17533 and previous config saved to /var/cache/conftool/dbconfig/20211019-120132-root.json
* 05:04 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084,db1092 after MCR changes', diff saved to https://phabricator.wikimedia.org/P12332 and previous config saved to /var/cache/conftool/dbconfig/20200825-050451-marostegui.json
* 12:00 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikibaseMediaInfo/: {{Gerrit|79808a90a95dd5dac2b532b87fb7ec1a490ea0f0}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 56s)
* 04:02 ejegg: updated fundraising python tools from {{Gerrit|305f2a4438}} to {{Gerrit|dcad0bfe75}}
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17532 and previous config saved to /var/cache/conftool/dbconfig/20211019-120024-root.json
* 01:49 eileen: civicrm revision changed from {{Gerrit|ce28723709}} to {{Gerrit|0f195c6cca}}, config revision is {{Gerrit|96839009f1}}
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:39 eileen: civicrm revision is {{Gerrit|ce28723709}}, config revision is {{Gerrit|96839009f1}}
* 11:56 XioNoX: push anycast tuning to Tele2, Init7, DT transit links - [[phab:T288843|T288843]]
* 01:30 eileen: civicrm revision is {{Gerrit|ce28723709}}, config revision is {{Gerrit|54c8c7abf2}}
* 11:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:17 cdanis: repool esams
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17531 and previous config saved to /var/cache/conftool/dbconfig/20211019-114844-root.json
* 01:11 cdanis: [[phab:T259621|T259621]] wrong junos version was staged on cr2-esams, abandoning this attempt and putting back in service
* 11:46 marostegui: Upgrade db1105 (s1,s2)
* 01:07 cdanis: cdanis@re0.cr2-esams> request system software add validate re1 /var/tmp/junos-vmhost-install-mx-x86-64-17.3R3-S8.1.tgz
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105 (s1,s2) for upgrade', diff saved to https://phabricator.wikimedia.org/P17530 and previous config saved to /var/cache/conftool/dbconfig/20211019-114649-marostegui.json
* 00:56 cdanis: [[phab:T259621|T259621]] ❌cdanis@cumin1001.eqiad.wmnet ~ 🕘🍺 homer 'cr*' commit 'drain cr2-esams transport link'
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17529 and previous config saved to /var/cache/conftool/dbconfig/20211019-114520-root.json
* 00:36 cdanis: [[phab:T259621|T259621]] cdanis@re1.cr3-esams> request chassis routing-engine master switch
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17527 and previous config saved to /var/cache/conftool/dbconfig/20211019-113340-root.json
* 00:30 cdanis: [[phab:T259621|T259621]] cdanis@re1.cr3-esams> request vmhost reboot re0
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17526 and previous config saved to /var/cache/conftool/dbconfig/20211019-113017-root.json
* 00:24 cdanis: [[phab:T259621|T259621]] cdanis@re1.cr3-esams> request vmhost software add /var/tmp/junos-vmhost-install-mx-x86-64-17.3R3-S8.1.tgz re0
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17525 and previous config saved to /var/cache/conftool/dbconfig/20211019-111837-root.json
* 00:18 cdanis: [[phab:T259621|T259621]] cdanis@re0.cr3-esams> request chassis routing-engine master switch
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17524 and previous config saved to /var/cache/conftool/dbconfig/20211019-111513-root.json
* 00:14 cdanis: [[phab:T259621|T259621]] cdanis@re0.cr3-esams> request vmhost reboot re1
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:08 cdanis: [[phab:T259621|T259621]] cdanis@re0.cr3-esams> request vmhost software add /var/tmp/junos-vmhost-install-mx-x86-64-17.3R3-S8.1.tgz re1
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7c31b04e50101a60db7ae8acae64bc031f5e1007}}: DPL: Explicitly note it is not possible to enable DPL on any more wikis (duration: 00m 55s)
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17523 and previous config saved to /var/cache/conftool/dbconfig/20211019-110333-root.json
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17522 and previous config saved to /var/cache/conftool/dbconfig/20211019-110009-root.json
* 10:56 marostegui: Upgrade clouddb1021
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:51 moritzm: failover master in ganeti-test to ganeti2026
* 10:50 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS stretch
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17521 and previous config saved to /var/cache/conftool/dbconfig/20211019-104829-root.json
* 10:45 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17520 and previous config saved to /var/cache/conftool/dbconfig/20211019-104506-root.json
* 10:38 oblivian@deploy1002: Synchronized w/static.php: Config: [[gerrit:730182{{!}}static.php: Add support for /static/current rewrites (take 2) (T285232)]] (duration: 00m 55s)
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 10:37 marostegui: Upgrade db1101 (s7,s8)
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101 (s7,s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17519 and previous config saved to /var/cache/conftool/dbconfig/20211019-103634-marostegui.json
* 10:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:29 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 10:28 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 10:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:22 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:22 godog: flip mw statsd traffic with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/731918 - [[phab:T247963|T247963]]
* 10:21 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS stretch
* 10:16 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:52 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:44 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.3 (duration: 01m 39s)
* 09:42 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.2 (duration: 16m 06s)
* 09:37 godog: move graphite/statsd writes to graphite2003 - [[phab:T247963|T247963]]
* 09:34 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:27 hashar: sap clean --delete 1.38.0-wmf.2 && scap clean --delete 1.38.0-wmf.3  # [[phab:T281169|T281169]]
* 09:27 hashar: Cloned and applied security patches for 1.38.0-wmf.5 # [[phab:T281169|T281169]]
* 09:19 marostegui: Stop slave on db2112 [[phab:T290865|T290865]]
* 09:18 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:18 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:03 XioNoX: push anycast tuning to all Telia transit links - [[phab:T288843|T288843]]
* 08:50 godog: point graphite.discovery.wmnet to graphite2003 - [[phab:T247963|T247963]]
* 08:40 XioNoX: push prep-work for anycast tuning to all sites - [[phab:T288843|T288843]]
* 08:33 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:33 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:32 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php hrwiki --fix
* 08:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift-ro
* 08:03 XioNoX: push prep-work for anycast tuning in ulsfo (try 2) - [[phab:T288843|T288843]]
* 08:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:32 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:24 ema: A:cp start rolling varnish upgrades to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17517 and previous config saved to /var/cache/conftool/dbconfig/20211019-072111-root.json
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17516 and previous config saved to /var/cache/conftool/dbconfig/20211019-071519-root.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17515 and previous config saved to /var/cache/conftool/dbconfig/20211019-070607-root.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17514 and previous config saved to /var/cache/conftool/dbconfig/20211019-070016-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17513 and previous config saved to /var/cache/conftool/dbconfig/20211019-065104-root.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17512 and previous config saved to /var/cache/conftool/dbconfig/20211019-064512-root.json
* 06:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17511 and previous config saved to /var/cache/conftool/dbconfig/20211019-063559-root.json
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17510 and previous config saved to /var/cache/conftool/dbconfig/20211019-063008-root.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17509 and previous config saved to /var/cache/conftool/dbconfig/20211019-062054-root.json
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17508 and previous config saved to /var/cache/conftool/dbconfig/20211019-061505-root.json
* 06:06 marostegui: Upgrade dbstore1005
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17507 and previous config saved to /var/cache/conftool/dbconfig/20211019-060551-root.json
* 06:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 06:03 marostegui: Upgrade db1184, db1178
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1178 for upgrade', diff saved to https://phabricator.wikimedia.org/P17506 and previous config saved to /var/cache/conftool/dbconfig/20211019-060123-marostegui.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17505 and previous config saved to /var/cache/conftool/dbconfig/20211019-060001-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1184 for upgrade', diff saved to https://phabricator.wikimedia.org/P17504 and previous config saved to /var/cache/conftool/dbconfig/20211019-055429-marostegui.json
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 05:46 marostegui: Reimage db2112 (s1 codfw master) [[phab:T290865|T290865]]
* 04:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 03:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:21 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:18 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:38 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer


== 2020-08-24 ==
== 2021-10-18 ==
* 23:46 cdanis: depool esams [[phab:T259621|T259621]]
* 23:40 hoo: Updated the Wikidata property suggester with data from the 2021-10-04 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 23:16 Urbanecm: Evening B&C window done
* 23:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b654980240d51fff3c6e9c48f7076d4609c2560f}}: Create an alias for the Draft namespace on hrwiki ([[phab:T291755|T291755]]) (duration: 00m 56s)
* 23:06 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|778f710bbbdb24730f7ce4c75d5ff1ca7a5ce3b3}}: Alternate configuration mechanism for Parsoid ([[phab:T241961|T241961]]) (duration: 00m 58s)
* 23:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:13 rzl@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:10 rzl@cumin1001: START - Cookbook sre.hosts.decommission
* 23:12 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=thwiktionary --fix # [[phab:T291761|T291761]]
* 21:29 sbassett@deploy1001: Synchronized private/PrivateSettings.php: Deployed additional mitigations for [[phab:T257687|T257687]] (duration: 00m 58s)
* 23:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|abe777d28594da852e49ccb1c1597b2598f3e483}}: Create Rhymes namespace for thwiktionary ([[phab:T291761|T291761]]) (duration: 00m 57s)
* 20:29 rzl: re-enabled puppet on 'R:File = /etc/nutcracker/nutcracker.yml' [[phab:T261154|T261154]]
* 23:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:25 rzl: disabling puppet on 'R:File = /etc/nutcracker/nutcracker.yml' to swap mc2028 out for mc2037 [[phab:T261154|T261154]]
* 23:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:10 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrus: Increase weight of grants and research namespaces in metawiki search (duration: 00m 58s)
* 22:56 legoktm@deploy1002: Synchronized php-1.38.0-wmf.4/includes/http/MWHttpRequest.php: Allow using a reverse proxy for local HTTP requests ([[phab:T288848|T288848]]) (duration: 00m 56s)
* 15:20 jynus: shutdown backup2001 [[phab:T260764|T260764]]
* 22:06 maryum: deployed security patch for [[phab:T293589|T293589]]
* 15:13 hnowlan@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 21:23 maryum: deployed security patch for [[phab:T293556|T293556]]
* 15:08 hnowlan@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 21:05 mutante: mwmaint1002 - sudo -u www-data /usr/local/bin/mw-cli-wrapper /usr/local/bin/mwscript extensions/TranslationNotifications/scripts/DigestEmailer.php --wiki mediawikiwiki {{!}} Fatal error: Uncaught Error: Class 'MediaWiki\MediaWikiServices' not found
* 15:04 vgutierrez: rolling restart of ats-tls to disable ECDHE-RSA-AES128-SHA - [[phab:T258405|T258405]]
* 20:58 mutante: mwmaint1002 - attempt to start mediawiki_job_translationnotifications-mediawikiwiki which was alerting as failed
* 14:58 hnowlan@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:55 rzl: switchover test complete, puppet re-enabled on cumin1001
* 20:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:54 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0)
* 19:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-update-tendril
* 19:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:53 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
* 19:29 mutante: LDAP: removed non-existent user gerrit2 from group labsadminbots ([[phab:T160122|T160122]])
* 14:52 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 19:29 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/resources/store/state.js: {{Gerrit|ac7b4fc2ccc69589e00a42f49d18a8f6d71777f2}}: Revert 727328 ([[phab:T293554|T293554]]) (duration: 00m 56s)
* 14:52 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0)
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:52 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:48 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99)
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:48 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:47 godog: powercycle ganeti5002 -- host down and nothing in console
* 18:45 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Convert $wgEventStreams to be an associative array - [[phab:T277193|T277193]] (duration: 00m 57s)
* 14:43 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
* 18:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:43 rzl@cumin1001: [DRY-RUN] MediaWiki read-only period ends at: 2020-08-24 14:43:35.570234
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:43 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 18:07 mutante: gerrit - removed tonina from wmde-mediawiki gerrit group ([[phab:T293621|T293621]])
* 14:43 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
* 17:51 mutante: puppet run on all bastion hosts via cumin
* 14:43 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
* 15:32 mvernon@cumin2002: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 14:43 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0)
* 15:32 mvernon@cumin2002: START - Cookbook sre.discovery.service-route
* 14:43 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions
* 15:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 14:43 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0)
* 15:23 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 14:43 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki
* 15:16 herron: reprepro copied anycast-healthchecker, python3-json-logger and python3-anycast-healthchecker from buster-wikimedia to bullseye-wikimedia [[phab:T292196|T292196]]
* 14:42 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=99)
* 15:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 14:42 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
* 15:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 14:42 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0)
* 14:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 14:41 rzl@cumin1001: [DRY-RUN] MediaWiki read-only period starts at: 2020-08-24 14:41:55.754938
* 14:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 14:41 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
* 14:54 herron: rebuilt and uploaded kafkatee for bullseye [[phab:T292196|T292196]]
* 14:41 dcausse: creating cirrus indices for lldwiki
* 14:50 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
* 14:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:39 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
* 14:36 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731346{{!}}[beta] Rename $wgIPInfoGeoIP2Path to $wgIPInfoGeoIP2Prefix (T289361)]] (duration: 00m 56s)
* 14:38 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:28 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:28 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:28 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 14:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:24 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99)
* 13:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:24 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:24 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
* 13:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (2/2) (duration: 00m 56s)
* 14:24 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
* 13:47 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (1/2) (duration: 00m 56s)
* 14:22 moritzm: installing libexif security updates on stretch
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:18 rzl: disabling puppet on cumin1001 and starting a test of the DC switchover automation, expect some SAL noise but no production impact
* 13:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:731014{{!}}Unconditionally enable Wikibase dispatching via jobs (T291828)]] (duration: 00m 56s)
* 14:08 duesen: Deployed patch for [[phab:T260485|T260485]]
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:59 marostegui: Stop mysql on db1117:3325 to clone db1128 - [[phab:T260324|T260324]]
* 12:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2079.codfw.wmnet with OS buster
* 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1092 for MCR change', diff saved to https://phabricator.wikimedia.org/P12327 and previous config saved to /var/cache/conftool/dbconfig/20200824-135538-marostegui.json
* 12:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1101:3318 after MCR change', diff saved to https://phabricator.wikimedia.org/P12326 and previous config saved to /var/cache/conftool/dbconfig/20200824-133032-marostegui.json
* 12:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:55 Lucas_WMDE: UTC morning backport window done
* 13:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 11:55 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (2/2) (duration: 00m 56s)
* 13:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P12325 and previous config saved to /var/cache/conftool/dbconfig/20200824-131305-marostegui.json
* 11:54 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (1/2) (duration: 00m 56s)
* 13:05 moritzm: installing imagemagick security updates on stretch
* 11:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P12323 and previous config saved to /var/cache/conftool/dbconfig/20200824-130024-marostegui.json
* 11:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2079.codfw.wmnet with OS buster
* 12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P12322 and previous config saved to /var/cache/conftool/dbconfig/20200824-125131-marostegui.json
* 11:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 for MCR change', diff saved to https://phabricator.wikimedia.org/P12321 and previous config saved to /var/cache/conftool/dbconfig/20200824-122848-marostegui.json
* 11:49 marostegui: Reimage db2079 (codfw s8 master) [[phab:T290868|T290868]]
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1105:3311 after MCR change', diff saved to https://phabricator.wikimedia.org/P12320 and previous config saved to /var/cache/conftool/dbconfig/20200824-122752-marostegui.json
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730747{{!}}Set dispatchViaJobsAllowedClients to null everywhere (T291828)]] (duration: 00m 56s)
* 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P12319 and previous config saved to /var/cache/conftool/dbconfig/20200824-122050-marostegui.json
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P12318 and previous config saved to /var/cache/conftool/dbconfig/20200824-121200-marostegui.json
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P12317 and previous config saved to /var/cache/conftool/dbconfig/20200824-120310-marostegui.json
* 11:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731239{{!}}Make deduplication actually work for DispatchChangesJob (T291118)]] (duration: 00m 55s)
* 12:01 Urbanecm: EU B&C window completed
* 11:10 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/Hooks/RecentChangeSaveHookHandler.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (2/2) (duration: 00m 56s)
* 12:01 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|8c380d65d760591099c296ae522b2e63953413aa}}: Enable tewiki as import source for tewikibooks ([[phab:T260107|T260107]]) (duration: 00m 57s)
* 11:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:58 XioNoX: test advertise CF tunnel endpoint on cr1-eqiad - [[phab:T259036|T259036]]
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (duration: 00m 56s)
* 11:57 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|5a6d025b04eb20787e8abbbdd56a3abb3818b82f}}: Add retrobibliothek.de to the wgCopyUploadsDomains allowlist of Wikimedia Commons ([[phab:T261012|T261012]]) (duration: 00m 56s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:50 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e1ae39afbb4d6f33e74782580db7dfee06d0097d}}: Enable mapframe at trwiki ([[phab:T260594|T260594]]) (duration: 00m 58s)
* 10:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:43 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.5/extensions/WikimediaEvents/modules/ext.wikimediaEvents/searchSatisfaction.js: {{Gerrit|1066ecbe2836e69211c905f597ad6b62241528c0}}: Enable MediaSearch A/B test ([[phab:T254388|T254388]]) (duration: 00m 56s)
* 10:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.5/extensions/ContentTranslation/modules/publish/ext.cx.wikibase.link.js: {{Gerrit|74a87184408937bcdb4a27f1f563bbbdff45cf97}}: Publish: Fix broken wikidata linking ([[phab:T249458|T249458]]) (duration: 00m 58s)
* 10:47 moritzm: copied wmf-certificates from buster-wikimedia to stretch-wikimedia in reprepro
* 11:39 Urbanecm: Purge 13 URLs with purgeList.php, see P12316 for list of them ([[phab:T260908|T260908]]; [[phab:T258552|T258552]]; [[phab:T261076|T261076]]; [[phab:T261110|T261110]])
* 10:38 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:731237{{!}}Don't filter by change Id when dispatching to client wikis ()]] (duration: 00m 59s)
* 11:34 hnowlan@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 09:48 moritzm: installing node-tar security updates on buster
* 11:32 hnowlan@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 09:39 vgutierrez: updating acme-chief to version 0.34 on acmechief instances - [[phab:T292619|T292619]]
* 11:32 arturo: add liblept5 1.76.0-1~bpo9+1 (and leptonica-progs) to stretch-wikimedia/component/tesseract-410-bpo ([[phab:T247422|T247422]])
* 09:38 godog: sync metrics from graphite1004 to graphite2003 - [[phab:T247963|T247963]]
* 11:30 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|fe0449d244ee876e4fb64da630f0994ab114f248}}: {{Gerrit|74220d0943e6b32cce3c93dd5b9f8bbc63fa5d73}}: {{Gerrit|7db8a19c512cea84f3000463e9dfb6617857c9a6}}: Update Chinese wordmarks and taglines, update zhwikisource project logo ([[phab:T260908|T260908]]; [[phab:T258552|T258552]]; [[phab:T261076|T261076]]; [[phab:T261110|T261110]]) (duration: 00m 59s)
* 09:13 moritzm: installing apr security updates on bullseye
* 11:29 urbanecm@deploy1001: Synchronized static/images/: {{Gerrit|fe0449d244ee876e4fb64da630f0994ab114f248}}: {{Gerrit|74220d0943e6b32cce3c93dd5b9f8bbc63fa5d73}}: {{Gerrit|7db8a19c512cea84f3000463e9dfb6617857c9a6}}: Update Chinese wordmarks and taglines, update zhwikisource project logo ([[phab:T260908|T260908]]; [[phab:T258552|T258552]]; [[phab:T261076|T261076]]; [[phab:T261110|T261110]]) (duration: 00m 58s)
* 08:57 godog: cleanup graphite metrics not modified for >= ~3yr (1024 days)
* 11:21 hnowlan@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 07:34 ema: cp3060 (text), cp3061 (upload): upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 10:46 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:622116{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 07:34 elukey: depool + restart blazegraph on wdqs1013
* 10:45 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:622116{{!}} Bumping portals to master (T128546)]] (duration: 01m 00s)
* 07:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:43 moritzm: installing ruby2.3 security updates
* 06:31 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:12 moritzm: installing firejail security updates on mw canaries
* 06:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:58 oblivian@cumin1001: conftool action : set/weight=1; selector: dc=codfw,cluster=appserver,service=canary
* 09:46 XioNoX: add PNI to CF on cr1-eqiad with import/export NONE - [[phab:T259036|T259036]]
* 09:18 moritzm: restarting mw canaries to pick up libx11 update
* 09:13 moritzm: installing libx11 security updates on stretch
* 09:10 vgutierrez: repool cp5002
* 09:08 _joe_: restarting php-fpm on mw1344 (stuck in SIGILL for new children)
* 09:00 vgutierrez: restart ats-tls on cp5002
* 08:54 moritzm: installing net-snmp security updates on buster
* 08:52 ema: depool cp5002 due to icinga errors
* 08:24 moritzm: installing json-c security updates on buster
* 07:36 XioNoX: push new pfw policies - [[phab:T261007|T261007]]
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3318, db1105:3311 for MCR change', diff saved to https://phabricator.wikimedia.org/P12315 and previous config saved to /var/cache/conftool/dbconfig/20200824-052916-marostegui.json


== 2020-08-23 ==
== 2021-10-16 ==
* 20:38 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 03:56 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:36 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 02:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 20:18 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 01:30 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:15 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 20:02 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:00 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 11:23 gehel: repool wdqs1006 - catched up on lag


== 2020-08-22 ==
== 2021-10-15 ==
* 19:33 ryankemper: depooled wdqs1006 (still has 2.5 hours to catch up on)
* 23:48 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:31 ryankemper: pooled wdqs1006 now that lag has dissipated
* 23:27 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 07:36 gehel: restart blazegraph on wdqs1006 + depool to catchup on lag
* 23:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 05:24 legoktm: legoktm@mwmaint1002:~$ echo "https://releases.wikimedia.org/mediawiki/1.35/" {{!}} mwscript purgeList.php --wiki=aawiki
* 22:38 mutante: apt1001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 22:36 mutante: apt2001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 22:34 mutante: apt2001 - upgraded nginx
* 22:18 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:14 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:05 dpifke@deploy1002: Finished deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes (duration: 00m 05s)
* 22:05 dpifke@deploy1002: Started deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes
* 21:51 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:44 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:36 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:09 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 17:20 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 17:17 mutante: gitlab1001 - disabling puppet for debugging
* 17:05 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold - [[phab:T283076|T283076]]
* 17:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:50 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold
* 16:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:44 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:48 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:15 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:30 elukey: start topic rebalancing for kafka main-eqiad (long maintenance, it will last a couple of days)
* 13:24 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:21 vgutierrez: updating acme-chief to version 0.34 on acmechief-test instances - [[phab:T292619|T292619]]
* 13:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:14 vgutierrez: upload acme-chief 0.34 to apt.wikimedia.org (buster) - [[phab:T292619|T292619]]
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2007.codfw.wmnet
* 11:45 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:33 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2007.codfw.wmnet
* 11:14 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 09:15 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 09:06 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 08:58 jelto: jelto@gitlab1001:~$ sudo disable-puppet "disable puppet on gitlab1001 to test 728380 on GitLab replica - [[phab:T283076|T283076]]"
* 07:41 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:20 urbanecm: Start server-side upload for 1 video file
* 02:14 ryankemper: [[phab:T288231|T288231]] `wdqs2006` data transfer complete and all tests passing on the host. All of `codfw wdqs-internal` is on the new streaming updater
* 00:09 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:07 brennen: end of UTC late backport & config training window


== 2020-08-21 ==
== 2021-10-14 ==
* 17:43 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:59 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 17:39 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 23:58 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 17:00 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:56 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 56s)
* 16:58 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 23:49 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 16:41 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:48 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 16:39 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 23:46 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 56s)
* 16:17 zpapierski@deploy1001: Finished deploy [search/mjolnir/deploy@c80e2e7]: .. redeploy after theory verification (duration: 00m 50s)
* 23:43 ejegg: updated payments-wiki from {{Gerrit|19d18c1852}} to {{Gerrit|0f48acea49}}
* 16:16 zpapierski@deploy1001: Started deploy [search/mjolnir/deploy@c80e2e7]: .. redeploy after theory verification
* 23:34 cjming@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikimediaEvents/includes/VectorPrefDiffInstrumentation.php: Backport: [[gerrit:730733{{!}}Change VectorPrefDiffInstrumentation stream name to `mediawiki.skin_diff` (T289622)]] (duration: 00m 56s)
* 16:15 zpapierski@deploy1001: deploy aborted: .. (duration: 00m 01s)
* 23:24 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730936{{!}}allow sysops to add and remove users to other groups on ptwikivoyage (T292806)]] (duration: 00m 56s)
* 16:15 zpapierski@deploy1001: Started deploy [search/mjolnir/deploy@c80e2e7]: ..
* 23:21 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 13:25 jayme@cumin1001: conftool action : set/pooled=True; selector: dnsdisc=termbox,name=codfw
* 23:11 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730933{{!}}Add americanantiquarian.org to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T292918)]] (duration: 00m 57s)
* 13:25 jayme@cumin1001: conftool action : set/pooled=False; selector: dnsdisc=termbox,name=codfw
* 23:11 mutante: mw1452 - re-pooled, scap pull
* 09:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 23:09 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 09:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 22:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 09:02 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 22:35 ryankemper: [[phab:T288231|T288231]] Ran puppet on `wdqs2006`, now back to the cookbook run
* 09:01 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 22:33 ryankemper: [[phab:T288231|T288231]] Forgot about running puppet-agent on `wdqs2006`; aborted cookbook run
* 01:51 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:33 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 01:49 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 22:33 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 01:15 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:32 ryankemper: [[phab:T288231|T288231]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/730795; proceeding to data-transfer on `wdqs2006`: `sudo rm -fv /srv/wdqs/data_loaded` on `wdqs2006` followed by `ryankemper@cumin1001:~$ sudo cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "streaming updater cutover for wdqs2005" --blazegraph_instance blazegraph --task-id [[phab:T288231|T288231]]`
* 01:13 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 22:31 mutante: depooling mw1452 for testig
* 22:28 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo pool`: transfer completed successfully; tests passing on host (used `ssh -L 9999:localhost:80 wdqs2005.codfw.wmnet` to establish tunnel)
* 22:23 dpifke@deploy1002: Finished deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]] (duration: 00m 05s)
* 22:23 dpifke@deploy1002: Started deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]]
* 22:17 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 22:07 eileen: civicrm revision changed from {{Gerrit|018d3b19fe}} to {{Gerrit|9b5e0d015b}}, config revision is {{Gerrit|781d6a1b1f}}
* 21:34 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:25 robh@cumin1001: START - Cookbook sre.dns.netbox
* 21:10 robh@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 21:06 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 19:23 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 19:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:53 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:53 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=dagwiki --fix
* 18:47 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=frwiktionary --logwiki=metawiki 'TURK FASTER' 'ARTHUR MORGAN'
* 18:42 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'George Dum Fulton' 'George Fulton' # [[phab:T293403|T293403]]
* 18:41 urbanecm: UTC evening B&C done
* 18:40 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/extension.json: {{Gerrit|6da3523daaba85a4199721980c0a9c96b20697e7}}: Fix assessment quickview labels ([[phab:T292596|T292596]]) (duration: 01m 03s)
* 18:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c8dffefd0d095abe3709dcc962d5d24f27b55869}}: Create Salima namespace for dagwiki ([[phab:T289911|T289911]]) (duration: 01m 04s)
* 18:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bccd4bc45498db8628567574d0bb3a23f8fb378}}: Add $wgSitename and $wgMetaNamespace for kswiki and kswiktionary ([[phab:T289752|T289752]], [[phab:T289767|T289767]]) (duration: 01m 04s)
* 18:17 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|262e588b44f126fb9e1aa933a3ca59b191b42bd7}}: Enable Growth mentor dashboard backend on all wikis ([[phab:T278920|T278920]]) (duration: 01m 05s)
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|41baa8c41d64510986f009b9be2d70dad0915f8c}}: Add new mediawiki.skin_diff event logging stream ([[phab:T289622|T289622]]) (duration: 01m 05s)
* 18:03 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:02 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:01 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 17:54 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:52 rzl: repooled mw1452 (with `sudo pool` so no auto log from conftool)
* 17:47 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:45 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw1452.eqiad.wmnet
* 17:42 rzl: depool mw1452 for training
* 17:32 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:31 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:29 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 16:44 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:44 ryankemper: [[phab:T288231|T288231]] Manually killed dangling `pigz` / `nc` processes on `wdqs2008` (and `wdqs2005` implicitly). Should be in the right state to re-start the `data-transfer` cookbook from again
* 16:41 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:37 elukey: drop kubeflow-kfserving* docker images from deneb
* 16:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:34 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 16:33 moritzm: installing node-ansi-regex security updates
* 16:28 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere (duration: 02m 24s)
* 16:25 mbsantos@deploy1002: Started deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere
* 16:24 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 04s)
* 16:16 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad (duration: 02m 41s)
* 16:14 mbsantos@deploy1002: Started deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad
* 16:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:07 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:07 ryankemper: [[phab:T288231|T288231]] About to ctrl+c out of ongoing data transfer because puppet run following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/730794 restarted blazegraph; we'll manually disable updater and kick off the transfer again
* 16:04 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo run-puppet-agent --force`
* 15:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:54 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2008:~$ sudo depool`
* 15:52 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo depool`
* 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:13 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/VisualEditor/includes/VisualEditorHooks.php: Backport: [[gerrit:730729{{!}}Fix value of 'namespacesWithSubpages' in wgVisualEditorConfig (T293310)]] (duration: 01m 04s)
* 15:02 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 03s)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:59 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:53 kormat: upgrading orchestrator.wm.o to 3.2.6-1 [[phab:T275784|T275784]]
* 14:49 jbond@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=apt
* 14:43 jbond: migrate apt.w.o to a dns active/passiev discovery address (cc moritzm)
* 14:23 moritzm: installing krb5 security updates on KDCs
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:10 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 2/2) (duration: 01m 03s)
* 14:07 urbanecm: Run extensions/GrowthExperiments/initWikiConfig.php for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:07 urbanecm: Create growthexperiments DB tables for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 1/2) (duration: 01m 04s)
* 14:03 urbanecm@deploy1002: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|82d0a4bf45126ecba2cfcd1a0c2081a00f58dca3}}: Enable VE by default on 4 more wikis ([[phab:T290614|T290614]]) (duration: 01m 05s)
* 13:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 kormat: uploaded orchestrator 3.2.6-1 packages to apt.wm.o (buster) [[phab:T275784|T275784]]
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 12:44 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 12:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backupse
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backupse
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730746{{!}}Untangle “dispatch via jobs” settings in Wikibase.php (T291828)]] (no-op) (duration: 01m 04s)
* 12:12 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730725{{!}}Set wmgWikibaseDispatchViaJobsPruneChangesTableInJobEnabled for wikidatawiki (T291828)]] (no-op) (duration: 01m 05s)
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 10:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/: {{Gerrit|1f33fc3}}, {{Gerrit|e0ea1b8}}, {{Gerrit|cba2ac9}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 05s)
* 10:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|465b564}}, {{Gerrit|a8cc98b}}, {{Gerrit|6e95c48}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 06s)
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 09:20 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:20 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 volans@deploy1002: Finished deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1 (duration: 00m 50s)
* 09:17 volans@deploy1002: Started deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1
* 09:04 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 45s)
* 09:03 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 09:02 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 23s)
* 09:02 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 08:52 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:22 volans: rolling out debmonitor-client upgrade to 0.3.1 across the fleet
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:24 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:24 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift-ro,name=eqiad
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad
* 07:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:37 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:52 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:50 foks: changing user email for "Region of Peel Archives"
* 01:41 ejegg: updated payments-wiki from {{Gerrit|b329d2dea2}} to {{Gerrit|19d18c1852}}
* 01:35 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:31 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2020-08-20 ==
== 2021-10-13 ==
* 22:31 eileen: civicrm revision changed from {{Gerrit|27d5900f7d}} to {{Gerrit|ce28723709}}, config revision is {{Gerrit|706cf3c898}}
* 23:37 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:20 eileen: civicrm revision is {{Gerrit|27d5900f7d}}, config revision is {{Gerrit|706cf3c898}}
* 23:36 eileen: civicrm revision changed from {{Gerrit|946dfb6c5a}} to {{Gerrit|018d3b19fe}}, config revision is {{Gerrit|85277466ed}}
* 22:20 mutante: permanently shut down tungsten.eqiad.wmnet [[phab:T260395|T260395]] [[phab:T158837|T158837]] [[phab:T180761|T180761]] [[phab:T224549|T224549]]
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730575{{!}}Create an alias for the project namespace on kswiki (T291740)]] (duration: 01m 05s)
* 22:18 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 22:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:17 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 22:01 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Collection/includes/Specials/SpecialCollection.php: Backport: [[gerrit:730578{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 21:35 ejegg: updated fundraising CiviCRM from {{Gerrit|958a79f660}} to {{Gerrit|27d5900f7d}}
* 21:50 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection: Backport: [[gerrit:730577{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 20:53 cdanis: repool eqsin
* 21:47 foks: removing 8 files for legal compliance
* 20:37 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 21:03 foks: removing 2 files for legal compliance
* 20:36 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 21:00 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 20:34 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 20:50 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:25 cdanis: cdanis@cr2-eqsin> request vmhost reboot
* 20:49 brennen@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/Api/ApiGetBookCreatorBoxContent.php: Backport: [[gerrit:730574{{!}}Fall back to main page if given title is invalid (T293299)]] (duration: 01m 04s)
* 20:17 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 20:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:16 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 20:40 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:13 cdanis: cdanis@cr2-eqsin> request vmhost software add /var/tmp/junos-vmhost-install-mx-x86-64-18.2R3-S5.3.tgz
* 20:31 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 20:11 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 20:27 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1021.eqiad.wmnet with OS stretch
* 20:02 cdanis: depool eqsin for router upgrade
* 20:04 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:57 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97)
* 20:03 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:37 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 20:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:34 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 19:18 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 19:34 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 19:16 mutante: gitlab2001 - status before was that "gitlab-ctl status" showed components "gitlab-workhorse" and "postgres-exporter" as "down". this was either pre-broken or caused by the restore process. after manually 'gitlab-ctl start gitlab-workhorse' all of the components are in "run" and https://gitlab-replica.wikimedia.org is up ( [[phab:T285867|T285867]])
* 19:24 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 19:08 mutante: gitl1b2001 - started workhorse which was for some reason marked as down after restore command ran
* 19:17 bstorm@cumin1001: END (FAIL) - Cookbook wmcs.wikireplicas.add_wiki (exit_code=99)
* 19:08 mutante: [gitlab2001:~] $ sudo /usr/bin/gitlab-ctl start gitlab-workhorse
* 19:17 bstorm@cumin1001: START - Cookbook wmcs.wikireplicas.add_wiki
* 19:06 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 01m 03s)
* 19:17 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.5  refs [[phab:T257973|T257973]]
* 19:05 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 19:08 mutante: restarted apache on cont2001 for integration.wikimedia.org docroot change
* 19:02 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87879865c35edab3ead523027681146e00d6fc02}}: Create Translation namespace for viwikisource ([[phab:T290691|T290691]]) (duration: 01m 04s)
* 19:07 mutante: switching document root of integration.wikimedia.org to scap ([[phab:T149924|T149924]])
* 18:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|06fd0f225575448771cdba0d4e6bf36bb6715bc1}}: add extendedconfimed for autoreview group on ptwiki ([[phab:T292912|T292912]]) (duration: 01m 04s)
* 19:02 twentyafterfour: 1.36.0-wmf.5 has no known blockers and logspam is cleaned up, time to roll group2 wikis to wmf.5
* 18:37 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript initSiteStats.php --wiki=ptwiki --update
* 18:42 bstorm@cumin1001: END (FAIL) - Cookbook wmcs.wikireplicas.add_wiki (exit_code=99)
* 18:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=ptwiki extendedconfirmed
* 18:42 bstorm@cumin1001: START - Cookbook wmcs.wikireplicas.add_wiki
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bb2b388217aa91a39ed3684f87fdf7edb06fd81}}:  Set autoconfirmedextended and confirmedextended for ptwiki ([[phab:T292915|T292915]]) (duration: 01m 04s)
* 18:19 mutante: ores1004 - starting failed celery-ores-worker
* 18:16 urbanecm@deploy1002: Synchronized static/images/project-logos: {{Gerrit|694bc234ab5dbb9a2387a6129998d45a53ac0ab3}}: Remove an old dawiki temporary logo (duration: 01m 04s)
* 18:18 mutante: testreduce1001 - rt_client and vd_client now properly stopped by puppet [[phab:T257906|T257906]]
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|224e2a374b1cc6327e9d8c2bca576091ce4efc74}}: Add NS_MAIN back to wgExtraSignatureNamespaces for mediawikiwiki ([[phab:T291630|T291630]]) (duration: 01m 05s)
* 17:29 shdubsh: restart elasticsearch on logstash1012 (not 1020) -- high gc runtimes
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 17:28 shdubsh: restart elasticsearch on logstash1020 -- high gc runtimes
* 18:12 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 17:23 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 18:11 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|1b96f54a518620b0dc6a0ab63b402d0ea2c6bf70}}: Update logo for liwiktionary ([[phab:T291479|T291479]]) (duration: 01m 14s)
* 17:23 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97)
* 18:10 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 17:23 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 18:10 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 17:22 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 18:09 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 17:22 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 18:09 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 16:48 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 volans: uploaded debmonitor-client_0.3.1 to apt.wikimedia.org stretch-wikimedia,buster-wikimedia,bullseye-wikimedia
* 16:48 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 17:14 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|dd7a3314602ffddc5b917cccc71c917301639388}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 04s)
* 16:46 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 17:13 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|5c27154cf434bebc37f5e98e2ad1b5cea7cde1d4}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 15s)
* 16:45 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 16:57 mutante: stat1008 - short on disk space, mostly used in /tmp, high CPU usage by R proccess, sent a message about it to all shell users via wall
* 16:43 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 16:50 mutante: stat1008 - apt-get clean - freed 1.3 GB disk space - was alerting in Icinga because / was 97% full
* 16:40 _joe_: restarted apache2 on icinga1001
* 16:37 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:13 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 16:37 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 16:11 shdubsh: restart elasticsearch on logstash1011 -- long gc runs
* 16:23 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:10 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 16:23 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 16:08 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 15:29 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:02 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 15:28 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:55 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 15:26 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:06 oblivian@deploy1001: Finished deploy [ores/deploy@8540eec]: various configuration fixes (duration: 09m 03s)
* 15:26 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:57 oblivian@deploy1001: Started deploy [ores/deploy@8540eec]: various configuration fixes
* 15:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:53 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 15:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 13:53 oblivian@deploy1001: Finished deploy [ores/deploy@e860508]: switch everything to use envoy as a service proxy [[phab:T244843|T244843]] (duration: 14m 00s)
* 15:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:39 oblivian@deploy1001: Started deploy [ores/deploy@e860508]: switch everything to use envoy as a service proxy [[phab:T244843|T244843]]
* 15:12 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 13:26 oblivian@deploy1001: Finished deploy [ores/deploy@74677b6]: switch testwiki to use envoy as a service proxy [[phab:T244843|T244843]] (take 2) (duration: 11m 37s)
* 15:12 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:14 oblivian@deploy1001: Started deploy [ores/deploy@74677b6]: switch testwiki to use envoy as a service proxy [[phab:T244843|T244843]] (take 2)
* 15:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:11 oblivian@deploy1001: Finished deploy [ores/deploy@a208a0e]: switch testwiki to use envoy as a service proxy [[phab:T244843|T244843]] (duration: 11m 19s)
* 15:04 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 13:09 gehel: repool wdqs1007 - catched up on lag
* 15:03 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 13:00 oblivian@deploy1001: Started deploy [ores/deploy@a208a0e]: switch testwiki to use envoy as a service proxy [[phab:T244843|T244843]]
* 15:03 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 12:51 oblivian@deploy1001: Finished deploy [ores/deploy@a208a0e]: switch testwiki to use envoy as a service proxy [[phab:T244843|T244843]] (duration: 07m 03s)
* 15:01 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:44 oblivian@deploy1001: Started deploy [ores/deploy@a208a0e]: switch testwiki to use envoy as a service proxy [[phab:T244843|T244843]]
* 15:01 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:49 Lucas_WMDE: EU backport window done
* 15:01 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:44 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.4/extensions/AbuseFilter/includes/AbuseFilterHooks.php: {{Gerrit|d762e7b5526d91fe21e5980bc5e9f3be06a2f85c}}: Use $user param when filtering edits ([[phab:T258717|T258717]]) (duration: 01m 05s)
* 14:59 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:41 eileen: civicrm revision changed from {{Gerrit|6c9441a18e}} to {{Gerrit|958a79f660}}, config revision is {{Gerrit|706cf3c898}}
* 14:59 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:38 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.5/extensions/AbuseFilter/includes/AbuseFilterHooks.php: {{Gerrit|00da39b6913ac2eab600bbb61258472b60d2cbcb}}: Use $user param when filtering edits ([[phab:T258717|T258717]]) (duration: 01m 05s)
* 14:59 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:32 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.36.0-wmf.5/extensions/Wikibase/client/data-bridge/dist/: Backport: [[gerrit:621488{{!}}Don't try to load source maps in production (T260852)]] (duration: 01m 07s)
* 14:57 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:07 mlitn@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix testwikidata depicts id & CirrusSearchUserTesting config (duration: 01m 06s)
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:07 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=trwiki editor # [[phab:T260899|T260899]]
* 14:56 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:58 XioNoX: re-pool codfw - [[phab:T259621|T259621]]
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 10:53 XioNoX: un-drain cr1-codfw - [[phab:T259621|T259621]]
* 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:45 XioNoX: cr1-codfw> request chassis routing-engine master switch - [[phab:T259621|T259621]]
* 14:54 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 10:26 hashar: Restarted zuul-merger instances on contint1001 and contint2001
* 14:52 ema: repool cp4021, further testing can be performed on sretest1001 [[phab:T201317|T201317]]
* 10:24 hashar@deploy1001: Finished deploy [zuul/deploy@8a05b4d]: Support Gerrit replication events (duration: 00m 24s)
* 14:51 volans: restarting ircecho.service on alert1001 to get back icinga-wm without the underscore
* 10:24 hashar@deploy1001: Started deploy [zuul/deploy@8a05b4d]: Support Gerrit replication events
* 14:50 elukey: restart pybal on lvs1015 (low-traffic primary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 10:21 XioNoX: cr1-codfw> request chassis routing-engine master switch - [[phab:T259621|T259621]]
* 14:48 moritzm: reverted to clean package state on deneb
* 10:12 XioNoX: reboot cr1-codfw:re1 (backup) for upgrade - [[phab:T259621|T259621]]
* 14:44 elukey@puppetmaster1001: conftool action : ge; selector: cluster=ml_serve,service=inference
* 09:57 XioNoX: bump cr1-codfw OSPF metrics - [[phab:T259621|T259621]]
* 14:36 elukey: restart pybal on lvs1016 (low-traffic secondary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 09:51 XioNoX: enable transit/peering and re-set normal OSPF values on cr2-codfw - [[phab:T259621|T259621]]
* 14:27 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:41 XioNoX: cr2-codfw> request chassis routing-engine master switch - [[phab:T259621|T259621]]
* 14:27 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:36 eileen: civicrm revision changed from {{Gerrit|cf9fadbeed}} to {{Gerrit|6c9441a18e}}, config revision is {{Gerrit|706cf3c898}}
* 14:25 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:33 XioNoX: reboot cr2-codfw:re0 (backup) for upgrade - [[phab:T259621|T259621]]
* 14:25 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 XioNoX: cr2-codfw> request chassis routing-engine master switch - [[phab:T259621|T259621]]
* 14:21 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 kormat: stress-testing db2125 [[phab:T260670|T260670]]
* 14:21 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:08 XioNoX: reboot cr2-codfw:re1 (backup) for upgrade - [[phab:T259621|T259621]]
* 14:20 moritzm: temporarily downgrade sphinx packages on deneb to 1.7.9-1~bpo9+1 to build a Ganeti 2.16 stretch backport with delicate toolchain needs
* 09:03 kormat@cumin1001: dbctl commit (dc=all): 'Repool db2125 after host failure [[phab:T260670|T260670]]', diff saved to https://phabricator.wikimedia.org/P12303 and previous config saved to /var/cache/conftool/dbconfig/20200820-090313-kormat.json
* 14:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 kormat: removing /usr/bin/check_mariadb.py from all db hosts [[phab:T259516|T259516]]
* 14:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 XioNoX: disable transit/peering on cr2-codfw - [[phab:T259621|T259621]]
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:48 XioNoX: bump cr2-codfw OSPF metrics - [[phab:T259621|T259621]]
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:44 jynus: running analyze table on db1115's tendril.global_status_log, may case some stalls on tendril/dbtree [[phab:T260876|T260876]]
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:41 XioNoX: depool codfw for routers upgrade - [[phab:T259621|T259621]]
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:31 XioNoX: enable transit/peering on cr3-knams - [[phab:T259621|T259621]]
* 13:59 XioNoX: push prep-work for anycast tuning in ulsfo - [[phab:T288843|T288843]]
* 08:21 XioNoX: reboot cr3-knams for upgrade - [[phab:T259621|T259621]]
* 13:38 jayme: imported helm-diff_3.1.3-2 to buster-wikimedia (https://gerrit.wikimedia.org/r/c/operations/debs/helm-diff/+/730509)
* 08:07 XioNoX: disable transit/peering on cr3-knams - [[phab:T259621|T259621]]
* 13:37 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 07:39 hashar: contint2001: restarted zuul
* 13:34 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 07:29 hashar: contint1001: restarted zuul-merger
* 12:13 Lucas_WMDE: UTC morning backport+config window done
* 07:29 hashar@deploy1001: Finished deploy [zuul/deploy@5989ed0]: Upgrade gear from 0.7.0 to 1.15.1+wmf1 - [[phab:T258630|T258630]] (duration: 00m 13s)
* 12:12 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/includes: Backport: [[gerrit:730370{{!}}Add Link: Do not log "no suggestion found" errors in production log (T291251)]] (duration: 01m 04s)
* 07:28 hashar@deploy1001: Started deploy [zuul/deploy@5989ed0]: Upgrade gear from 0.7.0 to 1.15.1+wmf1 - [[phab:T258630|T258630]]
* 12:11 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]'  # after applying 730512 at mwmaint1002 to workaround [[phab:T293219|T293219]] # [[phab:T255037|T255037]]
* 01:54 ejegg: re-enabled fundraising scheduled jobs
* 12:11 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/modules: Backport: [[gerrit:730371{{!}}Suggested Edits: Update local config.presets when topics/difficulty presets change (T292536)]] (duration: 01m 07s)
* 00:51 mutante: ms-be1039 - started failed ferm service
* 11:56 urbanecm@deploy1002: Synchronized wmf-config/config/itwiki.yaml: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]) (duration: 01m 04s)
* 00:35 ejegg: stopped fundraising scheduled jobs
* 11:55 urbanecm: mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=mediawikiwiki "Growth/Communities/How to introduce yourself as a mentor" "Growth/Communities/How to configure the mentors' list" "Martin Urbanec (WMF)" --reason '[[:phab:T293184]]' # [[phab:T293184|T293184]]
* 00:27 eileen: civicrm revision changed from {{Gerrit|c442a09153}} to {{Gerrit|cf9fadbeed}}, config revision is {{Gerrit|3cdffd4fc2}}
* 11:55 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 2/3) (duration: 01m 04s)
* 11:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 1/3) (duration: 01m 05s)
* 11:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]' # [[phab:T255037|T255037]]
* 11:49 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=itwiki growthexperiments # [[phab:T255037|T255037]]
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:730380{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 07s)
* 11:43 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Wikibase/repo/: Backport: [[gerrit:730385{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 18s)
* 11:33 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 11:19 ema: pool cp4021 after reimage [[phab:T201317|T201317]]
* 11:05 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 10:15 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:09 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:728490{{!}}Add more types of QuickSurveys on beta cluster (T292459)]] (duration: 01m 53s)
* 10:06 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 09:22 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 08:35 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:21 elukey: run kafka preferred-replica-election on kafka-main1001 to rebalance partition leaders - [[phab:T288825|T288825]]
* 08:15 godog: bounce graphite on graphite1004 to apply new config
* 07:33 elukey: increase kafka topic partition size of the top 4 high traffic topics of main-eqiad as described in https://phabricator.wikimedia.org/T288825#7422726
* 07:13 XioNoX: provision new eqsin-ulsfo link - [[phab:T273308|T273308]]
* 06:26 elukey: `kafka topics --alter --topic <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change --partitions 3` on kafka-main2001 - [[phab:T288825|T288825]]
* 00:38 ejegg: updated payments-wiki from {{Gerrit|030b11da1a}} to {{Gerrit|b329d2dea2}}


== 2020-08-19 ==
== 2021-10-12 ==
* 23:20 Urbanecm: Evening B&C window closed
* 23:48 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a80899948c26ca36b970b80fbad07600fe4ce92c}}: Enable VisualEditor in namespaces Draft and Wikiproject on hywiki ([[phab:T260825|T260825]]) (duration: 01m 05s)
* 23:16 urbanecm: UTC late B&C window done
* 22:41 eileen: civicrm revision changed from {{Gerrit|34f95a3311}} to {{Gerrit|c442a09153}}, config revision is {{Gerrit|3cdffd4fc2}}
* 23:15 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 01m 04s)
* 21:27 eileen: civicrm revision changed from {{Gerrit|154519cc1f}} to {{Gerrit|34f95a3311}}, config revision is {{Gerrit|3cdffd4fc2}}
* 23:12 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 02m 09s)
* 21:17 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 23:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:17 cdanis@cumin1001: START - Cookbook sre.network.cf
* 22:53 urbanecm: [urbanecm@labweb1001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=labswiki Jamesmontalvo3 #
* 20:39 dpifke@deploy1001: Finished deploy [performance/arc-lamp@2ef1af7]: Deploy fixes for notifications and OOM prevention ([[phab:T259167|T259167]]) (duration: 00m 06s)
* 22:51 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:39 dpifke@deploy1001: Started deploy [performance/arc-lamp@2ef1af7]: Deploy fixes for notifications and OOM prevention ([[phab:T259167|T259167]])
* 20:21 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:43 ebernhardson: restart mjolnir-kafka-bulk-daemon on search-loader2001 with debug logging
* 19:31 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.1 (duration: 04m 02s)
* 19:20 mutante: testreduce1001 - re-enabled puppet, confirmed parsoid-rt service was now stopped properly by puppet while it runs as before on scandium, the previous parsoid-testing host. switching it over is now a Hiera one-liner. ([[phab:T257906|T257906]])
* 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:15 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.36.0-wmf.5 refs [[phab:T257973|T257973]] (duration: 01m 04s)
* 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:14 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.5 refs [[phab:T257973|T257973]]
* 19:02 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 19:02 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|60af096b80a8ef7bc94ec40ce203fd27b0c97f26}}: Add autopatrolled group at arzwiki ([[phab:T260761|T260761]]) (duration: 01m 04s)
* 18:47 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.4 refs [[phab:T281168|T281168]] (duration: 45m 36s)
* 18:52 mutante: testreduce1001 - disable puppet; stop parsoid-rt service
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 18:47 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|924a03bd624d6750a7e776e09713056cc45e5cc5}}: Add clinton.presidentiallibraries.us to the wgCopyUploadsDomains allowlist of Wikimedia Commons ([[phab:T259927|T259927]]) (duration: 01m 04s)
* 18:01 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.4 refs [[phab:T281168|T281168]]
* 18:45 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|83b34e1bd1ed804a70f67e089580e082f89e2a0f}}: ClosedWikiProvider: Use testUserForCreation rather than testForAuthentication ([[phab:T258695|T258695]]) (duration: 01m 04s)
* 17:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:39 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|95d45f6e002df78d4860a711042d77a6b0bdecb9}}: Dont index Draft (118) and Draft talk (119) on hywiki ([[phab:T260804|T260804]]) (duration: 01m 04s)
* 17:56 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/CentralNotice: Backport: [[gerrit:730141]] (duration: 00m 59s)
* 18:32 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|803cb1a0d2c8cc6df8e4e88ab3c4d27eb71d01b3}}: Update taglines for various projects ([[phab:T258552|T258552]]) (duration: 01m 04s)
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:30 urbanecm@deploy1001: Synchronized static/images/mobile/copyright/: {{Gerrit|803cb1a0d2c8cc6df8e4e88ab3c4d27eb71d01b3}}: Update taglines for various projects ([[phab:T258552|T258552]]) (duration: 01m 06s)
* 17:46 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 18:25 mutante: rebooting webperf1002 VM on ganeti level (outside OS) to upgrade rom 8 to 16GB RAM ([[phab:T260192|T260192]])
* 17:43 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 18:24 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:41 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 18:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bb4aa44b0bd5b2b33d190d3af81e038e5fc55e3f}}: Configure namespaces on commons to include categories ([[phab:T198716|T198716]]) (duration: 01m 04s)
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:21 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b9043331c1c1b352256cffd471b9ff128806607c}}: Update project wordmarks ([[phab:T254788|T254788]]; sync 2/2) (duration: 01m 04s)
* 17:32 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SyntaxHighlight_GeSHi/includes/ResourceLoaderPygmentsModule.php: Backport: [[gerrit:730233{{!}}Include generated styles before Mediawiki overrides (T292736)]] (duration: 00m 57s)
* 18:19 urbanecm@deploy1001: Synchronized static/images/mobile/copyright/: {{Gerrit|b9043331c1c1b352256cffd471b9ff128806607c}}: Update project wordmarks ([[phab:T254788|T254788]]; sync 1/2) (duration: 01m 06s)
* 17:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 mutante: rebooting webperf2002 VM on ganeti level (outside OS) to upgrade rom 8 to 16GB RAM ([[phab:T260192|T260192]])
* 17:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a6f8354e7599a5e92bea060807065f5b42c540e5}}: Enable $wgMFNoindexPages for all wikis ([[phab:T255458|T255458]]) (duration: 01m 07s)
* 17:23 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730236{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 18:13 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:13 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:13 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 17:16 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730235{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 17:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 17:12 moritzm: installing rsync bugfix updates
* 17:38 mutante: decom'ing releases2001.codfw.wmnet (
* 17:09 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:37 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 16:56 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:39 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 16:55 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 16:37 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 16:53 moritzm: failed over ganeti master for test cluster to ganeti2025
* 16:32 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:50 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 16:48 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:41 rzl: finished exercising the switchdc cookbooks with --live-test for now, all changes reverted including re-enabling puppet on cumin1001
* 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:38 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
* 16:30 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts testvm2009.codfw.wmnet
* 15:37 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 16:30 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:34 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0)
* 16:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:34 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 16:26 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 15:33 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99)
* 16:26 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes: Backport: [[gerrit:730226{{!}}Pre-format comments for non-local files too (T292570)]] (duration: 01m 15s)
* 15:33 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 16:17 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:31 jbond42: update java.security https://gerrit.wikimedia.org/r/c/operations/puppet/+/593467
* 16:16 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2009.codfw.wmnet
* 15:30 oblivian@cumin1001: conftool action : set/ttl=300; selector: dnsdisc=api-rw
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:26 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99)
* 16:10 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:26 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 16:09 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 15:22 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99)
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:22 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 16:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730231{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 00m 57s)
* 15:18 godog: prometheus codfw lvextend --resizefs --size +80G /dev/mapper/vg--ssd-prometheus--ops
* 16:00 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:17 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:17 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 15:58 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730230{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 02m 13s)
* 15:16 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99)
* 15:57 volans@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2009.codfw.wmnet
* 15:16 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:14 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0)
* 15:51 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:14 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 15:49 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:08 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99)
* 15:48 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 15:08 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 15:48 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:06 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:41 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for analytics1069.eqiad.wmnet
* 15:04 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 15:41 btullis@cumin1001: START - Cookbook sre.hosts.remove-downtime for analytics1069.eqiad.wmnet
* 14:50 rzl@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99)
* 15:02 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:50 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 14:50 volans@cumin2002: START - Cookbook sre.dns.netbox
* 14:50 rzl: running the switchdc cookbooks with --live-test, simulating a switch to eqiad where we're already running, no production impact is expected
* 13:49 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 14:47 rzl@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
* 14:47 rzl@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 14:41 rzl: disable puppet on cumin1001 for switchdc testing
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:35 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 13:14 godog: add 50G to prometheus/k8s in eqiad
* 14:27 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 13:13 otto@deploy1002: Synchronized wmf-config/CommonSettings.php: Enable x_client_ip_forwarding_enabled for eventgate-analytics and eventgate-analytics-external - [[phab:T288853|T288853]] (duration: 00m 56s)
* 13:38 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 13:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:34 gehel: depooling wdqs1007 and restarting blazegraph
* 13:11 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:29 _joe_: depooling and disabling puppet on restbase1024 for further investigation
* 13:05 volans: upgraed spicerack to 1.0.5 on cumin hosts
* 13:27 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 12:25 volans: uploaded spicerack_1.0.5 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 13:26 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 12:15 elukey: `kafka topics --alter --topic codfw.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 13:25 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 12:15 elukey: `kafka topics --alter --topic eqiad.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 13:10 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:10 elukey: `kafka topics --alter --topic codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 13:03 _joe_: building and uploading fluent-bit, ratelimit images
* 12:09 elukey: `kafka topics --alter --topic eqiad.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 13:01 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 11:58 elukey: `kafka topics --alter --topic codfw.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 12:57 _joe_: building a new version of the base docker images
* 11:49 elukey: `kafka topics --alter --topic eqiad.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:29 awight: EU bacon finished
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:28 effie: restart mwdebug* servers
* 11:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:621227{{!}}Fix typos in flaggedrevs comments ()]] (duration: 01m 19s)
* 11:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:22 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 08:48 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:34 urbanecm: UTC morning B&C window done
* 08:48 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:43 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 11:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|860ea0944d6dc1e6b5061eb84eec378eb5ac8441}}: Remove NS_MAIN from wgExtraSignatureNamespaces on most special wikis ([[phab:T291630|T291630]]) (duration: 00m 57s)
* 08:43 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 11:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:36 XioNoX: update firewall policies on pfw - [[phab:T260585|T260585]]
* 11:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:35 jayme: running puppet on A:all-mw-eqiad
* 11:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:23 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:14 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 08:23 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:20 godog: switch grafana.w.o to grafana 7 in codfw - [[phab:T259143|T259143]]
* 11:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:19 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 11:06 urbanecm@deploy1002: Synchronized w/static.php: {{Gerrit|e77ae17efb34723598fc69e87109944384df442a}}: static.php: correctly report a bad request (duration: 00m 57s)
* 08:18 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 11:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2003.codfw.wmnet
* 08:14 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2003.codfw.wmnet
* 08:06 jayme: running puppet on A:all-mw-eqiad
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 07:46 godog: upgrade to grafana 7 on cloudmetrics hosts - [[phab:T259143|T259143]]
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 07:15 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 10:30 ema: apply https://gerrit.wikimedia.org/r/726912 to all A:cp nodes [[phab:T288106|T288106]]
* 07:10 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:24 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 06:39 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:23 ema: depool/repool ats-be on cp4028 to verify updates to /etc/varnish/directors.frontend.vcl on cp4027 keep on working fine [[phab:T288106|T288106]]
* 06:13 eileen: tools revision changed from {{Gerrit|b4ebd1e564}} to {{Gerrit|0b9d971bc4}}
* 10:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 06:07 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:22 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 06:04 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 10:16 ema: cp4027: enable and run puppet to test https://gerrit.wikimedia.org/r/726912 [[phab:T288106|T288106]]
* 06:03 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:12 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti2025.codfw.wmnet with OS buster
* 06:00 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:16 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17456 and previous config saved to /var/cache/conftool/dbconfig/20211012-091614-kormat.json
* 05:55 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:01 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17455 and previous config saved to /var/cache/conftool/dbconfig/20211012-090111-kormat.json
* 05:53 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:46 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17454 and previous config saved to /var/cache/conftool/dbconfig/20211012-084607-kormat.json
* 05:47 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 08:31 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17453 and previous config saved to /var/cache/conftool/dbconfig/20211012-083103-kormat.json
* 05:37 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:31 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:42 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:58 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|17dc3aa}}, {{Gerrit|e0ca905}}, {{Gerrit|c0f4f4e}}: GrowthExperiments backports ([[phab:T292224|T292224]], [[phab:T290609|T290609]], [[phab:T290609|T290609]]) (duration: 00m 59s)
* 03:40 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 07:40 elukey: run kafka preferred-replica-election on kafka-main2001 to rebalance partition leaders after the last topic moves - [[phab:T288825|T288825]]
* 02:53 cstone: civicrm revision changed from {{Gerrit|f5469d0a4c}} to {{Gerrit|154519cc1f}}
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 02:00 wkandek@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 07:22 moritzm: installing RT security updates
* 01:05 dpifke@deploy1001: Synchronized wmf-config/profiler.php: Deploying https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/620139 (duration: 01m 18s)
* 04:43 eileen: civicrm revision changed from {{Gerrit|96090e4bd2}} to {{Gerrit|946dfb6c5a}}, config revision is {{Gerrit|85277466ed}}
* 00:49 dpifke@deploy1001: Synchronized wmf-config/ProductionServices.php: Disabling old XHGui backend ([[phab:T180761|T180761]]) (duration: 05m 13s)
* 03:56 kart_: cxserver: Remove Matxin Key from Production ([[phab:T292635|T292635]])
* 00:15 wkandek@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 03:54 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:48 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:45 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 02:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 eileen: civicrm revision changed from {{Gerrit|598b59b0ee}} to {{Gerrit|96090e4bd2}}, config revision is {{Gerrit|85277466ed}}


== 2020-08-18 ==
== 2021-10-11 ==
* 23:45 catrope@deploy1001: Synchronized php-1.36.0-wmf.5/extensions/GrowthExperiments: Only fetch task card data for users in variant C/D ([[phab:T258021|T258021]]) (duration: 01m 05s)
* 21:25 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 23:44 catrope@deploy1001: Synchronized php-1.36.0-wmf.4/extensions/GrowthExperiments: Only fetch task card data for users in variant C/D ([[phab:T258021|T258021]]) (duration: 01m 06s)
* 20:58 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 23:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1301.eqiad.wmnet
* 17:08 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 23:34 Urbanecm: Run scap pull at mw1301
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 23:33 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable static maps on testwiki, disable them on test2wiki (duration: 03m 22s)
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 23:32 mutante: rebooting mw1301 via mgmt
* 15:31 jgleeson: smashpig updated from {{Gerrit|3607b16f83}} to {{Gerrit|dd3a81c7c2}}
* 23:22 mutante: killed reboot-cluster on cumin1001
* 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 23:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ac34f7274823e40d0c79752eb5ffe74c76856d04}}: Enable subpages in NS:0 in techconductwiki ([[phab:T260350|T260350]]) (duration: 05m 14s)
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 23:04 wkandek@cumin1001: conftool action : set/pooled=yes; selector: name=mw1300.eqiad.wmnet
* 14:36 Emperor: start restoring weight to ms-be2045 [[phab:T290881|T290881]]
* 22:47 wkandek@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 13:42 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 22:41 wkandek@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=1)
* 12:53 moritzm: install apache security updates on buster
* 22:09 mholloway-shell@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 12:49 topranks: Setting up BGP peering to AS12552 (GlobalConnect Group) at AMS-IX on cr2-esams
* 22:07 mholloway-shell@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 12:45 ema: cp4027: upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 22:06 mholloway-shell@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 12:04 moritzm: install apache security updates on bullseye
* 21:39 mholloway-shell@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 10:23 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 21:37 mholloway-shell@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 09:50 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 21:34 mholloway-shell@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 09:45 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 21:24 wkandek@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 09:37 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 21:03 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:13 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 21:01 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 09:09 elukey: force kafka preferred-replica-election on kafka-main2001 after the first 50 topic partitions moves - [[phab:T288825|T288825]]
* 20:27 hashar: https://releases-jenkins.wikimedia.org/ changed agent from releases1001 to releases1002
* 09:05 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 20:14 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.5  refs [[phab:T257973|T257973]]
* 09:01 godog: bounce swift-object-replicator on ms-be2036
* 20:11 mutante: running puppet on cp-ats-ulsfo and switching releases-jenkins backend
* 08:52 godog: bounce statsite on graphite1004 to apply unit config changes
* 20:07 twentyafterfour@deploy1001: Finished scap: testwikis wikis to 1.36.0-wmf.5  refs [[phab:T257973|T257973]] (duration: 53m 12s)
* 08:48 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 20:00 mutante: releases1001 rm /etc/rsync.d/frag* & run puppet
* 08:41 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet
* 19:54 mutante: rsyncing /var/lib/jenkins from releases1001 to releases1002/2002 with --delete  [[phab:T256164|T256164]]
* 08:38 moritzm: updated buster d-i image for Bullseye 11.1 point release [[phab:T292844|T292844]]
* 19:47 ejegg: updated payments-wiki from {{Gerrit|a7ee1790e0}} to {{Gerrit|ef7ebd08cb}}
* 08:38 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 19:44 hashar: Deleting old jobs from https://releases-jenkins.wikimedia.org/  # [[phab:T256164|T256164]]
* 08:26 godog: swift eqiad-prod: final weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 19:41 hashar: releases1001: deleting old legacy mediawiki snapshots under /var/lib/jenkins/<nowiki>{</nowiki>REL1_27,REL1_29,REL1_30<nowiki>}</nowiki>  # [[phab:T256164|T256164]]
* 08:25 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 19:14 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.36.0-wmf.5  refs [[phab:T257973|T257973]]
* 08:24 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet
* 19:13 twentyafterfour: Promote testwikis from 1.36.0-wmf.4 to 1.36.0-wmf.5 refs [[phab:T257973|T257973]]
* 08:06 godog: bounce uwsgi on graphite hosts to bump request size limit - [[phab:T292877|T292877]]
* 17:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 07:58 volans: migrating physical hosts DHCP to the new reimage process - [[phab:T269855|T269855]]
* 16:12 oblivian@cumin1001: conftool action : set/pooled=yes; selector: name=mw14(09{{!}}11{{!}}13).*
* 07:57 elukey: start kafka topics rebalancing for main-codfw (long running maintenance) - [[phab:T288825|T288825]]
* 16:03 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=1)
* 15:36 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 15:30 jayme@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 15:02 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 14:56 papaul: replacing msw-c1,c2 and c4
* 14:55 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1104', diff saved to https://phabricator.wikimedia.org/P12293 and previous config saved to /var/cache/conftool/dbconfig/20200818-145337-marostegui.json
* 14:48 oblivian@cumin1001: conftool action : set/pooled=yes; selector: name=mw13(55{{!}}64{{!}}65).*
* 14:46 XioNoX: move v4 HE on cr3-ulsfo from peering to transit bgp group
* 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P12292 and previous config saved to /var/cache/conftool/dbconfig/20200818-144415-marostegui.json
* 14:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P12291 and previous config saved to /var/cache/conftool/dbconfig/20200818-143758-marostegui.json
* 14:35 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 14:29 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P12290 and previous config saved to /var/cache/conftool/dbconfig/20200818-142937-marostegui.json
* 14:28 marostegui: Stop MYSQL on db2125 for on-site maintenance - [[phab:T260670|T260670]]
* 13:54 marostegui: Revoke DELETE and CREATE from xhgui user on m2 [[phab:T260640|T260640]]
* 13:53 XioNoX: bump Zayo v4 BGP session in eqiad
* 13:49 XioNoX: move v4 HE on cr2-eqord from peering to transit bgp group
* 13:37 XioNoX: move v4 cr1-eqiad from peering to transit bgp group
* 13:04 kormat: disabling puppet on all db machines [[phab:T259516|T259516]]
* 12:57 _joe_: rebooting appservers in eqiad, 3 at a time
* 12:57 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 12:37 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 12:34 kormat: deploying wmfmariadbpy 0.4
* 12:21 jayme@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 11:53 XioNoX: add new icinga hosts to mr policies - [[phab:T260533|T260533]]
* 11:40 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 11:36 Lucas_WMDE: EU backport&config done
* 11:33 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:620888{{!}}Add Wikisource wordmark for trwikisource (T260658)]], part 2 (duration: 00m 55s)
* 11:32 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf '%s\n' 'https://en.wikipedia.org/static/images/mobile/copyright/wikisource-wordmark-tr.svg' {{!}} mwscript purgeList.php # [[phab:T260658|T260658]]
* 11:32 lucaswerkmeister-wmde@deploy1001: Synchronized static/images/mobile/copyright/wikisource-wordmark-tr.svg: Config: [[gerrit:620888{{!}}Add Wikisource wordmark for trwikisource (T260658)]], part 1 (duration: 00m 55s)
* 11:24 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:595543{{!}}Enable Data Bridge on Catalan Wikipedia (T232584)]] (duration: 01m 01s)
* 11:06 jbond42: deploy net-snmp update to buster
* 10:56 oblivian@cumin1001: conftool action : set/pooled=yes; selector: cluster=api_appserver,dc=codfw,name=mw229.*
* 10:55 oblivian@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97)
* 10:54 marostegui: Reboot db2125 after running a full upgrade - [[phab:T260670|T260670]]
* 10:46 marostegui: Powercycle db2125 from the idrac [[phab:T260670|T260670]]
* 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2125 - host down [[phab:T260670|T260670]]', diff saved to https://phabricator.wikimedia.org/P12288 and previous config saved to /var/cache/conftool/dbconfig/20200818-100718-marostegui.json
* 09:45 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 09:43 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet
* 09:40 oblivian@cumin1001: conftool action : set/pooled=yes; selector: cluster=api_appserver,dc=codfw,name=mw214[234].*
* 09:40 oblivian@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97)
* 09:35 kart_: Update cxserver to 2020-08-17-090424-production ([[phab:T259980|T259980]])
* 09:32 kartik@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 09:29 kartik@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 09:28 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 09:28 oblivian@cumin1001: conftool action : set/pooled=yes; selector: cluster=api_appserver,dc=codfw,name=mw214[02].*
* 09:26 volans: upgraded spicerack to v0.0.39 on cumin hosts
* 09:25 kartik@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 09:21 volans: uploaded spicerack_0.0.39-1+deb10u1 to apt.wikimedia.org buster-wikimedia
* 09:05 hashar: Restarting CI Jenkins
* 08:44 vgutierrez: restart ats-tls on cp5006
* 08:24 oblivian@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97)
* 08:17 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 08:16 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 08:10 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1089', diff saved to https://phabricator.wikimedia.org/P12284 and previous config saved to /var/cache/conftool/dbconfig/20200818-080256-marostegui.json
* 07:58 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 07:53 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 07:45 godog: VictorOps ack'd incidents will re-trigger after 24h if not resolved - [[phab:T259465|T259465]]
* 07:44 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=1)
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P12283 and previous config saved to /var/cache/conftool/dbconfig/20200818-074325-marostegui.json
* 07:42 _joe_: performing rolling reboot of all codfw api servers
* 07:38 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P12282 and previous config saved to /var/cache/conftool/dbconfig/20200818-072349-marostegui.json
* 07:19 oblivian@cumin1001: conftool action : set/pooled=yes; selector: name=mw213[5-9].codfw.wmnet
* 07:16 jynus: update rest of phabricator passwords [[phab:T250361|T250361]]
* 07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P12281 and previous config saved to /var/cache/conftool/dbconfig/20200818-071121-marostegui.json
* 07:08 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 07:07 godog: prometheus eqiad: add 100G to prometheus/global
* 07:01 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 07:01 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 07:01 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 06:53 twentyafterfour: phabricator maintenance successful
* 06:48 jynus: deploy another password change to phabricator service (potentially disruptive) [[phab:T250361|T250361]]
* 06:41 XioNoX: add cloudflare PNI IPs in eqiad - [[phab:T259036|T259036]]
* 06:21 jynus: deploy password change to phabricator service [[phab:T146055|T146055]]
* 06:06 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 06:01 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 05:52 _joe_: running puppet on mc1020 [[phab:T260622|T260622]]
* 05:02 twentyafterfour: phabricator appears to be fully functional
* 05:01 twentyafterfour: phabricator read-only ended
* 05:00 twentyafterfour: phabricator is now read-only
* 05:00 marostegui: Failover m3 (phabricator) database master from db1128 to db1132 - [[phab:T259589|T259589]]
* 04:32 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1088', diff saved to https://phabricator.wikimedia.org/P12279 and previous config saved to /var/cache/conftool/dbconfig/20200818-043241-marostegui.json
* 01:54 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1376.eqiad.wmnet
* 01:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1343.eqiad.wmnet
* 01:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1344.eqiad.wmnet
* 01:04 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 00:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 00:48 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:45 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 00:39 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:39 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1341.eqiad.wmnet
* 00:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1340.eqiad.wmnet
* 00:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1339.eqiad.wmnet
* 00:31 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 00:24 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 00:15 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:13 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 00:07 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:07 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1315.eqiad.wmnet
* 00:06 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)


== 2020-08-17 ==
== 2021-10-09 ==
* 23:59 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 05:01 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 04:28 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:49 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 01:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 23:49 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1313.eqiad.wmnet
* 00:46 mutante: ms-be2045 - started systemd-timedated which had been killed by something
* 23:49 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:28 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 23:49 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 00:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.force-unfreeze (exit_code=99)
* 23:47 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 00:23 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.force-unfreeze
* 23:41 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 00:13 ryankemper: [[phab:T292814|T292814]] Write queue stuck at 133 events in partition 1 of topic `codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite`, will try again at another time
* 23:40 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1312.eqiad.wmnet
* 00:12 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 23:40 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 23:30 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 23:28 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1297.eqiad.wmnet
* 23:26 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 23:25 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 23:23 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 23:11 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 23:11 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 23:11 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 23:10 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 23:10 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1288.eqiad.wmnet
* 23:00 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 22:55 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 22:47 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:45 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 22:43 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:43 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1286.eqiad.wmnet
* 22:43 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 22:41 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 22:37 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1285.eqiad.wmnet
* 22:33 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 22:26 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:26 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 22:26 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1284.eqiad.wmnet
* 22:25 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 22:23 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 22:21 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 22:21 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 22:17 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:15 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 22:09 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:08 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1282.eqiad.wmnet
* 22:04 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 22:02 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 22:02 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1281.eqiad.wmnet
* 22:00 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 21:57 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki=aawiktionary --site-group wiktionary ([[phab:T259360|T259360]])
* 21:56 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 21:56 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 21:53 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add api-gateway.request stream config [[phab:T259736|T259736]], one host timed out (duration: 00m 55s)
* 21:49 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:48 ppchelko@deploy1001: sync-file aborted: Add api-gateway.request stream config [[phab:T259736|T259736]] (duration: 05m 01s)
* 21:47 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1278.eqiad.wmnet
* 21:46 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 21:43 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:42 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1279.eqiad.wmnet
* 21:42 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 21:38 sbassett@deploy1001: Synchronized private/PrivateSettings.php: Further mitigations for [[phab:T257687|T257687]] (duration: 00m 57s)
* 21:38 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 21:36 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 21:34 effie: blocking temporarily traffic to mc1020
* 21:23 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 21:20 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:19 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1276.eqiad.wmnet
* 21:12 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2240.codfw.wmnet
* 21:08 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:04 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 20:54 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 20:47 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 20:38 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 20:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 20:27 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 20:24 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 20:20 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 20:19 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 20:19 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 20:02 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 19:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 19:39 dzahn@cumin1001: START - Cookbook sre.hosts.reboot-single
* 19:30 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 19:28 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 19:22 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 19:01 ppchelko@deploy1001: Finished deploy [restbase/deploy@7f16bad]: Add thankyouwiki [[phab:T259002|T259002]], take 3 (duration: 02m 57s)
* 18:58 ppchelko@deploy1001: Started deploy [restbase/deploy@7f16bad]: Add thankyouwiki [[phab:T259002|T259002]], take 3
* 18:58 ppchelko@deploy1001: Finished deploy [restbase/deploy@7f16bad]: Add thankyouwiki [[phab:T259002|T259002]], take 2 (duration: 11m 19s)
* 18:46 ppchelko@deploy1001: Started deploy [restbase/deploy@7f16bad]: Add thankyouwiki [[phab:T259002|T259002]], take 2
* 18:46 ppchelko@deploy1001: Finished deploy [restbase/deploy@7f16bad]: Add thankyouwiki [[phab:T259002|T259002]] (duration: 131m 17s)
* 18:46 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 18:43 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 18:39 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 18:32 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 18:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|808c17d28c5ebf5ed75f70c224d66129eb2edcd8}}: Change logo for lldwiki to match the requested one ([[phab:T259432|T259432]]) (duration: 00m 56s)
* 18:04 urbanecm@deploy1001: Synchronized static/images/project-logos/: {{Gerrit|67e8f886cd1a9cd2b63ed69761bec6c52889a5b6}}: Add logo files for lldwiki ([[phab:T259432|T259432]]) (duration: 00m 56s)
* 17:17 cdanis@cumin1001: conftool action : set/pooled=yes; selector: name=mw1359.*
* 17:06 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 17:04 oblivian@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,dc=codfw,name=mw2246.codfw.wmnet
* 17:01 oblivian@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97)
* 16:36 jynus: restart backup2001, backup1001 one after the other
* 16:35 ppchelko@deploy1001: Started deploy [restbase/deploy@7f16bad]: Add thankyouwiki [[phab:T259002|T259002]]
* 16:31 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 16:27 urbanecm@deploy1001: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 00m 56s)
* 16:23 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingSchemas - remove unneeded override for SearchSatisfaction - [[phab:T259163|T259163]] (duration: 00m 56s)
* 16:22 cdanis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:21 oblivian@cumin1001: conftool action : set/pooled=inactive; selector: cluster=jobrunner,dc=codfw,name=mw2250.codfw.wmnet
* 16:20 oblivian@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,dc=codfw
* 16:20 cdanis@cumin1001: START - Cookbook sre.hosts.downtime
* 16:14 cdanis@cumin1001: conftool action : set/pooled=inactive; selector: name=mw1359.*
* 16:12 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 16:12 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 15:44 ppchelko@deploy1001: Finished deploy [restbase/deploy@ddcecce]: [[phab:T257943|T257943]] [[phab:T260556|T260556]] [[phab:T253478|T253478]] [[phab:T254490|T254490]] [[phab:T259054|T259054]]. take 3. feeds timed out (duration: 01m 31s)
* 15:43 ppchelko@deploy1001: Started deploy [restbase/deploy@ddcecce]: [[phab:T257943|T257943]] [[phab:T260556|T260556]] [[phab:T253478|T253478]] [[phab:T254490|T254490]] [[phab:T259054|T259054]]. take 3. feeds timed out
* 15:43 ppchelko@deploy1001: Finished deploy [restbase/deploy@ddcecce]: [[phab:T257943|T257943]] [[phab:T260556|T260556]] [[phab:T253478|T253478]] [[phab:T254490|T254490]] [[phab:T259054|T259054]]. take 2. feeds timed out (duration: 20m 40s)
* 15:36 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕦☕ homer 'cr*'  commit 'revert skipping RPKI validation for Jio AS55836 {{Gerrit|I0fd4683}} [[phab:T260452|T260452]]'
* 15:30 cdanis: ❌cdanis@cumin1001.eqiad.wmnet ~ 🕦☕ homer 'cr*-codfw*'  commit 'revert skipping RPKI validation for Jio AS55836 {{Gerrit|I0fd4683}} [[phab:T260452|T260452]]'
* 15:22 ppchelko@deploy1001: Started deploy [restbase/deploy@ddcecce]: [[phab:T257943|T257943]] [[phab:T260556|T260556]] [[phab:T253478|T253478]] [[phab:T254490|T254490]] [[phab:T259054|T259054]]. take 2. feeds timed out
* 15:22 ppchelko@deploy1001: Finished deploy [restbase/deploy@ddcecce]: [[phab:T257943|T257943]] [[phab:T260556|T260556]] [[phab:T253478|T253478]] [[phab:T254490|T254490]] [[phab:T259054|T259054]] (duration: 02m 30s)
* 15:19 ppchelko@deploy1001: Started deploy [restbase/deploy@ddcecce]: [[phab:T257943|T257943]] [[phab:T260556|T260556]] [[phab:T253478|T253478]] [[phab:T254490|T254490]] [[phab:T259054|T259054]]
* 15:08 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 15:06 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 15:04 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 14:57 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingSchemas - schema revision version bump for erroring schemas - all wikis (take 2) - [[phab:T254606|T254606]] (duration: 00m 53s)
* 14:57 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 14:57 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-cluster
* 14:44 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingSchemas - schema revision version bump for erroring schemas - all wikis - [[phab:T254606|T254606]] (duration: 00m 55s)
* 14:35 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingSchemas - schema revision version bump for erroring schemas - group0 - [[phab:T254606|T254606]] (duration: 00m 56s)
* 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1088 after upgrading its mysql package', diff saved to https://phabricator.wikimedia.org/P12277 and previous config saved to /var/cache/conftool/dbconfig/20200817-141449-marostegui.json
* 14:09 marostegui: Sanitize thankyouwiki on db1124:3315, db2094:3315 - [[phab:T260551|T260551]]
* 14:03 marostegui: Sanitize lldwiki on db1124:3315 and db2094:3315 [[phab:T259436|T259436]]
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1088 after upgrading its mysql package', diff saved to https://phabricator.wikimedia.org/P12276 and previous config saved to /var/cache/conftool/dbconfig/20200817-140229-marostegui.json
* 13:58 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T259432|T259432]])
* 13:54 Urbanecm: Creating thankyouwiki and lldwiki is done
* 13:54 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 01m 52s)
* 13:54 Urbanecm: Create account Pcoombe (WMF) at thankyouwiki, email set to pcoombe@wikimedia.org ([[phab:T259002|T259002]])
* 13:52 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating thankyouwiki ([[phab:T259002|T259002]]) (duration: 00m 55s)
* 13:51 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating thankyouwiki ([[phab:T259002|T259002]]) (duration: 00m 55s)
* 13:49 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating thankyouwiki ([[phab:T259002|T259002]])
* 13:48 urbanecm@deploy1001: Synchronized dblists: Creating thankyouwiki ([[phab:T259002|T259002]]) (duration: 00m 55s)
* 13:47 marostegui: Deploy MCR change on db1104
* 13:47 urbanecm@deploy1001: Synchronized wmf-config/db-codfw.php: Creating thankyouwiki ([[phab:T259002|T259002]]) (duration: 00m 56s)
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104 for MCR change', diff saved to https://phabricator.wikimedia.org/P12275 and previous config saved to /var/cache/conftool/dbconfig/20200817-134701-marostegui.json
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P12274 and previous config saved to /var/cache/conftool/dbconfig/20200817-134619-marostegui.json
* 13:46 urbanecm@deploy1001: Synchronized wmf-config/db-eqiad.php: Creating thankyouwiki ([[phab:T259002|T259002]]) (duration: 00m 55s)
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1088 after upgrading its mysql package', diff saved to https://phabricator.wikimedia.org/P12273 and previous config saved to /var/cache/conftool/dbconfig/20200817-134604-marostegui.json
* 13:41 jayme: imported td-agent-bit_1.5.3-0 to buster-wikimedia - [[phab:T260536|T260536]]
* 13:40 jayme: imported !log imported to buster-wikimedia
* 13:39 marostegui: Upgrade db1088 (s6) to a newer mysql version (10.4.14)
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1088 for mysql upgrade', diff saved to https://phabricator.wikimedia.org/P12272 and previous config saved to /var/cache/conftool/dbconfig/20200817-133905-marostegui.json
* 13:34 jbond42: deploy json-c security update to buster
* 13:33 marostegui: Restart mysql on db2102 (testing new package)
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P12271 and previous config saved to /var/cache/conftool/dbconfig/20200817-133043-marostegui.json
* 13:29 urbanecm@deploy1001: Synchronized langlist: Creating lldwiki ([[phab:T259432|T259432]]) (duration: 00m 54s)
* 13:28 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating lldwiki ([[phab:T259432|T259432]]) (duration: 00m 55s)
* 13:27 urbanecm@deploy1001: sync-file aborted: Creating lldwiki ([[phab:T259432|T259432]])¨ (duration: 00m 00s)
* 13:26 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating lldwiki ([[phab:T259432|T259432]]) (duration: 00m 53s)
* 13:25 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating lldwiki ([[phab:T259432|T259432]])
* 13:23 urbanecm@deploy1001: Synchronized dblists: Creating lldwiki ([[phab:T259432|T259432]]) (duration: 00m 56s)
* 13:22 urbanecm@deploy1001: Synchronized wmf-config/db-codfw.php: Creating lldwiki ([[phab:T259432|T259432]]) (duration: 00m 56s)
* 13:20 urbanecm@deploy1001: Synchronized wmf-config/db-eqiad.php: Creating lldwiki ([[phab:T259432|T259432]]) (duration: 00m 55s)
* 13:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P12270 and previous config saved to /var/cache/conftool/dbconfig/20200817-131307-marostegui.json
* 13:10 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 13:10 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 13:09 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P12269 and previous config saved to /var/cache/conftool/dbconfig/20200817-130127-marostegui.json
* 12:58 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:53 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:53 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:53 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:53 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depoool db1089 for MCR change', diff saved to https://phabricator.wikimedia.org/P12268 and previous config saved to /var/cache/conftool/dbconfig/20200817-124458-marostegui.json
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P12267 and previous config saved to /var/cache/conftool/dbconfig/20200817-124409-marostegui.json
* 12:44 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:38 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:35 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 12:35 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 12:35 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:27 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:22 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P12266 and previous config saved to /var/cache/conftool/dbconfig/20200817-122234-marostegui.json
* 12:21 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:20 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:19 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:19 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P12265 and previous config saved to /var/cache/conftool/dbconfig/20200817-121600-marostegui.json
* 12:05 Lucas_WMDE: EU backport window done
* 12:02 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes.php bjnwiki --fix {{!}} tee [[phab:T259429|T259429]]-fix
* 12:02 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes.php bjnwiki {{!}} tee [[phab:T259429|T259429]]-dryrun
* 12:01 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:620643{{!}}Set Portal and Portal_talk namespaces in bjnwiki as an extra namespace. (T259429)]] (duration: 00m 55s)
* 11:57 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P12264 and previous config saved to /var/cache/conftool/dbconfig/20200817-115741-marostegui.json
* 11:54 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:620513{{!}}Add Wiktionary wordmark for eswiktionary (T254059)]], part 2 (duration: 00m 57s)
* 11:53 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/mobile/copyright/wiktionary-wordmark-es.svg\n' {{!}} mwscript purgeList.php # [[phab:T254059|T254059]]
* 11:53 lucaswerkmeister-wmde@deploy1001: Synchronized static/images/mobile/copyright/wiktionary-wordmark-es.svg: Config: [[gerrit:620513{{!}}Add Wiktionary wordmark for eswiktionary (T254059)]], part 1 (duration: 00m 56s)
* 11:46 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/project-logos/zh_classicalwiki%s.png\n' '' '-1.5x' '-2x' {{!}} mwscript purgeList.php # [[phab:T259006|T259006]]
* 11:45 lucaswerkmeister-wmde@deploy1001: Synchronized static/images/project-logos/: Config: [[gerrit:620510{{!}}Change the logo of lzh Wikipedia (T259006)]] (duration: 00m 55s)
* 11:40 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:620509{{!}}Add Turkish powered by MW and Wikimedia project icons for Turkish Wikiquote, Turkish Wiktionary, Turkish Wikisource and Turkish Wikibooks (T260493)]] (duration: 00m 55s)
* 11:35 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:620507{{!}}Add Turkish powered by MW and Wikimedia project icons (T260492)]] (duration: 00m 57s)
* 11:25 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:14 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 11:12 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 11:10 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 11:09 cparle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [SDC] configure mediasearch A/B test (duration: 01m 08s)
* 11:08 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 10:57 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:54 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:52 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:52 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:52 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:51 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 10:49 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:47 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:42 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:38 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:36 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:35 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:35 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:30 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 10:29 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 10:29 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 10:14 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:14 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 10:13 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:13 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 10:10 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:07 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:06 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:58 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:58 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:56 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:55 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 09:52 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:48 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:45 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 09:42 jynus: updating compiler facts for cloud puppet compiler project to include new host dbprov2003
* 09:39 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:38 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:38 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:36 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:36 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:29 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:28 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:27 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:23 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 09:22 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:21 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 09:18 _joe_: running a full apt-get upgrade on mw1379-1380
* 09:18 _joe_: re-upgrading imagemagick on mw1378
* 09:16 _joe_: upgrading packages on mw1377
* 09:14 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:06 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:06 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 09:05 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 08:25 jayme: forcing a puppet run on all mw-api servers in eqiad - [[phab:T260329|T260329]]
* 07:52 _joe_: repooling mw1382
* 07:37 _joe_: running the same test on mw1382 [[phab:T260329|T260329]]
* 07:34 _joe_: repooling mw1381
* 07:15 _joe_: running the same test on mw1381 [[phab:T260329|T260329]]
* 07:15 _joe_: repooled mw1281
* 06:26 _joe_: stop testing on mw1281, [[phab:T260329|T260329]]
* 05:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 05:28 marostegui: Stop mysql on db1099:3311, db1099:3318 for reimage
* 05:28 _joe_: depooling mw1281 for testing for [[phab:T260329|T260329]]
* 05:25 marostegui: Deploy schema change on db1139:3311
* 05:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311, db1099:3318 for reimage and MCR change', diff saved to https://phabricator.wikimedia.org/P12263 and previous config saved to /var/cache/conftool/dbconfig/20200817-052147-marostegui.json


== 2020-08-16 ==
== 2021-10-08 ==
* 11:12 gehel: repooling wdqs1004 - catched up on lag
* 23:16 legoktm: sudo cumin -b 10 C:mediawiki::packages 'apt-get purge lilypond-data -y'
* 23:10 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 21:38 mutante: mwmaint2002 - disable-puppet, stop bacula-fd, recovery in progress
* 21:34 mutante: disabling puppet on bacula - going through a restore https://wikitech.wikimedia.org/wiki/Bacula#Restore_from_a_non-existent_host_(missing_private_key)
* 21:30 legoktm: running puppet across C:mediawiki::packages to uninstall lilypond and ploticus: legoktm@cumin1001:~$ sudo cumin -b 4 C:mediawiki::packages 'run-puppet-agent'
* 20:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 20:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 20:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 19:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 19:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 19:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 19:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 18:15 cstone: civicrm revision changed from {{Gerrit|5cb7d487cb}} to {{Gerrit|598b59b0ee}}
* 16:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=enwiki --force # to measure performance on a large wiki
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:29 jelto: enable puppet on gitlab1001 again for [[phab:T283076|T283076]]
* 14:05 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:01 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:49 Amir1: wikiadmin@10.64.16.85(wikidatawiki)> delete from wb_changes_subscription where cs_subscriber_id in ('testcommonswiki', 'mowiki');
* 09:39 Emperor: installing stress on ms-be2045 given recent h/w issues [[phab:T290881|T290881]]
* 08:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:04 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force
* 07:43 Emperor: reboot ms-be2045 [[phab:T290881|T290881]]
* 07:41 gehel: manually resuming the data reloads on wdqs1009 and wdqs2008
* 06:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:42 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 06:28 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:28 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 05:35 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:56 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 04:32 ryankemper: [[phab:T292814|T292814]] Beginning rolling restart of `cloudelastic`: `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic restart" --nodes-per-run 1 --start-datetime 2021-10-08T03:53:49 --task-id [[phab:T292814|T292814]]` on `ryankemper@cumin1001` tmux `elastic`
* 04:31 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:29 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 04:23 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@8f57a56]: 0.3.89 (duration: 08m 22s)
* 04:20 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:20 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:18 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 04:17 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 04:15 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.89` on canary `wdqs1003`; proceeding to rest of fleet
* 04:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@8f57a56]: 0.3.89
* 04:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.89`. Pre-deploy tests passing on canary `wdqs1003`
* 03:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 03:58 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 02:04 Krinkle: krinkle@deploy1002$ echo 'https://en.wikipedia.org/static/images/project-logos/jvwiktionary.png' {{!}} mwscript purgeList.php , ref [[phab:T287425|T287425]], [[phab:T292810|T292810]]
* 00:07 tgr_: deploy window over
* 00:05 tgr@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments: Backport: [[gerrit:727498{{!}}Mentee overview: Make UncachedMenteeOverviewDataProvider::getBlocksForUsers faster (T290609)]] (duration: 00m 56s)


== 2020-08-15 ==
== 2021-10-07 ==
* 21:18 gehel: depooling wdqs1004 and restarting services, will wait to catch up on lag before repooling
* 23:43 thcipriani@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 3/3 (duration: 00m 55s)
* 23:41 thcipriani@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 2/3 (duration: 00m 55s)
* 23:40 thcipriani@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 1/3 (duration: 00m 56s)
* 23:30 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 2/2 (duration: 00m 56s)
* 23:28 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikiquote-wordmark-tr.svg: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 1/2 (duration: 00m 57s)
* 21:35 urbanecm: Password reset for SUL User:LA2-bot ([[phab:T292793|T292793]])
* 20:43 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3
* 20:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281167|T281167]]
* 20:35 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 20:35 cmooney@cumin1001: START - Cookbook sre.network.cf
* 20:23 krinkle@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Gadgets/: {{Gerrit|I7c858b8c4bc}} (duration: 00m 56s)
* 20:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Echo/: {{Gerrit|8a7ff05ba28f302adb581bf430a868bb815b4ffd}}: Revert "Use namespaced CentralAuthSessionProvider" (duration: 00m 57s)
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/CentralAuth/: {{Gerrit|c01c2e4983bad8582ddd62aeb35ac9be852d493b}}: Revert "Namespace session providers" (duration: 00m 57s)
* 19:44 urbanecm: Backporting https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/727489, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/727487 in an unsafe way -- exceptions at testwikis expected, wmf.3 is not deployed elsewhere, so this should be ok
* 19:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert all wikis to 1.38.0-wmf.2 ([[phab:T281167|T281167]])
* 19:33 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): variously blocked, rolling back to testwikis for safe deploy of backports
* 19:14 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.38.0-wmf.2
* 19:07 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 19:03 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to all wikis
* 18:50 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=test2wiki
* 18:46 sukhe: running authdns-update for [[phab:T292537|T292537]]
* 18:29 urbanecm: Morning B&C window done
* 18:28 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a946c046ae17a520f8d3463a16b1435ceb4856c}}: Deploy Growth mentor dashboard to pilot wikis ([[phab:T278920|T278920]]) (duration: 01m 04s)
* 18:23 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 03s)
* 18:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 04s)
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|31770f2b3660e7d7490c0a9ab66285c1f069732d}}: shwiki: Deploy Growth features to newcomers ([[phab:T278240|T278240]]) (duration: 01m 04s)
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|33526dfed148068585289f5ac501feda72068fd9}}: Stream config changes for android_daily_stats schema ([[phab:T286000|T286000]]) (duration: 01m 06s)
* 18:10 ejegg: updated payments-wiki from {{Gerrit|6d3560d083}} to {{Gerrit|030b11da1a}}
* 18:07 arnoldokoth: gitlab2001 re-image complete ([[phab:T283076|T283076]])
* 17:30 mutante: rebooting gitlab2001.wikimedia.org
* 16:56 arnoldokoth: down timing gitlab2001 for re-imaging ([[phab:T283076|T283076]])
* 16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:32 hnowlan: roll restarting maps cassandra instances for java updates
* 16:19 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 16:19 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 15:07 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001 (duration: 00m 08s)
* 15:07 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001
* 14:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:49 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001 (duration: 00m 10s)
* 14:49 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001
* 14:48 hashar: Upgrading Gerrit replica to 3.3.6 # [[phab:T290236|T290236]]
* 14:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:56 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:46 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:30 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:29 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:29 hashar: restarting CI Jenkins for git plugin update
* 13:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 hashar: Upgraded CI Jenkins on contint2001
* 13:14 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:13 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:10 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:09 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:06 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 13:05 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:05 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:40 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 12:16 moritzm: installing testvm2005
* 11:59 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 11:52 Lucas_WMDE: EU backport+config window (aka UTC morning) done
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725858{{!}}Enable Content and Section Translation to Kurdish WP (T290238)]] (duration: 01m 04s)
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/WikidataPageBanner/includes/WikidataPageBannerFunctions.php: Backport: [[gerrit:727188{{!}}Change PropertyId to NumericPropertyId (T289125, T292667)]] (duration: 01m 05s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 jbond: update puppet stdlib gerrit:726872
* 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
* 09:26 mvernon@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host ms-be2045.codfw.wmnet
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2005.codfw.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
* 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2005.codfw.wmnet
* 08:49 mvernon@cumin2002: START - Cookbook sre.experimental.reimage for host ms-be2045.codfw.wmnet
* 08:36 moritzm: imported jenkins 2.303.2 to thirdparty/ci component for buster-wikimedia
* 07:57 Emperor: re-enabling puppet on ms-be2045 after hw work [[phab:T290881|T290881]]
* 07:39 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 07:39 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 07:38 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 07:37 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 07:34 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 07:33 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 07:32 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:31 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 06:21 ryankemper: [Elastic] Restart of `relforge` complete
* 06:05 ryankemper: [Elastic] Cluster in green status, proceeding to next and final node => `ryankemper@relforge1003:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:53 ryankemper: [Elastic] `ryankemper@relforge1004:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:48 ryankemper: [Elastic] Performing rolling restarts of `relforge`. `relforge1003` is the master so I'll restart `relforge1004` first to minimize disruption
* 03:00 ejegg: updated payments-wiki from {{Gerrit|23d0ffac66}} to {{Gerrit|6d3560d083}}
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: enable Parsoid API everywhere (duration: 01m 04s)
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 mutante: [grafana2001:~] $ sudo systemctl start rsync-var-lib-grafana  because of "PROBLEM - Check systemd state on grafana2001 is CRITICAL: CRITICAL - degraded" because of some race condition where a file vanished during sync


== 2020-08-14 ==
== 2021-10-06 ==
* 19:41 effie: restart mwdebug1002
* 23:57 mutante: releases2002 - rm /srv/org/wikimedia/reprepro/conf/distributions - contains only jessie-mediawiki - see 725670 and EOL of MediaWiki 1.31
* 16:58 cdanis: done deploying 'allow nameservers of Jio AS55836 to skip RPKI validation I9fcff8' to all routers [[phab:T260449|T260449]]
* 23:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:44 cdanis: ❌cdanis@cumin1001.eqiad.wmnet ~ 🕧☕ homer 'cr2-esams*' commit 'allow nameservers of Jio AS55836 to skip RPKI validation I9fcff8'
* 23:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:39 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕧☕ homer 'cr1-codfw*' commit 'allow nameservers of Jio AS55836 to skip RPKI validation I9fcff8'
* 23:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:36 cdanis: ❌cdanis@cumin1001.eqiad.wmnet ~ 🕧☕ homer 'cr2-codfw*' commit 'allow nameservers of Jio AS55836 to skip RPKI validation I9fcff8'
* 23:21 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 02:41 eileen: tools revision changed from {{Gerrit|9a89f45974}} to {{Gerrit|b4ebd1e564}}
* 23:20 jforrester@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ckb.svg: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 23:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:16 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726603{{!}}Enable NewUserMessage for ptwikivoyage (T290820)]] (duration: 01m 05s)
* 22:30 mutante: re-enabling puppet on mw*, an-worker* after deploying gerrit:726954. no issue this time
* 22:23 mutante: temp. disabling puppet on an-worker*, mw*
* 20:50 mutante: global puppet failure - revert is merged, puppet run will recover on next run everywhere. partially forcing with cumin, partially letting it recover naturally
* 20:43 mutante: [cumin1001:~] $ sudo cumin -b 5 -p 95 'mw2*' 'run-puppet-agent -q --failed-only'
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:05 brennen@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 01m 03s)
* 19:04 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 19:01 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): still unblocked after triage meeting, rolling to group1
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:44 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Revert disabling static mapframes on eswiki (duration: 01m 14s)
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:31 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: eswiki: Disable static mapframes ([[phab:T291736|T291736]]) (duration: 01m 17s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: viwikibooks: Set $wgRestrictDisplayTitle to false ([[phab:T289837|T289837]]) (duration: 01m 21s)
* 17:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:47 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:43 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to group0
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:35 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726596{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 04s)
* 16:35 jynus: stopping db1127 for hw maintenance [[phab:T292366|T292366]]
* 16:31 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:31 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:28 brennen@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726597{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 10s)
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 15:45 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): proceeding to deploy backports for [[phab:T292589|T292589]]
* 15:37 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 15:35 volans: installer spicerack 1.0.4 on cumin2002
* 12:50 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:48 volans: uploaded spicerack_1.0.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2004.codfw.wmnet
* 12:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:18 effie: pool mw1455 mw1422
* 12:17 urbanecm: wikiadmin@10.64.0.164(viwiki)> delete from growthexperiments_mentee_data; # cleanup after disabling mentor dashboard backend
* 12:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2004.codfw.wmnet
* 12:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1aa67d4846f39f59127a835cb7a8ed2974506025}}: viwiki: Disable mentor dashboard backend ([[phab:T278920|T278920]]) (duration: 01m 06s)
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:56 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2003.codfw.wmnet
* 11:55 XioNoX: esams - Advertise 185.15.59.0/24 instead of 185.15.58.0/23 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:50 jelto: disable puppet on gitlab1001 to test puppetized code on GitLab replica - [[phab:T283076|T283076]]
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1003.wikimedia.org
* 10:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1003.wikimedia.org
* 10:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 urbanecm@deploy1002: Synchronized wmf-config/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 2/2) (duration: 01m 05s)
* 10:01 urbanecm@deploy1002: Synchronized dblists/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 1/2) (duration: 01m 04s)
* 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host copernicium.wikimedia.org
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host copernicium.wikimedia.org
* 09:19 jbond: update ipaddress6 fact - https://gerrit.wikimedia.org/r/c/operations/puppet/+/726625
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:725923{{!}}Don't fail job if subscribed wiki is unknown (T292446 T292440)]] (duration: 01m 15s)
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 08:21 XioNoX: add ROAs for 185.15.58.0/24 and 185.15.59.0/24 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 08:04 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews --fix # [[phab:T291344|T291344]]
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews # [[phab:T291344|T291344]]
* 07:55 urbanecm: mwdebug1001: scap pull ([[phab:T291344|T291344]] fix done)
* 07:51 urbanecm: Staging at mwdebug1001 for [[phab:T291344|T291344]]
* 05:53 kart_: Updated cxserver to use nodejs12 ([[phab:T290754|T290754]])
* 05:47 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:39 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:36 Amir1: start of mwscript extensions/Wikibase/repo/maintenance/pruneChanges.php --wiki wikidatawiki --number-of-days=2
* 05:31 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 04:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:29 ryankemper: [WDQS] `wdqs1012` is back up after restarting blazegraph (blazegraph was locked up)
* 04:27 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph` (attempting to bring downed `wdqs1012` back into health)
* 04:25 ryankemper: [WDQS] Repooling eqiad hosts following the brief outage from earlier: `wdqs1004`, `wdqs1006`, `wdqs1007`
* 03:19 eileen: civicrm revision changed from {{Gerrit|b6f5f71c18}} to {{Gerrit|82efd2e195}}, config revision is {{Gerrit|f4c57d4733}}
* 03:11 tstarling@deploy1002: Synchronized php-1.38.0-wmf.3/includes/CommentFormatter/RowCommentIterator.php: fix UBN [[phab:T292590|T292590]] (duration: 01m 04s)
* 01:39 legoktm: legoktm@mwmaint1002:~$ echo "https://en.wikiversity.org/static/images/mobile/copyright/wikiversity.svg" {{!}}mwscript purgeList.php
* 01:17 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 03s)
* 01:12 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 17s)
* 00:59 arlolra@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable legacy media dom on metawiki (duration: 01m 05s)
* 00:37 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:35 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 03s)
* 00:32 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:29 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 04s)
* 00:16 mutante: puppetmasters: rm /etc/logrotate.d/geoipupdate && systemctl start logrotate && puppet agent -tv
* 00:14 mutante: puppetmaster2002 - rm /etc/logrotate.d/geoipupdate (not managed by puppet anymore but not removed, caused duplicate logrotate config, made logrotate service fail), start logrotate
* 00:08 cstone: civicrm revision changed from {{Gerrit|34d3c3aae8}} to {{Gerrit|b6f5f71c18}}
* 00:01 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725132{{!}}Add WN as an alias to project namespace in Polish Wikinews (T291344)]] (duration: 01m 04s)


== 2020-08-13 ==
== 2021-10-05 ==
* 23:39 tzatziki: removing 3 files for legal compliance
* 23:54 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikiversity.svg: Config: [[gerrit:725413{{!}}Wikiversity Logo Update for 2017 Logo Version (T292109)]] (duration: 01m 03s)
* 22:03 mutante: switching xhgui from tungsten to xhgui1001 - ran puppet on webperf*001 - [[phab:T180761|T180761]] [[phab:T158837|T158837]]
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 04s)
* 21:54 andrew@deploy1001: Finished deploy [horizon/deploy@f3dcb29]: fix proxy in project-local domain --bug [[phab:T260388|T260388]] (duration: 03m 53s)
* 23:44 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-az.svg: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 23s)
* 21:50 andrew@deploy1001: Started deploy [horizon/deploy@f3dcb29]: fix proxy in project-local domain --bug [[phab:T260388|T260388]]
* 23:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725386{{!}}Add image_suggestion_interaction event stream]] (duration: 01m 12s)
* 21:11 mutante: rsyncing /var/lib/jenkins from releases1001 to releases1002 and then all other releases* servers. 57GB, overwriting existing data from manual config ([[phab:T247652|T247652]])
* 23:02 legoktm: deleting old stretch docker images from the registry for [[phab:T292485|T292485]]
* 20:53 kormat: dropping xhgui.xhgui on m2
* 22:24 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 19:35 thcipriani@deploy1001: Synchronized php-1.36.0-wmf.4/extensions/DiscussionTools: [[gerrit:620030{{!}}Revert new reply API (again)]] [[phab:T259855|T259855]] (duration: 00m 57s)
* 22:20 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]) rolling back to testwikis for the day; will revisit in US-morning
* 18:06 herron: restarted ES on logstash1010
* 20:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 18:05 dpifke@deploy1001: Synchronized wmf-config/ProductionServices.php: Enabling new XHGui backend ([[phab:T180761|T180761]]) (duration: 00m 56s)
* 20:44 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/includes/page: Backport: [[gerrit:726594{{!}}Pre-format comments for non-local files too]] ([[phab:T292570|T292570]]) (duration: 01m 04s)
* 17:16 hnowlan: deployed ATS and varnish rules to route api.wikimedia.org
* 20:18 mutante: puppetmaster1003 et al - converting maxmind geoip database fetching from cron to timers
* 16:26 hnowlan: created api.wikimedia.org
* 20:06 mutante: cumin 'puppetmaster*' "disable-puppet '[[phab:T288844|T288844]] - [[phab:T273673|T273673]] - gerrit:721595 - $<nowiki>{</nowiki>USER<nowiki>}</nowiki>'"
* 15:49 hnowlan: moving api-gateway service to state production. critical set to false
* 19:30 mutante: restoring /home/amire80 from and to mwmaint2002 via Bacula bconsole ([[phab:T292573|T292573]])
* 15:41 herron: restart ES on logstash1012
* 19:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 14:56 fdans@deploy1001: Finished deploy [analytics/refinery@ba1a439]: Regular analytics weekly train (duration: 11m 34s)
* 19:03 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 14:45 ema: repool mw1382 with kernel memory accounting disabled [[phab:T260281|T260281]]
* 18:26 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.23 (duration: 01m 57s)
* 14:45 fdans@deploy1001: Started deploy [analytics/refinery@ba1a439]: Regular analytics weekly train
* 18:23 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.21 (duration: 04m 20s)
* 14:41 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:21 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]): pruning old branches, starting with 1.37.0-wmf.21, proceeeding to 1.37.0-wmf.23 if time allows
* 14:40 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:11 ppchelko@deploy1002: Synchronized wmf-config: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] Php72ToUpper.php removal (duration: 01m 06s)
* 14:38 ema: reboot mw1382 with kernel memory accounting disabled [[phab:T260281|T260281]]
* 18:04 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] CS.php (duration: 01m 06s)
* 14:34 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 17:55 brennen@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 45m 59s)
* 14:34 _joe_: rebooting mw1381 with a newer kernel, mw1383 as control with the old kernel [[phab:T260329|T260329]]
* 17:12 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 14:33 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single
* 17:09 brennen@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 14:31 _joe_: installing kernel 4.19.0-0.bpo.9 on mw1381 [[phab:T260329|T260329]]
* 17:03 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 14:05 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 17:02 btullis@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 14:00 elukey: create schema[12]00[34] in ganeti - [[phab:T260347|T260347]]
* 17:02 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 13:59 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 16:56 brennen: successfully applied security patches for 1.38.0-wmf.3 train ([[phab:T281167|T281167]])
* 13:58 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 16:47 brennen: coordinated with deployment backup and starting train prep for 1.38.0-wmf.3 ([[phab:T281167|T281167]]), branched at {{Gerrit|65279490f82c785181b8b6961e40901a4aaafca4}}
* 13:53 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:57 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 13:51 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:57 jbond@cumin2002: START - Cookbook sre.puppet.renew-cert for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 13:46 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:38 jbond: reimage puppetboard2002
* 13:45 hnowlan: moving api-gateway service to monitoring_setup
* 15:15 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 13:44 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:15 jbond@cumin1001: START - Cookbook sre.puppet.renew-cert for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 13:44 hashar: Gracefully restarting Zuul
* 15:10 moritzm: imported routinator 0.10.1-1bullseye to thirdparty/routinator for bullseye-wikimedia [[phab:T292503|T292503]]
* 13:39 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 14:58 jbond: reimage puppetboard1002
* 13:10 _joe_: forcing a puppet run on the api appservers in eqiad  [[phab:T260329|T260329]]
* 14:40 effie: depool  mw1455 and mw1422
* 13:07 oblivian@deploy1001: Synchronized wmf-config/CommonSettings.php: revert enabling of lilypond (again) [[phab:T257091|T257091]] [[phab:T260329|T260329]] (duration: 00m 59s)
* 14:30 Pchelolo: run foreachwiki uppercaseTitlesForUnicodeTransition.php --charmap current_to_php7_overrides.php [[phab:T219279|T219279]]
* 11:24 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:51 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor - Drop REL1_31, start REL1_37 (duration: 00m 57s)
* 11:20 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 13:46 Pchelolo: run renameInvalidUsernames.php --wiki loginwiki --list /tmp/rename_users_for_uppercase_all.txt [[phab:T219279|T219279]]
* 11:09 hnowlan: restarting pybal on lvs2010 [[phab:T254908|T254908]]
* 13:39 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 11:09 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 13:39 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 11:06 hnowlan: restarting pybal on lvs2009 [[phab:T254908|T254908]]
* 13:23 ppchelko@deploy1002: Synchronized php-1.38.0-wmf.2/maintenance/uppercaseTitlesForUnicodeTransition.php: Backport uppercaseTitlesForUnicodeTransition.php maintenance script improvements [[phab:T219279|T219279]] (duration: 00m 58s)
* 11:05 hnowlan: restarting pybal on lvs1016 [[phab:T254908|T254908]]
* 12:53 ema: upload varnish 6.0.8-1wm1 to apt.wikimedia.org [[phab:T292290|T292290]]
* 11:05 jayme: depool mw1380 for downgrade of poppler-utils,libpoppler-glib8,libpoppler64,curl,libcurl3,libcurl3-gnutls,libpython3.5,python3.5,libpython3.5-stdlib,python3.5-minimal,libpython3.5-minimal,imagemagick-6-common,libmagickcore-6.q16-3,libmagickwand-6.q16-3,imagemagick-6.q16,imagemagick,e2fslibs,e2fsprogs,libcomerr2,libss2 and reboot - [[phab:T260329|T260329]]
* 12:43 elukey: import AMD ROCm 4.2 to buster-wikimedia's thirdparty/amd-rocm42 - [[phab:T287267|T287267]]
* 11:05 hnowlan: restarting pybal on lvs1015 [[phab:T254908|T254908]]
* 12:24 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 11:04 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single
* 11:58 hnowlan: reverted restbase2023 to use CN=hostname certificate due to loading errors
* 10:42 hnowlan: Moving api-gateway service to from service_setup to lvs_setup and running puppet on LVS servers
* 11:57 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 10:17 jayme: depool mw1379 for downgrade of poppler-utils,libpoppler-glib8,libpoppler64,curl,libcurl3,libcurl3-gnutls,libpython3.5,python3.5,libpython3.5-stdlib,python3.5-minimal,libpython3.5-minimal,imagemagick-6-common,libmagickcore-6.q16-3,libmagickwand-6.q16-3,imagemagick-6.q16,imagemagick,e2fslibs,e2fsprogs,libcomerr2,libss2 and reboot - [[phab:T260329|T260329]]
* 11:57 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 10:04 XioNoX: re-order OSPF interfaces on all routers (now partially netbox driven)
* 11:37 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 09:37 ayounsi@deploy1001: Finished deploy [homer/deploy@89636df]: Homer release v0.2.5 (duration: 03m 03s)
* 11:28 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 09:34 ayounsi@deploy1001: Started deploy [homer/deploy@89636df]: Homer release v0.2.5
* 11:17 hnowlan_: disabling puppet on cassandra nodes for rollout of 724061 - defaulting to cn=fqdn certificates
* 08:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 11:15 effie: upgrade scap to 4.0.2 - [[phab:T291095|T291095]]
* 08:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 11:12 urbanecm@deploy1002: Synchronized dblists/commonsuploads.dblist: {{Gerrit|04524992865b0ae5750eb6fb0a374aa74a65b383}}: Enable local uploads for tcywiki ([[phab:T166763|T166763]]) (duration: 00m 59s)
* 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1082', diff saved to https://phabricator.wikimedia.org/P12247 and previous config saved to /var/cache/conftool/dbconfig/20200813-085547-marostegui.json
* 10:11 vgutierrez: update acme-chief to version 0.32 on acmechief hosts - [[phab:T290249|T290249]]
* 08:45 _joe_: downgrading imagemagick on mw1378 [[phab:T260329|T260329]]
* 10:09 vgutierrez: update acme-chief to version 0.32 on acmechief-test hosts - [[phab:T290249|T290249]]
* 08:43 _joe_: downgrading imagemagick on mw1378 [[phab:T260281