Difference between revisions of "Server Admin Log"

imported>Stashbot
(ejegg: restarted fundraising jobs on main CiviCRM box)
imported>Stashbot
(mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .)
 
(436 intermediate revisions by 3 users not shown)
== 2020-06-24 ==
* 00:35 ejegg: restarted fundraising jobs on main CiviCRM box
* 00:33 ejegg: updated Fundraising CiviCRM from {{Gerrit|f01b036128}} to {{Gerrit|52a32f2d66}}


== 2021-10-27 ==
* 23:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:40 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Allow upload by URL for Wikisources ([[phab:T293205|T293205]]), and enable it on enwikisource for autoconfirmed ([[phab:T294447|T294447]]) (duration: 01m 03s)
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:28 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734361{{!}}Add mobile wordmark for Meetei (Manipuri) Wikipedia to config (T294189)]] (duration: 01m 02s)
* 23:27 catrope@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-mni.svg: Config: [[gerrit:734361{{!}}Add mobile wordmark for Meetei (Manipuri) Wikipedia to config (T294189)]] (duration: 01m 03s)
* 23:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:06 reedy@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/SecurePoll/includes/Crypt/GpgCrypt.php: [[phab:T294489|T294489]] (duration: 01m 15s)
* 21:42 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist wikipedia namespaceDupes.php --fix {{!}} tee namespacedupes-wikipedia-real.log # run namespaceDupes.php for all Wikipedias
* 21:38 urbanecm: run namespaceDupes.php for a bunch of Wikipedias
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:51 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T294489|T294489]] (duration: 01m 59s)
* 20:51 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2255.codfw.wmnet
* 20:47 mutante: mw2255 - scap pull, repooling - after DRAC firmware was upgraded - [[phab:T283582|T283582]]
* 20:47 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet
* 19:53 bblack: cp5xxx: switching unified cert to digicert-2021
* 19:49 bblack: cp5007: switching unified cert to digicert-2021
* 19:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:36 twentyafterfour@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]] (duration: 01m 47s)
* 19:34 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 19:28 bblack: cp5001: switching unified cert to digicert-2021
* 19:24 bblack: cp5xxx: disabling puppet ahead of digicert unified certificate update rollout
* 18:46 legoktm: installing python-swiftclient on mw1305 for debugging
* 18:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:23 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:22 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734697{{!}}Disable Education Program namespaces in eswiki (T294365)]] (duration: 01m 04s)
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:10 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734451{{!}}Temporarily change the votewiki lang to fa (T292685)]] (duration: 01m 04s)
* 17:40 otto@deploy1002: Finished deploy [analytics/refinery@0d79e18] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0d79e18] (duration: 06m 30s)
* 17:34 otto@deploy1002: Started deploy [analytics/refinery@0d79e18] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0d79e18]
* 17:29 otto@deploy1002: Finished deploy [analytics/refinery@0d79e18] (thin): Regular analytics weekly train THIN [analytics/refinery@0d79e18] (duration: 00m 07s)
* 17:29 otto@deploy1002: Started deploy [analytics/refinery@0d79e18] (thin): Regular analytics weekly train THIN [analytics/refinery@0d79e18]
* 16:42 otto@deploy1002: Finished deploy [analytics/refinery@0d79e18]: Regular analytics weekly train [analytics/refinery@0d79e18] (duration: 20m 30s)
* 16:29 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 16:29 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 16:21 otto@deploy1002: Started deploy [analytics/refinery@0d79e18]: Regular analytics weekly train [analytics/refinery@0d79e18]
* 15:36 ejegg: updated payments-wiki from {{Gerrit|6e810fb401}} to {{Gerrit|5b9fdd0fe1}}
* 15:28 volans: deployed new prefixes for drmrs in modules/network/data/data.yaml - [[phab:T282787|T282787]]
* 15:12 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:08 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 15:07 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:03 volans@cumin2002: START - Cookbook sre.dns.netbox
* 14:56 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:51 volans@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 marostegui: Replace m5-master so it points to dbproxy1017 - [[phab:T288093|T288093]]
* 13:58 elukey: removed /var/run/confd-template/.inference*.err files from puppetmaster2001 (backup saved in /home/elukey just in case)
* 10:53 jbond: enable puppet fleet wide post gerrit:734937
* 10:43 jbond: disable puppet fleet wide to deploy a puppetmaster change gerrit:734937
* 10:43 jbond: disable puppet fleet wide to deploy a puppetmaster change
* 10:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:12 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/GrowthExperiments/: {{Gerrit|305e97a}}, {{Gerrit|b9eaa20}}: GrowthExperiments backports ([[phab:T293434|T293434]], [[phab:T294386|T294386]]) (duration: 01m 04s)
* 10:10 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GrowthExperiments/: {{Gerrit|305e97a}}, {{Gerrit|667a4be}}: GrowthExperiments backports ([[phab:T293434|T293434]], [[phab:T294386|T294386]]) (duration: 01m 04s)
* 10:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:02 urbanecm: [urbanecm@mwdebug1001 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=cswiki --dbshard=s2 --verbose # testing 734752
* 10:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:01 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=cswiki --dbshard=s2 --verbose # testing 734752
* 09:25 godog: another run of backfill on graphite1004 - [[phab:T294355|T294355]]
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist replicas from s6 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17615 and previous config saved to /var/cache/conftool/dbconfig/20211027-092043-marostegui.json
* 09:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:04 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:50 topranks: Enabling Telxius circuit from cr1-eqiad to asw1-b12-drmrs with homer.
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'Contributions replicas from s6.codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17614 and previous config saved to /var/cache/conftool/dbconfig/20211027-074935-marostegui.json
* 07:26 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist replicas from s6.codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17613 and previous config saved to /var/cache/conftool/dbconfig/20211027-072546-marostegui.json
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges and recentchangeslinked replicas from s6.codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17612 and previous config saved to /var/cache/conftool/dbconfig/20211027-060634-marostegui.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager replicas from s6.codfw [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17611 and previous config saved to /var/cache/conftool/dbconfig/20211027-053104-marostegui.json
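
The entries above follow a regular shape: a bullet, an `HH:MM` timestamp, an actor (optionally `user@host`), a colon, and a free-text message. A minimal sketch of splitting one entry into those fields might look like the following. The `Entry` type, the `parse_entry` helper, and the regular expression are illustrative assumptions, not part of any Wikimedia tooling, and they may not cover every entry format that appears in the log.

```python
import re
from typing import NamedTuple, Optional

# Assumed entry shape: "* HH:MM actor: message", where actor is "user" or "user@host".
ENTRY_RE = re.compile(r"^\* (?P<time>\d{2}:\d{2}) (?P<actor>[^:\s]+): (?P<message>.*)$")


class Entry(NamedTuple):
    """Hypothetical container for one parsed log entry."""
    time: str
    user: str
    host: Optional[str]
    message: str


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None for headings and other non-entry lines."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    # Split "user@host" into its parts; host is None when there is no "@".
    user, _, host = match.group("actor").partition("@")
    return Entry(match.group("time"), user, host or None, match.group("message"))
```

Only the first colon after the actor is treated as the separator, so messages that themselves contain colons (such as the `conftool action : ...` entries) are kept whole.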


== 2020-06-23 ==
== 2021-10-26 ==
* 23:16 wkandek: releases1002 is back after being moved to row D ([[phab:T255590|T255590]])
* 22:59 legoktm: uploaded python-logstash to buster-wikimedia for [[phab:T294393|T294393]]
* 23:11 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 21:29 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 22:35 ejegg: disabled fundraising jobs on civi1001 for testing on civi2001
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:24 wkandek@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 21:13 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 22:13 AndyRussG: updated payments-wiki from {{Gerrit|5fd4eb1519}} to {{Gerrit|28ad76dcd7}}
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:06 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:04 reedy@deploy1002: Synchronized php-1.38.0-wmf.5/tests/phpunit/includes/api/query/ApiQueryImageInfoTest.php: [[phab:T293783|T293783]] (duration: 01m 02s)
* 21:23 wkandek@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:23 dzahn@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97)
* 21:03 reedy@deploy1002: Synchronized php-1.38.0-wmf.6/tests/phpunit/includes/api/query/ApiQueryImageInfoTest.php: [[phab:T293783|T293783]] (duration: 01m 02s)
* 21:23 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:01 reedy@deploy1002: Synchronized php-1.38.0-wmf.6/includes/api/ApiQueryImageInfo.php: [[phab:T293783|T293783]] (duration: 01m 03s)
* 21:22 wkandek@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 21:00 reedy@deploy1002: Synchronized php-1.38.0-wmf.5/includes/api/ApiQueryImageInfo.php: [[phab:T293783|T293783]] (duration: 01m 03s)
* 21:22 wkandek@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:22 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 20:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:22 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:15 wkandek@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)
* 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:14 wkandek@cumin1001: START - Cookbook sre.hosts.decommission
* 19:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:31 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on all wikis - take 2 - [[phab:T238230|T238230]] (duration: 01m 06s)
* 19:51 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 19:16 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on all wikis - [[phab:T238230|T238230]] (duration: 01m 05s)
* 19:48 eileen: civicrm revision changed from {{Gerrit|733a8fceda}} to {{Gerrit|dba74c443b}}, config revision is {{Gerrit|eed79486d5}}
* 19:06 brennen@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.38
* 19:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:55 mutante: gerrit1001 (prod) - restarting gerrit service to verify config changes
* 19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:53 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on group0 - [[phab:T238230|T238230]] (duration: 01m 06s)
* 19:38 twentyafterfour@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]] (duration: 25m 28s)
* 18:24 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254925|T254925]] [[phab:T246489|T246489]] (duration: 01m 06s)
* 19:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:04 brennen@deploy1001: Finished scap: testwikis wikis to 1.35.0-wmf.38 (duration: 85m 53s)
* 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:39 brennen@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.38
* 19:16 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 16:01 brennen: 1.35.0-wmf.38 was branched at {{Gerrit|a35f7318}} for https://phabricator.wikimedia.org/T254175
* 19:13 twentyafterfour@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 15:47 moritzm: prune nginx packages on mwdebug hosts [[phab:T255565|T255565]]
* 18:47 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:37 moritzm: prune nginx packages on mw1380-mw1412 [[phab:T255565|T255565]]
* 17:52 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS (duration: 01m 34s)
* 15:28 moritzm: installing libvpx security updates
* 17:50 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 15:27 mutante: removing ganeti VM xhgui1001 from eqiad row_A, will recreate in another row for rebalancing VMs between rows ([[phab:T180761|T180761]] [[phab:T238098|T238098]])
* 17:09 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 37s)
* 15:26 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 17:06 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 15:18 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 17:05 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS (duration: 1100m 51s)
* 15:12 mutante: removing ganeti VM releases1002 in eqiad row_A - will recreate in another row to re-balance ([[phab:T255590|T255590]])
* 16:25 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 15:12 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 16:25 cdanis@cumin1001: START - Cookbook sre.network.cf
* 15:10 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 16:24 mutante: [mwmaint1002:~] $ sudo systemctl start mediawiki_job_wikidata_resubmit_changes_for_dispatch
* 14:56 moritzm: failover ganeti master in eqiad to ganeti1011
* 16:23 mutante: mwmaint1002 - running puppet, created new mw periodic job from gerrit:732972 ([[phab:T294031|T294031]])
* 14:55 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 16:07 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:50 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 16:04 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 14:48 urbanecm@deploy1001: Synchronized private/PrivateSettings.php: [[phab:T250887|T250887]] (duration: 00m 58s)
* 15:45 lucaswerkmeister-wmde@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:08 mholloway-shell@deploy1001: Finished deploy [recommendation-api/deploy@db7fd80]: Update recommendation-api to {{Gerrit|7e00177}} (duration: 03m 13s)
* 15:41 lucaswerkmeister-wmde@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:05 mholloway-shell@deploy1001: Started deploy [recommendation-api/deploy@db7fd80]: Update recommendation-api to {{Gerrit|7e00177}}
* 15:38 lucaswerkmeister-wmde@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 13:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:27 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 13:54 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:27 cdanis@cumin1001: START - Cookbook sre.network.cf
* 13:54 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:07 topranks: Running homer against cr3-esams to create new temp GRE tunnel to asw1-b12-drmrs
* 13:54 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:02 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 13:34 moritzm: draining ganeti1012 for eventual reboot
* 15:02 cdanis@cumin1001: START - Cookbook sre.network.cf
* 13:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:55 topranks: Adding static route on cr3-esams to asw1-b12-drmrs Telia link IP to allow GRE to be built.
* 13:28 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:50 elukey: ran "Capirca Host Definition" script on netbox - output https://netbox.wikimedia.org/extras/scripts/results/1787315/
* 12:56 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase: {{Gerrit|7723cf724df9ede49129443e43336e93efcd7a41}}: RecentChangeFactory: Add missing rc_logid value ([[phab:T293885|T293885]]) (duration: 01m 02s)
* 12:54 jynus@cumin1001: START - Cookbook sre.hosts.downtime
* 13:40 elukey: ran "Capirca Host Definition" script on netbox-next to get up-to-date aqs_group host definition - result https://netbox-next.wikimedia.org/extras/scripts/results/894348/
* 12:45 moritzm: draining ganeti1011 for eventual reboot
* 13:24 kart_: Updated cxserver to 2021-10-25-123807-production ([[phab:T217747|T217747]], [[phab:T218217|T218217]], [[phab:T292421|T292421]])
* 12:45 marostegui: Deploy schema change on s6 codfw master (lag will appear on codfw) - [[phab:T253276|T253276]]
* 13:19 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 12:36 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:13 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 12:31 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:05 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 12:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:05 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.4 (duration: 31m 07s)
* 11:56 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:35 awight: EU BACON cooked
* 12:35 hashar: scap clean --delete 1.38.0-wmf.4 # [[phab:T293947|T293947]]
* 11:34 awight@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/TwoColConflict/: BACON: [[gerrit:607248{{!}}Fix broken copy link in JS mode (T253724)]] (duration: 00m 57s)
* 12:32 hashar: Applied security patches to 1.38.0-wmf.6 # [[phab:T293947|T293947]]
* 11:07 mlitn@deploy1001: Synchronized wmf-config/InitialiseSettings.php: test commons: Use the database name in the Wikibase entity source config (duration: 00m 59s)
* 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:04 moritzm: draining ganeti1008 for eventual reboot
* 12:31 hashar: scap prep 1.38.0-wmf.6 # [[phab:T293947|T293947]]
* 10:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:16 jbond: upload cas_6.4.2-1+wmf10u3_amd64
* 10:55 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:42 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:42 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:40 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:38 moritzm: temporarily shutdown xhgui1001/releases1002 to reshuffle Ganeti instances for reboots
* 11:51 urbanecm@deploy1002: Finished scap: {{Gerrit|c131f32e5e0804c8f5c2ec768b334c81a1b35151}}: Add namespace translations for [ami] Amis and [pwn] Paiwan ([[phab:T292414|T292414]], [[phab:T292415|T292415]]) (duration: 02m 25s)
* 10:38 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 11:49 urbanecm@deploy1002: Started scap: {{Gerrit|c131f32e5e0804c8f5c2ec768b334c81a1b35151}}: Add namespace translations for [ami] Amis and [pwn] Paiwan ([[phab:T292414|T292414]], [[phab:T292415|T292415]])
* 10:35 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:35 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:13 urbanecm@deploy1002: Synchronized logos/config.yaml: {{Gerrit|575a6a66b279c3d2d8974ffcc4911cc5b927be47}}: Fix HD logo size in some wikis ([[phab:T250731|T250731]]; 2/2) (duration: 00m 55s)
* 10:22 kormat: reimaging db1088 to buster [[phab:T250666|T250666]]
* 11:13 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|575a6a66b279c3d2d8974ffcc4911cc5b927be47}}: Fix HD logo size in some wikis ([[phab:T250731|T250731]]; 1/2) (duration: 00m 57s)
* 10:03 jynus@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:01 jynus@cumin2001: START - Cookbook sre.hosts.downtime
* 10:46 jbond: upload cas_6.4.2-1+wmf10u2_amd64.deb
* 09:48 jbond42: add new CI check for cloud yaml data https://gerrit.wikimedia.org/r/c/operations/puppet/+/606444/
* 10:40 mvernon@cumin2002: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=swift
* 09:46 jynus: stopping and reimaging db2101 into buster [[phab:T254871|T254871]]
* 10:39 mvernon@cumin2002: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=swift-ro
* 09:32 marostegui: Reload haproxy on dbproxy1012 and dbproxy1014 to test db1097 as secondary for 24h [[phab:T254556|T254556]]
* 10:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:46 ema: mwmaint1002: add uid=abban,ou=people,dc=wikimedia,dc=org to group 'nda' [[phab:T255775|T255775]]
* 10:07 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Switching back graphite to eqiad (duration: 00m 55s)
* 08:38 XioNoX: re-enable peering BGP sessions on AMS-IX - [[phab:T253970|T253970]]
* 10:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:03 moritzm: draining ganeti1007 for eventual reboot
* 10:06 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Switching back graphite to eqiad (duration: 01m 04s)
* 07:58 XioNoX: restart scs-a8-eqiad - [[phab:T256101|T256101]]
* 09:49 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 07:51 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:49 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 07:49 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 09:47 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 07:42 marostegui: Deploy schema change on db1088
* 09:40 godog: flip back write traffic to graphite1004 (all but mediawiki) - [[phab:T247963|T247963]]
* 07:30 marostegui: Reimage db2133 (m2 codfw master) to Buster (this will trigger haproxy IRC alert) [[phab:T250666|T250666]]
* 09:27 godog: move read traffic back to graphite1004 - [[phab:T247963|T247963]]
* 07:01 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1118', diff saved to https://phabricator.wikimedia.org/P11637 and previous config saved to /var/cache/conftool/dbconfig/20200623-070120-marostegui.json
* 08:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:06 XioNoX: disable peering BGP sessions on AMS-IX - [[phab:T253970|T253970]]
* 08:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:24 marostegui: Compress InnoDB on db1080 [[phab:T254462|T254462]]
* 08:33 ema: upload varnish_6.0.8-1wm2 to component/varnish6 on apt.wm.org [[phab:T293879|T293879]]
* 05:23 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1080 for InnoDB compression', diff saved to https://phabricator.wikimedia.org/P11636 and previous config saved to /var/cache/conftool/dbconfig/20200623-052350-marostegui.json
* 08:31 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GrowthExperiments/maintenance: {{Gerrit|91316ed5714c4426a29fefded5c4db08dbba48bb}}: Add purgeExpiredMentorStatus.php ([[phab:T280307|T280307]]) (duration: 00m 56s)
* 05:22 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P11635 and previous config saved to /var/cache/conftool/dbconfig/20200623-052254-marostegui.json
* 08:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:12 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P11634 and previous config saved to /var/cache/conftool/dbconfig/20200623-051159-marostegui.json
* 08:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:03 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P11633 and previous config saved to /var/cache/conftool/dbconfig/20200623-050314-marostegui.json
* 07:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 07:07 effie: pool mw1319 and mw1312
* 07:05 effie: pool  wtp1026.eqiad.wmnet
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17606 and previous config saved to /var/cache/conftool/dbconfig/20211026-063647-root.json
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17605 and previous config saved to /var/cache/conftool/dbconfig/20211026-062144-root.json
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17604 and previous config saved to /var/cache/conftool/dbconfig/20211026-060640-root.json
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17603 and previous config saved to /var/cache/conftool/dbconfig/20211026-055136-root.json
* 05:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17602 and previous config saved to /var/cache/conftool/dbconfig/20211026-053633-root.json
* 05:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17601 and previous config saved to /var/cache/conftool/dbconfig/20211026-052129-root.json
* 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:24 krinkle@deploy1002: Synchronized wmf-config/logging.php: {{Gerrit|I0211e1c77}} (duration: 00m 55s)
* 01:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2020-06-22 ==
== 2021-10-25 ==
* 23:41 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: touch for [[phab:T247330|T247330]] (duration: 00m 56s)
* 23:12 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Create alias for Appendix and Appendix_talk namespaces on mywiktionary ([[phab:T291146|T291146]]) (duration: 00m 55s)
* 23:36 catrope@deploy1001: Synchronized dblists/: Close trwikinews ([[phab:T247330|T247330]]) (duration: 00m 58s)
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:28 RoanKattouw: Synchronized wmf-config/InitialiseSettings.php: Create rollbacker group on elwiktionary ([[phab:T255569|T255569]]) (typoed the task number before)
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:26 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Create rollbacker group on elwiktionary ([[phab:T225569|T225569]]) (duration: 00m 56s)
* 22:57 ryankemper: [wcqs] Downtimed `wcqs*` until roughly a week from now (while we set up oauth)
* 23:21 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add localized sitename for bewikibooks ([[phab:T253962|T253962]]) (duration: 00m 57s)
* 22:53 legoktm: uploaded PHP 7.4.25 to apt.wm.o (DSA-4992-1)
* 23:16 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add domains to wgCopyUploadsDomains ([[phab:T255336|T255336]], [[phab:T255363|T255363]], [[phab:T255386|T255386]], [[phab:T255313|T255313]]) (duration: 01m 01s)
* 22:44 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 22:39 bstorm_: downtimed labstore1005 to prevent an alert during puppet merge [[phab:T253353|T253353]]
* 22:30 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 03m 04s)
* 22:38 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:27 ryankemper@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 22:35 volans@cumin1001: START - Cookbook sre.dns.netbox
* 21:53 mutante: new project language "pwn" added - Paiwan is a native language of Taiwan, spoken by the Paiwan, a Taiwanese indigenous people. [[phab:T292415|T292415]]
* 22:16 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@f2002c8]: bump glent jar to 0.2.2 (duration: 00m 56s)
* 21:52 mutante: new project language "ami" added - Sowal no 'Amis is the Formosan language of the 'Amis (or Ami), an indigenous people living along the east coast of Taiwan. - [[phab:T292414|T292414]]
* 22:15 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@f2002c8]: bump glent jar to 0.2.2
* 21:50 mutante: log authdns1001 (DNS) - sudo authdns-update, add new project language "ami" (Amis) for [[phab:T292414|T292414]] - edited langlist.tmpl which regenerates all project zones
* 22:12 volans: cleanup interfaces and addresses in Netbox for offline servers - [[phab:T233183|T233183]]
* 21:40 mutante: authdns1001 (DNS) - sudo authdns-update, add new project language "pwn" (Paiwan) for [[phab:T292415|T292415]]
* 21:59 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@6e7f9f7]: bump glent jar to 0.2.2 (duration: 00m 18s)
* 19:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 21:58 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@6e7f9f7]: bump glent jar to 0.2.2
* 19:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 17:19 mutante: gerrit1002 - let puppet remove [database] section from config; restart gerrit another time
* 19:47 mutante: mw2255 - depooled=inactive (incl "dsh groups"), shut down physically for [[phab:T283582|T283582]] - can be worked on anytime
* 17:14 mutante: gerrit1002 (gerrit-test): re-enabled puppet, restarted gerrit service
* 19:45 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2255.codfw.wmnet
* 16:58 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:45 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet
* 16:49 volans@cumin1001: START - Cookbook sre.dns.netbox
* 19:42 mutante: icinga - ACKing all unhandled CRIT alerts on hosts with "dev" or "test" in their name, regardless of notifications being disabled or not. just so that we get more signal than noise in actual unhandled CRITs in web UI
* 15:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 19:40 mutante: cumin2002 - sudo systemctl reset-failed to clear Icinga alert about failed but (now) non-existing service database-backups-snapshots.service, assuming it's a case of "only in active DC"
* 15:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 19:12 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 14:48 moritzm: installing mutt security updates
* 19:12 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 14:47 Amir1: creating shnwiktionary is done
* 19:07 kormat@cumin1001: dbctl commit (dc=all): 'Temporarily move mw groups to db1123 [[phab:T294295|T294295]]', diff saved to https://phabricator.wikimedia.org/P17597 and previous config saved to /var/cache/conftool/dbconfig/20211025-190717-kormat.json
* 14:44 ladsgroup@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 58s)
* 19:06 mutante: db1112 - powercycling
* 14:42 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 19:04 legoktm@cumin1001: dbctl commit (dc=all): 'Depool db1112 ([[phab:T294295|T294295]])', diff saved to https://phabricator.wikimedia.org/P17596 and previous config saved to /var/cache/conftool/dbconfig/20211025-190436-legoktm.json
* 14:41 ladsgroup@deploy1001: Synchronized static/images/project-logos/: Creating shnwiktionary ([[phab:T253029|T253029]]) (duration: 00m 56s)
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:40 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating shnwiktionary ([[phab:T253029|T253029]]) (duration: 00m 56s)
* 18:40 jforrester@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/timeline/includes/Timeline.php: Backport: [[gerrit:734312{{!}}Input may be null when rendering a self-closing tag `<timeline />` (T294020)]] (duration: 00m 55s)
* 14:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:37 ladsgroup@deploy1001: rebuilt and synchronized wikiversions files: Creating shnwiktionary ([[phab:T253029|T253029]])
* 18:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:36 ladsgroup@deploy1001: Synchronized dblists: Creating shnwiktionary ([[phab:T253029|T253029]]) (duration: 00m 58s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:24 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 55s)
* 14:10 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 18:22 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 54s)
* 14:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:19 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732840{{!}}Fix array declaration of NS_USER_TALK abbreviation on ruwikiquote (T197058)]] (duration: 00m 55s)
* 13:59 moritzm: re-enabling Puppet in codfw
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:55 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 18:15 jforrester@deploy1002: Synchronized wmf-config/flaggedrevs.php: Config: [[gerrit:732836{{!}}flaggedrevs: Drop legacy wgFlaggedRevsStatsAge config, no longer read]] (duration: 00m 55s)
* 13:51 moritzm: disable Puppet in codfw to reduce puppetdb2002 memory activity, unblocking the migration of the Ganeti instance for a reboot
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:19 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump eventlogging_Test schema version to 1.1.0 to pick up client_dt and set wgEventLoggingServiceUri for all wikis - [[phab:T238230|T238230]] (duration: 00m 58s)
* 18:12 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732254{{!}}Make reply tool available as opt-out on frwiki (T293687)]] (duration: 00m 56s)
* 13:11 marostegui: Stop MySQL on db2078 instances
* 17:41 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2253.codfw.wmnet
* 12:53 vgutierrez: upgrade to trafficserver 8.0.8~rc0-1wm1 on cp5006 and cp5012
* 17:40 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 12:45 moritzm: draining ganeti2007 for eventual reboot
* 17:39 mutante: mw2253 - scap pull after hw maintenance is over
* 12:42 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 17:32 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 12:38 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 17:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 12:31 akosiaris: failover logstash2023 from ganeti2007->ganeti2023 for migration_downtime change to apply
* 17:24 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:26 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster (duration: 01m 25s)
* 17:23 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 12:24 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster
* 17:22 XioNoX: update core routers ACLs
* 12:22 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster (duration: 00m 03s)
* 17:20 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 12:22 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster
* 16:49 XioNoX: update management routers ACLs
* 11:53 Urbanecm: EU B&C window done
* 16:36 XioNoX: DNS - Add eqsin-ulsfo transport v6 prefix - [[phab:T273308|T273308]]
* 11:50 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/VisualEditor/modules/: Backport: {{Gerrit|0a08066}}: Revert "Allow generic params to be passed to getWikitextFragment" ([[phab:T255785|T255785]]) (duration: 00m 58s)
* 16:31 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1094', diff saved to https://phabricator.wikimedia.org/P11627 and previous config saved to /var/cache/conftool/dbconfig/20200622-114554-marostegui.json
* 16:28 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 moritzm: draining ganeti2008 for eventual reboot
* 16:25 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 11:37 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster (duration: 00m 28s)
* 16:25 mmandere@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 11:37 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster
* 16:21 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 11:34 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1094', diff saved to https://phabricator.wikimedia.org/P11625 and previous config saved to /var/cache/conftool/dbconfig/20200622-113401-marostegui.json
* 16:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:30 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|74e8295}}: IS: Cleanup some redundant rows (duration: 00m 56s)
* 16:10 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2253.codfw.wmnet
* 11:29 Urbanecm: Run namespaceDupes.php for zh* projects ([[phab:T165593|T165593]])
* 16:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1094', diff saved to https://phabricator.wikimedia.org/P11623 and previous config saved to /var/cache/conftool/dbconfig/20200622-112451-marostegui.json
* 16:08 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:734298{{!}}Empty wikibase disabled access entity types on Beta (T294159)]] (beta-only) (duration: 01m 47s)
* 11:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|db952ba}}: Add zh-hans and zh-hant translation of Module and Module_talk aliases for all Zh Projects ([[phab:T165593|T165593]]) (duration: 00m 56s)
* 16:04 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1301fd4}}: Add import sources for gomwiktionary ([[phab:T255098|T255098]]) (duration: 00m 57s)
* 16:01 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 11:08 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1094', diff saved to https://phabricator.wikimedia.org/P11622 and previous config saved to /var/cache/conftool/dbconfig/20200622-110806-marostegui.json
* 15:57 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 52s)
* 11:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|defa81e}}: Disable NS_USER(_TALK) search engine indexing on trwiki ([[phab:T255538|T255538]]) (duration: 00m 58s)
* 15:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:35 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:606985{{!}} Bumping portals to master (606985)]] (duration: 00m 56s)
* 15:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:34 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:606985{{!}} Bumping portals to master (606985)]] (duration: 01m 12s)
* 15:49 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 54s)
* 09:58 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:46 jbond: upgrade cas/idp to 6.4.2
* 09:56 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 14:56 mutante: mw2253 - shut down and downtimed for 2 days
* 09:33 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1094 for reimage', diff saved to https://phabricator.wikimedia.org/P11621 and previous config saved to /var/cache/conftool/dbconfig/20200622-093323-marostegui.json
* 14:50 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 09:31 godog: roll-restart logstash in codfw/eqiad to apply configuration change
* 14:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 08:59 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:49 mutante: depooling mw2253 for DRAC upgrade ([[phab:T283582|T283582]])
* 08:56 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:48 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 08:33 moritzm: reimaging cumin1001 to buster [[phab:T245114|T245114]]
* 14:45 jbond: update cas package
* 08:13 godog: extend prometheus codfw ops filesystem to 1TB
* 14:31 marostegui: Deploy schema change on s3 codfw - [[phab:T291719|T291719]]
* 08:02 vgutierrez: upgrade to trafficserver 8.0.8~rc0-1wm1 on cp4026 and cp4032
* 12:04 ema: cp3062: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 08:02 vgutierrez: upload trafficserver 8.0.8~rc0-1wm1 to apt.wm.o (buster)
* 11:57 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 07:33 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:30 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 11:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:16 marostegui: Reimage db1117 (irc haproxy alerts will be triggered)
* 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:26 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:24 Lucas_WMDE: UTC morning backport+config window done
* 06:24 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732969{{!}}Remove dispatchLagToMaxLagFactor Wikibase setting (T292604)]] (duration: 00m 54s)
* 06:06 marostegui: Stop MySQL on dbstore1005 for reimage to Buster - [[phab:T254870|T254870]]
* 11:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:58 marostegui: Compress InnoDb on db1118 [[phab:T254462|T254462]]
* 11:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732951{{!}}Remove wikibaseDispatchRedisLockManager config (T292604)]] (duration: 00m 54s)
* 05:51 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:14 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732950{{!}}Remove wmg variables for dispatchChanges.php Wikibase settings (T292604)]] (duration: 00m 55s)
* 05:49 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:43 marostegui: Stop haproxy on dbproxy1008 - [[phab:T255406|T255406]]
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732949{{!}}Remove dispatchChanges.php-related Wikibase settings (T292604)]] (duration: 00m 55s)
* 05:33 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1118 for reimage and InnoDB compression', diff saved to https://phabricator.wikimedia.org/P11617 and previous config saved to /var/cache/conftool/dbconfig/20200622-053334-marostegui.json
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1134', diff saved to https://phabricator.wikimedia.org/P11616 and previous config saved to /var/cache/conftool/dbconfig/20200622-053104-marostegui.json
* 11:05 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732372{{!}}Remove dispatchViaJobs-related Wikibase settings (T291828)]] (duration: 00m 56s)
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11615 and previous config saved to /var/cache/conftool/dbconfig/20200622-051730-marostegui.json
* 09:52 godog: bounce uwsgi graphite web on graphite2003 - [[phab:T294220|T294220]]
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11614 and previous config saved to /var/cache/conftool/dbconfig/20200622-051720-marostegui.json
* 09:52 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11613 and previous config saved to /var/cache/conftool/dbconfig/20200622-050259-marostegui.json
* 09:48 volans@cumin1001: START - Cookbook sre.dns.netbox
* 04:50 marostegui: Deploy schema change on s3 primary master with a big sleep between wikis - [[phab:T250066|T250066]]
* 09:43 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:733089{{!}}[BETA CLUSTER] Enable WikibaseLexeme Scribunto access (T294159)]] (merged on Friday, syncing now to avoid outdated files even if it’s just -labs.php) (duration: 00m 55s)
* 04:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11612 and previous config saved to /var/cache/conftool/dbconfig/20200622-044853-marostegui.json
* 09:18 godog: bounce graphite-web on graphite2003 to test timeout bump - [[phab:T294220|T294220]]
* 08:08 XioNoX: merge DNS changes to add drmrs
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 05:47 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,name=wtp1026.*
* 05:43 _joe_: pooling wtp1042 [[phab:T294212|T294212]]
* 05:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1109.eqiad.wmnet with OS buster
* 05:01 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1109.eqiad.wmnet with OS buster
* 04:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 (s8) for reimage [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17590 and previous config saved to /var/cache/conftool/dbconfig/20211025-043028-marostegui.json


== 2020-06-20 ==
== 2021-10-23 ==
* 22:56 cdanis@cumin2001: dbctl commit (dc=all): 'db1088 seems to have crashed', diff saved to https://phabricator.wikimedia.org/P11611 and previous config saved to /var/cache/conftool/dbconfig/20200620-225624-cdanis.json
* 16:40 dcausse: restarting blazegraph on wdqs1004 and wdqs1006 (free allocators alert)
* 07:42 elukey: powercycle an-worker1093 - bug soft lock up CPU showed in mgmt console
* 15:45 urbanecm: Start server-side upload for 1 video file ([[phab:T289781|T289781]]), testing whether [[phab:T291137|T291137]] is still an issue
* 07:36 elukey: powercycle an-worker1091 - bug soft lock up CPU showed in mgmt console


== 2020-06-19 ==
== 2021-10-22 ==
* 18:10 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump eventlogging_Test schema version to 1.1.0 to pick up client_dt - [[phab:T238230|T238230]] (duration: 00m 59s)
* 23:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:07 mutante: ganeti4003 - rebooting install4001 - trying to bootstrap OS install from install2003
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:47 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 20:57 bblack: re-pooling eqiad in DNS
* 15:28 godog: roll-restart kibana to apply new settings
* 20:54 legoktm: <XioNoX> I disabled the interface on cr1, going to re-enable the active one on cr2
* 13:01 moritzm: installing cups security updates (client side libs/tools)
* 20:48 legoktm: bblack has temporarily depooled eqiad https://gerrit.wikimedia.org/r/733043
* 12:31 qchris: Disabling puppet on gerrit1002 (test instance) to do some more testing
* 20:41 XioNoX: disable sessions to equinix eqiad IXP
* 12:14 godog: delete march indices from logstash 5 eqiad to free up space
* 19:17 urbanecm: Start server-side upload of 1 video file ([[phab:T294134|T294134]])
* 12:12 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:06 jbond: upload puppetboard_3.1.0-1_all.deb to bullseye-wikimedia
* 12:10 marostegui@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:42 ema: deployment-cache-upload06: restart varnish-frontend, package got upgraded to 6.0.8 [[phab:T294116|T294116]]
* 12:08 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 13:30 jbond: upload python3-pypuppetdb_2.4.0-1_all.deb to bullseye
* 12:07 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:46 jbond: upload cas_6.4.2-1+wmf10u1
* 12:06 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 12:05 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 10:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 11:39 marostegui: Reimage db2116 db2119 db2130
* 09:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294029|T294029]]
* 10:55 moritzm: installing mesa security updates
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2025.codfw.wmnet with OS buster
* 10:49 godog: close april logstash indices on logstash 5 eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 10:45 moritzm: installing tomcat8 security updates
* 08:27 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 10:38 jayme: imported chartmuseum_0.12.0-1 to buster-wikimedia
* 08:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 10:24 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1093', diff saved to https://phabricator.wikimedia.org/P11604 and previous config saved to /var/cache/conftool/dbconfig/20200619-102447-marostegui.json
* 08:23 ema: cp3062: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 10:21 godog: start closing logstash indices for 2020.03 in elastic 5 eqiad
* 08:00 ema: deployment-cache-text06: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 09:22 godog: restart elasticsearch on logstash1010
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17580 and previous config saved to /var/cache/conftool/dbconfig/20211022-055403-root.json
* 09:14 apergos: rsync from dumpsdata1003 as root to labstore1007 of dumps output files to catch up, with --bwlimit=160000 up from 80000
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17579 and previous config saved to /var/cache/conftool/dbconfig/20211022-053900-root.json
* 08:45 volans: backup netbox and run one-time script to reserve first IPs on all infra prefixes on Netbox - [[phab:T233183|T233183]]
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17578 and previous config saved to /var/cache/conftool/dbconfig/20211022-052356-root.json
* 08:45 godog: roll restart elasticsearch_5@production-logstash-eqiad
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17577 and previous config saved to /var/cache/conftool/dbconfig/20211022-050852-root.json
* 08:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17576 and previous config saved to /var/cache/conftool/dbconfig/20211022-045349-root.json
* 08:21 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 04:46 marostegui_: Deploy schema change on s8 codfw - [[phab:T291719|T291719]]
* 08:18 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 04:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17575 and previous config saved to /var/cache/conftool/dbconfig/20211022-043845-root.json
* 08:15 godog: roll-restart logstash elk5 for "JVM GC Old generation-s runs" alert
* 02:59 ejegg: updated payments-wiki from {{Gerrit|088a8cda1e}} to {{Gerrit|6e810fb401}}
* 08:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:59 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1093', diff saved to https://phabricator.wikimedia.org/P11601 and previous config saved to /var/cache/conftool/dbconfig/20200619-075907-marostegui.json
* 07:54 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:47 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:44 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P11600 and previous config saved to /var/cache/conftool/dbconfig/20200619-074420-marostegui.json
* 07:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:28 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:23 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:16 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:15 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:10 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:09 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:03 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:02 moritzm: rebooting ganeti nodes in eqiad for kernel security updates
* 06:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 06:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 06:47 moritzm: force reinstall of memcached 1.6 deb packages to ensure that the override is used in addition to the unmodified systemd unit from the deb [[phab:T233933|T233933]]
* 06:39 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:36 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:20 marostegui: Stop mysql on db2132 to reimage m1 codfw master - [[phab:T254556|T254556]]
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2075 db2111', diff saved to https://phabricator.wikimedia.org/P11599 and previous config saved to /var/cache/conftool/dbconfig/20200619-061922-marostegui.json
* 06:05 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:02 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:01 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:00 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1112', diff saved to https://phabricator.wikimedia.org/P11598 and previous config saved to /var/cache/conftool/dbconfig/20200619-055430-marostegui.json
* 05:41 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db2075 and db2111 for reimage', diff saved to https://phabricator.wikimedia.org/P11597 and previous config saved to /var/cache/conftool/dbconfig/20200619-054118-marostegui.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2108', diff saved to https://phabricator.wikimedia.org/P11596 and previous config saved to /var/cache/conftool/dbconfig/20200619-053402-marostegui.json
* 05:25 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:23 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 04:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2108 for reimage', diff saved to https://phabricator.wikimedia.org/P11595 and previous config saved to /var/cache/conftool/dbconfig/20200619-044440-marostegui.json
* 04:39 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1098:3316', diff saved to https://phabricator.wikimedia.org/P11594 and previous config saved to /var/cache/conftool/dbconfig/20200619-043956-marostegui.json
* 04:35 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P11593 and previous config saved to /var/cache/conftool/dbconfig/20200619-043554-marostegui.json


== 2020-06-18 ==
== 2021-10-21 ==
* 22:30 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on all wikis - [[phab:T249261|T249261]] (duration: 00m 56s)
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:14 volans: start check-homer-diff.service on cumin2001 after merging the fix r/606526
* 23:38 jforrester@deploy1002: Synchronized w/fatal-error.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 20:17 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on all wikis - [[phab:T249261|T249261]] (duration: 00m 57s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:44 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on group1 wikis - [[phab:T249261|T249261]] (duration: 00m 57s)
* 23:37 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 18:53 wkandek@cumin1001: conftool action : set/pooled=yes; selector: name=mw2339.codfw.wmnet
* 23:36 jforrester@deploy1002: Synchronized multiversion/: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 18:35 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:34 jforrester@deploy1002: Synchronized docroot/noc/conf/index.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 18:32 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:33 jforrester@deploy1002: Synchronized wmf-config: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 17:16 wkandek@cumin1001: conftool action : set/pooled=no; selector: name=mw2339.codfw.wmnet
* 23:32 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:14 wkandek@cumin1001: conftool action : set/pooled=yes; selector: name=mw2339.codfw.wmnet
* 23:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:12 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2339.codfw.wmnet
* 23:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:51 maryum: reindex suspended until deployment of code
* 23:25 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:730946{{!}}CommonSettings: Drop legacy CentralAuth config flag, never read (T277932)]] (duration: 00m 55s)
* 16:49 hnowlan: Shut off non-dockerised deployment-prep instance of changeprop
* 23:18 thcipriani@deploy1002: Synchronized tests/multiversion/StaticSettingsTest.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 16:15 maryum: reindexing French wiki in Elasticsearch
* 23:15 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 15:37 Reedy: created bot_passwords tables on officeiwki and otrs_wikiwiki [[phab:T254925|T254925]] [[phab:T246489|T246489]]
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:34 moritzm: installing harfbuzz security updates
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:23 moritzm: installing Ruby 2.1 security updates
* 22:42 mutante: [[phab:T294038|T294038]] [krb1001:~] $ sudo manage_principals.py create effeietsanders ... Principal successfully created.  . .Successfully sent email
* 15:15 moritzm: installing python-django security updates (packaged buster version)
* 21:44 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 47s)
* 15:04 moritzm: installing bind updates on jessie (client side tools/libs)
* 21:41 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 14:19 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1078', diff saved to https://phabricator.wikimedia.org/P11591 and previous config saved to /var/cache/conftool/dbconfig/20200618-141941-marostegui.json
* 20:54 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 13s)
* 14:14 moritzm: failover ganeti master in codfw to ganeti2021
* 20:53 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 14:03 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1078 for schema change', diff saved to https://phabricator.wikimedia.org/P11590 and previous config saved to /var/cache/conftool/dbconfig/20200618-140352-marostegui.json
* 20:53 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 35s)
* 14:02 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1075', diff saved to https://phabricator.wikimedia.org/P11589 and previous config saved to /var/cache/conftool/dbconfig/20200618-140203-marostegui.json
* 20:52 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 13:53 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:53 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:52 akosiaris: restart logstash2005 for applying an increased ganeti migration_downtime of 10k
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:43 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:34 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 19:42 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Update $wgTimelineFonts for new path to unifont in Shellbox container ([[phab:T293050|T293050]]) (duration: 00m 55s)
* 13:11 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 19:38 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 13:05 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 19:35 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 12:52 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1075 for schema change', diff saved to https://phabricator.wikimedia.org/P11586 and previous config saved to /var/cache/conftool/dbconfig/20200618-125216-marostegui.json
* 19:31 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es5 master as es1024 is fully repooled now', diff saved to https://phabricator.wikimedia.org/P11585 and previous config saved to /var/cache/conftool/dbconfig/20200618-124801-marostegui.json
* 19:23 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 12:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:10 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs (duration: 00m 23s)
* 12:20 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 19:09 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs
* 12:05 kormat: reimaging db1077 for final test [[phab:T251768|T251768]]
* 19:07 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: (no justification provided) (duration: 00m 08s)
* 11:51 jbond@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: (no justification provided) (duration: 01m 00s)
* 19:07 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: (no justification provided)
* 11:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:53 urbanecm: Deploy security patch for [[phab:T285116|T285116]] (wmf.4, wmf.5)
* 11:00 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 18:53 mutante: dumpsdata1003 - sudo systemctl reset-failed to clear Icinga alert about failed cleanup_tmpdumps.service
* 10:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:55 mutante: that's a key for https://www.worldcat.org/whatis/default.jsp btw for those wondering
* 10:34 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 17:53 mutante: citoid - replaced "wskey" for worldcat in private repo as requested on [[phab:T294010|T294010]] (is in 4 places, 3 for deployment_server/k8s and one remnant for scb)
* 09:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 17:53 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 09:48 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 17:52 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2076', diff saved to https://phabricator.wikimedia.org/P11583 and previous config saved to /var/cache/conftool/dbconfig/20200618-094001-marostegui.json
* 17:50 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 09:39 akosiaris: update wikifeeds to latest chart version in codfw
* 16:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:39 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:38 marostegui@cumin2001: dbctl commit (dc=all): 'Repool es2022', diff saved to https://phabricator.wikimedia.org/P11582 and previous config saved to /var/cache/conftool/dbconfig/20200618-093803-marostegui.json
* 16:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:38 akosiaris: uncordon kubernetes20<nowiki>{</nowiki>07..14<nowiki>}</nowiki> and kubernetes10<nowiki>{</nowiki>07..14<nowiki>}</nowiki>. Nodes are now fully put in rotation and ready to receive production traffic
* 16:12 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 09:34 marostegui: Deploy schema change on s3 codfw master (this will create lag on codfw) - [[phab:T250066|T250066]]
* 16:07 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 09:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 16:06 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (2/3) (duration: 00m 54s)
* 09:30 godog: temp stop logstash on elk7 to test 8 pipeline workers - [[phab:T255243|T255243]]
* 16:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:25 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 16:04 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 09:15 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 16:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 09:09 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:08 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 16:01 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 09:06 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 15:59 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (2/3) (duration: 00m 55s)
* 09:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:58 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (1/3) (duration: 00m 57s)
* 08:59 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool es1025', diff saved to https://phabricator.wikimedia.org/P11581 and previous config saved to /var/cache/conftool/dbconfig/20200618-085927-marostegui.json
* 15:43 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:21 robh@cumin1001: START - Cookbook sre.dns.netbox
* 08:50 ayounsi@cumin2001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1)
* 15:14 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 08:49 ayounsi@cumin2001: START - Cookbook sre.network.prepare-upgrade
* 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:49 ayounsi@cumin2001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 15:13 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 54s)
* 08:49 ayounsi@cumin2001: START - Cookbook sre.network.prepare-upgrade
* 15:12 Lucas_WMDE: my next message accidentally says 1/3 again but it’s 2/3, sorry
* 08:49 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool es1025', diff saved to https://phabricator.wikimedia.org/P11580 and previous config saved to /var/cache/conftool/dbconfig/20200618-084929-marostegui.json
* 15:11 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 08:47 marostegui@cumin2001: dbctl commit (dc=all): 'Depool es2022 for reimage', diff saved to https://phabricator.wikimedia.org/P11578 and previous config saved to /var/cache/conftool/dbconfig/20200618-084720-marostegui.json
* 15:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:56 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 08:40 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:42 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/Wikibase.default.php: Backport: [[gerrit:732666{{!}}Enable dispatching via jobs by default (T291828)]] (duration: 00m 55s)
* 08:37 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool es1025', diff saved to https://phabricator.wikimedia.org/P11577 and previous config saved to /var/cache/conftool/dbconfig/20200618-083749-marostegui.json
* 14:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:25 elukey: change archiva-ci password in archiva
* 14:39 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client/: Backport: [[gerrit:732674{{!}}Fix ExternalUserNames service wiring for local database]] (duration: 00m 57s)
* 08:24 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool es1025', diff saved to https://phabricator.wikimedia.org/P11576 and previous config saved to /var/cache/conftool/dbconfig/20200618-082432-marostegui.json
* 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:12 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:33 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 08:10 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 08:08 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 08:06 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 08:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 08:00 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:56 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 07:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:55 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 07:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 07:50 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 07:45 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:34 volans: uploaded spicerack_1.0.6 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 07:41 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:41 marostegui: Reimage es1025
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:35 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:04 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 07:34 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1136', diff saved to https://phabricator.wikimedia.org/P11574 and previous config saved to /var/cache/conftool/dbconfig/20200618-073414-marostegui.json
* 12:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 07:33 ayounsi@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 07:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 07:25 ayounsi@cumin2001: START - Cookbook sre.dns.netbox
* 12:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 07:24 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:48 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 07:22 moritzm: rolling reboot of ganeti servers in codfw
* 12:48 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 07:10 ayounsi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 07:07 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 12:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 04:50 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1136', diff saved to https://phabricator.wikimedia.org/P11573 and previous config saved to /var/cache/conftool/dbconfig/20200618-045047-marostegui.json
* 12:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 11:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:13 Lucas_WMDE: UTC morning backport+config window done
* 11:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294008|T294008]]
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 jgiannelos@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730848{{!}}Configure event stream for map tiles state change (T289771)]] (duration: 01m 04s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:48 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:47 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:14 jbond: merging refactor of P:base Gerrit:714975
* 09:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:49 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:56 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 01m 03s)
* 08:33 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:26 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:25 ema: cp3062: revert vsl_space experiment [[phab:T293879|T293879]]
* 08:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1004.eqiad.wmnet with OS bullseye
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17563 and previous config saved to /var/cache/conftool/dbconfig/20211021-080330-root.json
* 07:56 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite1004.eqiad.wmnet with OS bullseye
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17562 and previous config saved to /var/cache/conftool/dbconfig/20211021-074826-root.json
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17561 and previous config saved to /var/cache/conftool/dbconfig/20211021-073323-root.json
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17560 and previous config saved to /var/cache/conftool/dbconfig/20211021-071819-root.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17559 and previous config saved to /var/cache/conftool/dbconfig/20211021-070315-root.json
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17558 and previous config saved to /var/cache/conftool/dbconfig/20211021-064812-root.json
* 06:35 elukey: `systemctl reload nginx` on cloudelastic100[5,6] to pick up the new TLS certificate and clear alerts - [[phab:T293826|T293826]]
* 04:47 marostegui: Deploy schema change on s5 codfw - [[phab:T291719|T291719]]
* 04:37 marostegui: Deploy schema change on s6 codfw - [[phab:T291719|T291719]]
* 04:04 legoktm: restarted apache on lists1001 so it only uses new TLS cert ([[phab:T293826|T293826]])
* 03:29 eileen: civicrm revision changed from {{Gerrit|e889831012}} to {{Gerrit|733a8fceda}}, config revision is {{Gerrit|eed79486d5}}
* 00:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2020-06-17 ==
== 2021-10-20 ==
* 23:25 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0e7079d}}: Install DiscussionTools on all wikis (attempt 2) ([[phab:T252264|T252264]]; [[phab:T253943|T253943]]) (duration: 00m 56s)
* 23:56 thcipriani@deploy1002: Finished scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]] (duration: 11m 41s)
* 23:23 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/DiscussionTools/includes/Hooks.php: {{Gerrit|ff01083}}: Use $wgLocaltimezone global instead of request context ([[phab:T255704|T255704]]) (duration: 00m 57s)
* 23:44 thcipriani@deploy1002: Started scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]]
* 23:21 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/DiscussionTools/includes/Hooks.php: {{Gerrit|4551d29}}: Use $wgLocaltimezone global instead of request context ([[phab:T252264|T252264]]; [[phab:T253943|T253943]]; [[phab:T255704|T255704]]) (duration: 00m 58s)
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:01 ryankemper@deploy1001: Finished deploy [wdqs/wdqs@79fb82f]: 0.3.39 (duration: 14m 38s)
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:47 ryankemper@deploy1001: Started deploy [wdqs/wdqs@79fb82f]: 0.3.39
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:01 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 23:29 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: fawiki require login for creation of pages in the draft namespace [[phab:T291018|T291018]] (duration: 01m 02s)
* 20:32 hashar: Fixed up zuul-merger on contint1001 due to some faulty hotfix
* 23:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:08 hashar: Stopped zuul-merger on contint1001
* 23:27 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: fawiki require login to edit main namespace [[phab:T291018|T291018]] (duration: 01m 04s)
* 19:21 marostegui: Deploy schema change on s6 codfw master [[phab:T238966|T238966]]
* 22:13 dancy@deploy1002: Synchronized README: testing (4/4) (duration: 02m 52s)
* 19:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1094', diff saved to https://phabricator.wikimedia.org/P11572 and previous config saved to /var/cache/conftool/dbconfig/20200617-191723-marostegui.json
* 22:00 dancy@deploy1002: Synchronized README: testing (3/4) (duration: 02m 57s)
* 19:11 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 21:54 dancy@deploy1002: Synchronized README: testing (2) (duration: 01m 02s)
* 19:08 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:52 dancy@deploy1002: Synchronized README: (no justification provided) (duration: 01m 03s)
* 19:05 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 21:50 dancy: Testing a series of one-file scap sync-file runs
* 18:57 milimetric@deploy1001: Finished deploy [analytics/refinery@6640d6f] (thin): Quick fix for data quality bundles (THIN) (duration: 00m 10s)
* 21:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:57 milimetric@deploy1001: Started deploy [analytics/refinery@6640d6f] (thin): Quick fix for data quality bundles (THIN)
* 21:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:52 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b9cf996a38d82fdd67e600a5a951e88423957e8d}}: Promote Growth features out of darkmode on several wikis ([[phab:T291826|T291826]], [[phab:T255037|T255037]], [[phab:T287878|T287878]]) (duration: 01m 04s)
* 18:44 milimetric@deploy1001: Finished deploy [analytics/refinery@6640d6f]: Quick fix for data quality bundles (duration: 27m 55s)
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:41 Urbanecm: Morning B&C window done
* 20:38 eileen: civicrm revision changed from {{Gerrit|9b5e0d015b}} to {{Gerrit|e889831012}}, config revision is {{Gerrit|eed79486d5}}
* 18:31 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|96153f9}}: Add temporary logging for mediamoderation ([[phab:T247943|T247943]]) (duration: 00m 56s)
* 20:25 legoktm: uploaded php7.4 on buster to apt.wm.o ([[phab:T293449|T293449]])
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: REVERT: {{Gerrit|ae76450}}: Install DiscussionTools on all wikis ([[phab:T252264|T252264]]; [[phab:T253943|T253943]]) (duration: 00m 34s)
* 19:24 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluster config from old and new locations (duration: 00m 46s)
* 18:22 urbanecm@deploy1001: scap failed: average error rate on 3/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
* 19:24 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluster config from old and new locations
* 18:21 urbanecm@deploy1001: scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
* 19:09 mutante: disabling puppet on mw* for a minute to deploy a change
* 18:16 milimetric@deploy1001: Started deploy [analytics/refinery@6640d6f]: Quick fix for data quality bundles
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c9f6452}}: Set DiscussionToolsEnableVisual to true by default ([[phab:T251654|T251654]]) (duration: 00m 56s)
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:05 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:31 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:04 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 18:30 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 16:57 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on group0 wikis - [[phab:T249261|T249261]] (duration: 00m 56s)
* 18:24 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 16:00 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1094', diff saved to https://phabricator.wikimedia.org/P11571 and previous config saved to /var/cache/conftool/dbconfig/20200617-160013-marostegui.json
* 17:28 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org -  [[phab:T293810|T293810]]
* 15:28 godog: temp bump logstash7 workers to 8 and temp stop logstash - [[phab:T255243|T255243]]
* 17:27 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org
* 15:17 jforrester@deploy1001: Synchronized private/PrivateSettings.php: [[phab:T247943|T247943]] Add API key and recipient config for MediaModeration (duration: 00m 55s)
* 17:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:17 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2338.codfw.wmnet
* 17:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:11 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw233[5-9].codfw.wmnet
* 17:01 razzi@deploy1002: Finished deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f] (duration: 23m 42s)
* 15:11 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T247943|T247943]] Install MediaModeration extension - III: Install where enabled (duration: 00m 56s)
* 17:00 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client: Update deprecated calls to ParserOutput in ShortDescHandler - [[phab:T293860|T293860]] (duration: 01m 03s)
* 15:10 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2335.codfw.wmnet
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:09 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2336.codfw.wmnet
* 16:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:09 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2337.codfw.wmnet
* 16:53 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/LiquidThreads/pages/LqtDiscussionPager.php: Remove deprecated usage of setProperty - [[phab:T293895|T293895]] (duration: 01m 03s)
* 15:09 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2339.codfw.wmnet
* 16:49 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GeoCrumbs: Replace use of deprecated ParserOutput:getProperty() - [[phab:T293894|T293894]] (duration: 01m 09s)
* 15:08 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw233[5-9].codfw.wmnet
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:58 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/GrowthExperiments/modules/help/ext.growthExperiments.HelpPanelProcessDialog.js: [[phab:T255607|T255607]] Fix help panel sizing logic (duration: 00m 56s)
* 16:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:54 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 16:37 razzi@deploy1002: Started deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f]
* 14:52 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 16:36 razzi: deploy refinery change for https://phabricator.wikimedia.org/T287084
* 14:52 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:13 jbond: upload cas_6.4.2-1_amd64.deb
* 14:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 15:42 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:49 mdholloway: rolled back recommendation-api deployment due to canary endpoint check failure ([[phab:T255683|T255683]])
* 15:39 volans@cumin2002: START - Cookbook sre.dns.netbox
* 14:44 mholloway-shell@deploy1001: Finished deploy [recommendation-api/deploy@c39d567]: Update recommendation-api to {{Gerrit|db97742}} (duration: 01m 16s)
* 14:57 moritzm: installing modsecurity-crs security updates on Buster
* 14:43 mholloway-shell@deploy1001: Started deploy [recommendation-api/deploy@c39d567]: Update recommendation-api to {{Gerrit|db97742}}
* 14:48 moritzm: installing xmlgraphics-commons security updates on Buster
* 14:30 akosiaris: redrain kubernetes1007-14
* 14:46 moritzm: installing irssi security updates on Buster
* 14:27 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:27 mutante: disabling puppet on icinga to avoid alert spam when adding new appservers
* 14:35 moritzm: installing commons-io security updates on Buster
* 14:25 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:27 ema: cp3062: test higher vsl_space values [[phab:T293879|T293879]]
* 14:22 akosiaris: uncordon kubernetes10<nowiki>{</nowiki>07..14<nowiki>}</nowiki> again
* 14:27 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:13 mutante: generating new mcrouter certs for mw2335 - mw2339 ([[phab:T247021|T247021]])
* 14:12 moritzm: installing ruby2.3 security updates
* 14:02 mutante: rebooting mw2335 through mw2339 (not in service)
* 13:40 moritzm: installing apache2 security updates on buster
* 13:51 XioNoX: cleanup msw1-codfw interfaces
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:44 akosiaris: redrain kubernetes1007-14
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:37 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 13:21 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 01m 02s)
* 13:35 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 13:20 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 13:31 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on testwiki version 1.1.0 - [[phab:T249261|T249261]] (duration: 00m 58s)
* 13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 13:30 moritzm: upgrade remaining parsoid nodes to PHP 7.2.31
* 13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 13:21 jbond42: re-enable puppet on C:memcached nodes
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=ats-tls
* 13:04 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=varnish-fe
* 13:04 marostegui: The above db1129 depool was meant to be a repool, wrong commit message
* 12:51 ema: cp3062: bump vsl_space from 80M (default) to 512M [[phab:T293879|T293879]] - varnish restart needed
* 13:03 liw@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.37
* 12:37 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 13:03 jbond42: disable puppet on C:memcache to deploy a new change
* 12:36 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 13:02 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P11567 and previous config saved to /var/cache/conftool/dbconfig/20200617-130236-marostegui.json
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:02 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:00 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 12:02 urbanecm@deploy1002: Finished scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]]) (duration: 25m 19s)
* 13:00 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:00 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:00 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2007.codfw.wmnet
* 13:00 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2007.codfw.wmnet
* 13:00 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:37 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 11:37 urbanecm@deploy1002: Started scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]])
* 12:59 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2005.codfw.wmnet
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 11:21 moritzm: installing ffmpeg security updates
* 12:59 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e520fc57411bb19123766192cd636396ea6fc59d}}: GrowthExperiments: Add campaign pattern for enwiki ([[phab:T293699|T293699]]) (duration: 01m 22s)
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 11:11 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 12:59 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2005.codfw.wmnet
* 12:54 hnowlan: upgraded cpjobqueue to newer container image, rolled back
* 10:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 12:40 marostegui@cumin2001: dbctl commit (dc=all): 'Add db2091 to s8 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11566 and previous config saved to /var/cache/conftool/dbconfig/20200617-124034-marostegui.json
* 10:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 12:32 hnowlan: Removed remaining changeprop systemd components from scb
* 09:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 12:06 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db2076 to remove triggers from sanitarium [[phab:T238966|T238966]]', diff saved to https://phabricator.wikimedia.org/P11565 and previous config saved to /var/cache/conftool/dbconfig/20200617-120622-marostegui.json
* 09:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 11:59 Amir1: not today, just EU noon
* 09:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 11:59 Amir1: B&C is done for today
* 09:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 11:58 ladsgroup@deploy1001: Synchronized wmf-config/config/trwikisource.yaml: [[gerrit:605656{{!}}Change sidebar upload link destination for tr.wikisource (T253490)]] (duration: 01m 03s)
* 09:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 11:55 ladsgroup@deploy1001: Synchronized dblists/commonsuploads.dblist: [[gerrit:605656{{!}}Change sidebar upload link destination for tr.wikisource (T253490)]] (duration: 01m 04s)
* 09:04 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 11:48 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 08:50 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 11:47 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:605652{{!}}Add extended-confirmed group and restriction level for rowiki (T254471)]] (duration: 01m 04s)
* 08:50 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1025 for reimage, give weight to es1023 (es5 master)', diff saved to https://phabricator.wikimedia.org/P11563 and previous config saved to /var/cache/conftool/dbconfig/20200617-113026-marostegui.json
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 11:23 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/GrowthExperiments/extension.json: [[gerrit:606122{{!}}Fix NewcomerTask schema (T255597)]] (duration: 01m 04s)
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 11:18 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/GrowthExperiments/extension.json: [[gerrit:606121{{!}}Fix NewcomerTask schema (T255597)]] (duration: 01m 06s)
* 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS buster
* 11:07 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:606075{{!}}Set hiwiktionary timezone to Asia/Kolkata (T255531)]] (duration: 01m 05s)
* 07:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:48 marostegui@cumin2001: dbctl commit (dc=all): 'Remove db2091 from dbctl in s2 and s4', diff saved to https://phabricator.wikimedia.org/P11562 and previous config saved to /var/cache/conftool/dbconfig/20200617-104816-marostegui.json
* 06:49 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS buster
* 10:40 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 (s1) for reimage [[phab:T290865|T290865]]', diff saved to https://phabricator.wikimedia.org/P17552 and previous config saved to /var/cache/conftool/dbconfig/20211020-064529-marostegui.json
* 10:38 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS buster
* 10:31 liw@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.37 (duration: 01m 04s)
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 (s1) after upgrade', diff saved to https://phabricator.wikimedia.org/P17551 and previous config saved to /var/cache/conftool/dbconfig/20211020-063926-marostegui.json
* 10:30 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.37
* 06:35 marostegui: Upgrade db1106
* 09:44 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17550 and previous config saved to /var/cache/conftool/dbconfig/20211020-063431-marostegui.json
* 09:42 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:31 dcausse: restarting blazegraph on wdqs1012
* 09:40 hnowlan: killing stale changeprop instances running on scb hosts
* 06:28 elukey: reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage
* 09:16 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/Flow/: [[phab:T255608|T255608]] Revert 'Hooks: Use PageMoveComplete instead of TitleMoveCompleting' (duration: 01m 05s)
* 06:21 marostegui: Depool clouddb1013 for upgrade
* 09:15 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11558 and previous config saved to /var/cache/conftool/dbconfig/20200617-091509-marostegui.json
* 06:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS buster
* 09:11 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/includes/HookContainer/DeprecatedHooks.php: [[phab:T255608|T255608]] Revert 'Hard deprecate the  hook' (duration: 01m 05s)
* 06:12 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:02 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T247943|T247943]] Install MediaModeration extension - II: Add flag to IS (duration: 01m 05s)
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17549 and previous config saved to /var/cache/conftool/dbconfig/20211020-061202-marostegui.json
* 08:56 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:06 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:56 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 06:05 XioNoX: put transport link between ulsfo and eqsin in service - [[phab:T273308|T273308]]
* 08:52 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS buster
* 08:49 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 05:26 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 08:47 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11557 and previous config saved to /var/cache/conftool/dbconfig/20200617-084751-marostegui.json
* 04:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:44 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11556 and previous config saved to /var/cache/conftool/dbconfig/20200617-084402-marostegui.json
* 04:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:43 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/includes/EditPage.php: [[phab:T255177|T255177]] [[phab:T255614|T255614]] Do not return internal edit status from EditPage (duration: 01m 08s)
* 04:40 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable $wgLocalHTTPProxy on group0 wikis ([[phab:T288848|T288848]]) (duration: 01m 05s)
* 08:31 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11554 and previous config saved to /var/cache/conftool/dbconfig/20200617-083120-marostegui.json
* 01:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:30 godog: start logstash on logstash7 - [[phab:T255243|T255243]]
* 01:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 moritzm: prune nginx from remaining mw* servers in codfw [[phab:T255565|T255565]]
* 00:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:23 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:20 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 00:00 tgr: west coast evening deploys done
* 08:10 godog: stop logstash temporarily on logstash7 hosts to test increased es shards - [[phab:T255243|T255243]]
* 08:05 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1113:3315 db1113:3316', diff saved to https://phabricator.wikimedia.org/P11553 and previous config saved to /var/cache/conftool/dbconfig/20200617-080511-marostegui.json
* 07:53 elukey: reboot kafka-jumbo1009 for kernel upgrades
* 06:40 elukey: reboot krb1001 for kernel upgrades
* 06:24 elukey: reboot an-master100[1,2] for kernel upgrades
* 06:23 XioNoX: set lacp active on cr2-esams:ae2 - [[phab:T253970|T253970]]
* 06:15 tstarling@deploy1001: Synchronized wmf-config/PoolCounterSettings.php: test fast stale mode on testwiki [[phab:T250248|T250248]] (duration: 01m 17s)
* 06:03 elukey: reboot an-conf100[1-3] for kernel upgrades
* 05:45 elukey: reboot stat1007/8 for kernel upgrades
* 05:45 elukey: clean up old systemd timer config on an-coord1001 (came up after the last reboot)
* 05:42 volker-e@deploy1001: Finished deploy [design/style-guide@37c67dd]: Deploy design/style-guide:  (duration: 00m 05s)
* 05:42 volker-e@deploy1001: Started deploy [design/style-guide@37c67dd]: Deploy design/style-guide:
* 05:34 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11552 and previous config saved to /var/cache/conftool/dbconfig/20200617-053421-marostegui.json
* 05:29 marostegui: Deploy schema change on s7 codfw (lag will appear) - [[phab:T250066|T250066]]
* 05:28 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11551 and previous config saved to /var/cache/conftool/dbconfig/20200617-052809-marostegui.json
* 05:22 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11550 and previous config saved to /var/cache/conftool/dbconfig/20200617-052202-marostegui.json
* 05:19 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11549 and previous config saved to /var/cache/conftool/dbconfig/20200617-051916-marostegui.json
* 05:10 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:08 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 04:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1090:3312, db1090:3317 for reimage', diff saved to https://phabricator.wikimedia.org/P11548 and previous config saved to /var/cache/conftool/dbconfig/20200617-045105-marostegui.json
* 04:44 marostegui: Reload pt-kill on labsdb analytics host to pick up new config
* 04:38 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P11547 and previous config saved to /var/cache/conftool/dbconfig/20200617-043826-marostegui.json
* 01:43 shdubsh: restart elasticsearch on logstash1011


== 2020-06-16 ==
== 2021-10-19 ==
* 23:43 crusnov@deploy1001: Finished deploy [netbox/deploy@5251cf1]: Deploying Netbox to netbox-dev [[phab:T253140|T253140]] (duration: 00m 05s)
* 23:59 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732103{{!}}Reorder some wikis at wgExtraNamespaces and wmgVisualEditorAvailableNamespaces (T293846)]] (duration: 01m 02s)
* 23:43 crusnov@deploy1001: Started deploy [netbox/deploy@5251cf1]: Deploying Netbox to netbox-dev [[phab:T253140|T253140]]
* 23:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:35 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrus: update ML models for ko and zh, drop ja (duration: 01m 00s)
* 23:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:34 ebernhardson@deploy1001: sync-file aborted: cirrus: update ML models for ko and zh, drop ja (duration: 00m 04s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732053{{!}}ruwikiversity: Add 'portal' and 'faculty' namespaces (T293545)]] (duration: 01m 03s)
* 22:40 krinkle@deploy1001: Synchronized src/Noc/: (no justification provided) (duration: 01m 04s)
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:31 krinkle@deploy1001: Synchronized docroot/noc: (no justification provided) (duration: 01m 05s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:12 krinkle@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/WikimediaEvents/modules/: {{Gerrit|I67794c6c7192571}} (duration: 01m 04s)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:710565{{!}}Set the project namespace and sitename for Javanese Wikipedia and Wiktionary (T287437)]] (duration: 01m 02s)
* 20:42 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.35.0-wmf.37
* 23:23 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731953{{!}}Create Portal and Portal talk namespace for shiwiki (T288909)]] (duration: 01m 03s)
* 20:41 foks: reset email and pw for CactusJack
* 23:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:32 brennen: rolling 1.35.0-wmf.37 back to group0
* 23:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:29 mutante: signing puppet cert requests for releases1002 and releases2002 - [[phab:T255590|T255590]]
* 23:13 tgr@deploy1002: Synchronized static: Config: [[gerrit:731231{{!}}Repair the size of the logo of Kashmiri Wikipedia (T293342)]] (duration: 02m 14s)
* 19:24 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.37 (duration: 01m 04s)
* 21:34 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete {{!}} fixed Icinga alert: RECOVERY - Check for large files in client bucket on mwmaint1002 is OK: OK: [[phab:T165885|T165885]]
* 19:23 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.37
* 21:32 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete
* 19:18 otto@deploy1001: Started deploy [analytics/refinery@8b8ce6e]: deploying refinery source 0.0.127 for eventlogging -> eventgate migration - [[phab:T249261|T249261]]
* 20:56 ejegg: updated payments-wiki from {{Gerrit|0f48acea49}} to {{Gerrit|30e596903d}}
* 19:15 brennen@deploy1001: Synchronized php-1.35.0-wmf.37/skins/Vector/resources/skins.vector.styles/: [[gerrit:605975{{!}}Restore Watchlist star]] (duration: 01m 05s)
* 19:03 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 19:03 brennen: CORRECTION: holding _1.35.0-wmf.37_ deploy to group1 for a few minutes while merging & testing fix for [[phab:T255574|T255574]]
* 18:46 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/MediaSearch/: {{Gerrit|a84a675}}: {{Gerrit|3231578}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 19:01 brennen: holding 1.35.0-wmf.27 deploy to group1 for a few minutes while merging & testing fix for [[phab:T255574|T255574]]
* 18:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/: {{Gerrit|694580a}}: {{Gerrit|c02e301}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 18:59 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 18:52 qchris: Turning on puppet again on gerrit1002 to avoid having it lag too far behind.
* 18:30 foks: deleting 1 more email with deleteUserEmail.php
* 18:32 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1476a2d93}}: {{Gerrit|dd8393c1a0}}: foundationwiki: Restrict sensitive namespaces to editor group ([[phab:T205350|T205350]]) (duration: 01m 03s)
* 18:18 mutante: mw2293 - scap pull (because Icinga reports mismatched MW versions)
* 18:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 18:01 crusnov@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:12 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9a2893c7190e615a247674dbf7f87348bf43b91c}}: Enable topic subscriptions as a beta feature on all remaining projects ([[phab:T287802|T287802]]) (duration: 01m 04s)
* 17:55 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 18:00 legoktm@deploy1002: Synchronized wmf-config/: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (2/2) (duration: 01m 06s)
* 17:52 crusnov@cumin2001: START - Cookbook sre.ganeti.makevm
* 17:59 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (1/2) (duration: 01m 05s)
* 17:44 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@f4f5d7b]: airflow: adjust glent legal cutoff (duration: 01m 35s)
* 17:57 foks: removing six email addresses on request (with deleteUserEmail.php)
* 17:42 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@f4f5d7b]: airflow: adjust glent legal cutoff
* 17:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 17:32 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:25 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 17:03 herron: performing rolling reboots of kafka-main hosts for security updates [[phab:T254990|T254990]]
* 17:11 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 16:27 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 17:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:26 hnowlan: Updating changeprop to new container version with updated dependencies
* 16:48 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:07 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 16:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:04 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:02 elukey: reboot kafka-jumbo1008 for kernel upgrades
* 16:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 15:58 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 16:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 15:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1076', diff saved to https://phabricator.wikimedia.org/P11543 and previous config saved to /var/cache/conftool/dbconfig/20200616-154924-marostegui.json
* 16:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 15:45 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@7d4458c]: Reduce glent maximum yarn resource usage to reasonable levels (duration: 00m 41s)
* 16:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 15:44 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@7d4458c]: Reduce glent maximum yarn resource usage to reasonable levels
* 16:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 15:26 milimetric@deploy1001: Finished deploy [analytics/refinery@c652f62] (thin): Regular analytics weekly THIN train [analytics/refinery@c652f62] (duration: 00m 08s)
* 16:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 15:25 milimetric@deploy1001: Started deploy [analytics/refinery@c652f62] (thin): Regular analytics weekly THIN train [analytics/refinery@c652f62]
* 16:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 15:23 milimetric@deploy1001: Finished deploy [analytics/refinery@c652f62]: Regular analytics weekly train [analytics/refinery@c652f62] (duration: 07m 56s)
* 16:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 15:20 elukey: reboot kafka-jumbo1007 for kernel upgrades
* 15:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 15:15 moritzm: upgrading intel-microcode on jessie hosts
* 15:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 15:15 milimetric@deploy1001: Started deploy [analytics/refinery@c652f62]: Regular analytics weekly train [analytics/refinery@c652f62]
* 15:40 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - remove now redundant stream setting - [[phab:T277193|T277193]] (duration: 01m 04s)
* 15:06 elukey: reboot an-coord1001 for kernel upgrades
* 15:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 14:49 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 15:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 14:49 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 14:45 moritzm: rebooting scandium for kernel security update
* 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 14:45 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:32 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 14:43 cdanis: repool eqiad [[phab:T243080|T243080]]
* 15:32 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 14:40 papaul: power off ms-be2018 for BBU replacement
* 15:30 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:33 cdanis: eqiad router upgrades completed! 🎉 [[phab:T243080|T243080]]
* 15:28 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:33 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:31 elukey: reboot druid100[7,8] for kernel upgrades
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 14:28 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 14:25 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:34 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:22 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:29 jbond: disable puppet on lvs, cp, authdns, mc, mw-be and wcqs while I merge G:662699
* 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076', diff saved to https://phabricator.wikimedia.org/P11541 and previous config saved to /var/cache/conftool/dbconfig/20200616-141540-marostegui.json
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:14 cdanis: [[phab:T243080|T243080]] cdanis@re1.cr2-eqiad> request chassis routing-engine master switch
* 14:11 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 45m 13s)
* 14:11 moritzm: removing stray nginx packages from mw canaries (mw1261-mw1265 and mw1276-mw1283) [[phab:T255565|T255565]]
* 13:52 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:06 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 13:45 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:03 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 13:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:03 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:03 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 13:26 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 14:03 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17547 and previous config saved to /var/cache/conftool/dbconfig/20211019-131927-root.json
* 14:03 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17546 and previous config saved to /var/cache/conftool/dbconfig/20211019-131651-root.json
* 13:56 cdanis: [[phab:T243080|T243080]] cdanis@re0.cr2-eqiad> request chassis routing-engine master switch
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17545 and previous config saved to /var/cache/conftool/dbconfig/20211019-130424-root.json
* 13:50 cdanis: cr2-eqiad: rebooting RE1 [backup] with new junos version [[phab:T243080|T243080]]
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17544 and previous config saved to /var/cache/conftool/dbconfig/20211019-130147-root.json
* 13:39 cdanis: cr2-eqiad: disable transit/peering BGP & bump fr MED [[phab:T243080|T243080]]
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17543 and previous config saved to /var/cache/conftool/dbconfig/20211019-124920-root.json
* 13:32 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db2092 [[phab:T254462|T254462]]', diff saved to https://phabricator.wikimedia.org/P11535 and previous config saved to /var/cache/conftool/dbconfig/20200616-133241-marostegui.json
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17542 and previous config saved to /var/cache/conftool/dbconfig/20211019-124644-root.json
* 13:17 XioNoX: pfw3-eqiad rollback MED to cr1 to 0 - [[phab:T243080|T243080]]
* 12:40 moritzm: installing aftpd security updates
* 13:12 XioNoX: add graceful-switchover to cr1-eqiad
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17541 and previous config saved to /var/cache/conftool/dbconfig/20211019-123416-root.json
* 13:09 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 12:34 marostegui: Upgrade dbstore1003
* 13:08 liw@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.37
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17540 and previous config saved to /var/cache/conftool/dbconfig/20211019-123140-root.json
* 13:06 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17539 and previous config saved to /var/cache/conftool/dbconfig/20211019-121913-root.json
* 13:03 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17538 and previous config saved to /var/cache/conftool/dbconfig/20211019-121636-root.json
* 13:03 cdanis: [[phab:T243080|T243080]] cdanis@re1.cr1-eqiad> request chassis routing-engine master switch
* 12:12 XioNoX: push anycast tuning to all Lumen and NTT transit links - [[phab:T288843|T288843]]
* 13:03 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1167 (s8) after upgrade', diff saved to https://phabricator.wikimedia.org/P17537 and previous config saved to /var/cache/conftool/dbconfig/20211019-120918-marostegui.json
* 13:03 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17536 and previous config saved to /var/cache/conftool/dbconfig/20211019-120458-marostegui.json
* 13:03 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17535 and previous config saved to /var/cache/conftool/dbconfig/20211019-120409-root.json
* 13:01 moritzm: rebooting mw2291-mw2334
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17534 and previous config saved to /var/cache/conftool/dbconfig/20211019-120348-root.json
* 12:54 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 12:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/WikibaseMediaInfo/: {{Gerrit|ec0125770775c1a1a54c3b592d86d287fd9e3ad6}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 55s)
* 12:51 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17533 and previous config saved to /var/cache/conftool/dbconfig/20211019-120132-root.json
* 12:47 jbond42: upload new memcache package with TLS to component/memcached16 in buster-wikimedia
* 12:00 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikibaseMediaInfo/: {{Gerrit|79808a90a95dd5dac2b532b87fb7ec1a490ea0f0}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 56s)
* 12:42 XioNoX: pfw3-eqiad set MED to cr1 to 300 - [[phab:T243080|T243080]]
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17532 and previous config saved to /var/cache/conftool/dbconfig/20211019-120024-root.json
* 12:38 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:31 cdanis: [[phab:T243080|T243080]] cr1-eqiad: request chassis routing-engine master switch
* 11:56 XioNoX: push anycast tuning to Tele2, Init7, DT transit links - [[phab:T288843|T288843]]
* 12:31 cdanis: cr1-eqiad: request chassis routing-engine master switch
* 11:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:25 cdanis: cr1-eqiad: rebooting RE1 [backup] with new junos version [[phab:T243080|T243080]]
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17531 and previous config saved to /var/cache/conftool/dbconfig/20211019-114844-root.json
* 12:15 cdanis: cdanis@re0.cr1-eqiad# commit confirmed 2 comment "force VRRP failover [[phab:T243080|T243080]]"
* 11:46 marostegui: Upgrade db1105 (s1,s2)
* 12:14 cdanis: disable transit/peering & increase frack MED on cr1-eqiad [[phab:T243080|T243080]]
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105 (s1,s2) for upgrade', diff saved to https://phabricator.wikimedia.org/P17530 and previous config saved to /var/cache/conftool/dbconfig/20211019-114649-marostegui.json
* 12:09 hnowlan@cumin2001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17529 and previous config saved to /var/cache/conftool/dbconfig/20211019-114520-root.json
* 11:48 cdanis: depooling eqiad for router upgrade [[phab:T243080|T243080]]
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17527 and previous config saved to /var/cache/conftool/dbconfig/20211019-113340-root.json
* 11:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17526 and previous config saved to /var/cache/conftool/dbconfig/20211019-113017-root.json
* 11:42 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17525 and previous config saved to /var/cache/conftool/dbconfig/20211019-111837-root.json
* 11:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17524 and previous config saved to /var/cache/conftool/dbconfig/20211019-111513-root.json
* 11:42 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7c31b04e50101a60db7ae8acae64bc031f5e1007}}: DPL: Explicitly note it is not possible to enable DPL on any more wikis (duration: 00m 55s)
* 11:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17523 and previous config saved to /var/cache/conftool/dbconfig/20211019-110333-root.json
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17522 and previous config saved to /var/cache/conftool/dbconfig/20211019-110009-root.json
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:56 marostegui: Upgrade clouddb1021
* 11:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:51 moritzm: failover master in ganeti-test to ganeti2026
* 11:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:50 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS stretch
* 11:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17521 and previous config saved to /var/cache/conftool/dbconfig/20211019-104829-root.json
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:45 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 11:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:45 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17520 and previous config saved to /var/cache/conftool/dbconfig/20211019-104506-root.json
* 11:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:38 oblivian@deploy1002: Synchronized w/static.php: Config: [[gerrit:730182{{!}}static.php: Add support for /static/current rewrites (take 2) (T285232)]] (duration: 00m 55s)
* 11:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 11:40 hnowlan: roll-restarting restbase201[0-2] for cert updates
* 10:37 marostegui: Upgrade db1101 (s7,s8)
* 11:40 hnowlan@cumin2001: START - Cookbook sre.cassandra.roll-restart
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101 (s7,s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17519 and previous config saved to /var/cache/conftool/dbconfig/20211019-103634-marostegui.json
* 11:39 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:39 jmm@cumin1001: START - Cookbook sre.hosts.downtime
* 10:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 hnowlan@cumin2001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 10:29 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 11:35 elukey: reboot an-druid100[1,2] for kernel upgrades
* 10:28 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 11:27 hnowlan: roll-restart restbase2009 for cert update
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 11:26 hnowlan@cumin2001: START - Cookbook sre.cassandra.roll-restart
* 10:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:21 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 10:22 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 11:18 jforrester@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: [[phab:T32405|T32405]] [[phab:T254731|T254731]] Drop mobile special casing of main page for simplewiki, itwikisource, vecwikisource (duration: 01m 05s)
* 10:22 godog: flip mw statsd traffic with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/731918 - [[phab:T247963|T247963]]
* 11:15 moritzm: updating perf on stretch hosts
* 10:21 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 11:14 marostegui: Deploy MCR schema change on db2087:3316
* 10:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 moritzm: updating perf on buster
* 10:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS stretch
* 11:02 moritzm: rebooting mw2350-mw2376
* 10:16 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 11:01 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting wgActorTableSchemaMigrationStage, no longer read in core (duration: 01m 05s)
* 09:52 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 10:52 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting wgTagStatisticsNewTable, no longer read in core (duration: 01m 04s)
* 09:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 10:51 hnowlan: roll-restarting restbase101[6-8].eqiad.wmnet for cert updates
* 09:44 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.3 (duration: 01m 39s)
* 10:50 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart
* 09:42 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.2 (duration: 16m 06s)
* 10:44 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting wgChangeTagsSchemaMigrationStage, no longer read in core (duration: 01m 06s)
* 09:37 godog: move graphite/statsd writes to graphite2003 - [[phab:T247963|T247963]]
* 10:26 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting wgCommentTableSchemaMigrationStage, no longer read in core (duration: 01m 07s)
* 09:34 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:54 volans: restarting netbox to pickup modified customscripts
* 09:27 hashar: scap clean --delete 1.38.0-wmf.2 && scap clean --delete 1.38.0-wmf.3  # [[phab:T281169|T281169]]
* 09:14 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=thanos-swift,name=eqiad
* 09:27 hashar: Cloned and applied security patches for 1.38.0-wmf.5 # [[phab:T281169|T281169]]
* 08:53 godog: roll restart prometheus eqiad ops to enable thanos upload
* 09:19 marostegui: Stop slave on db2112 [[phab:T290865|T290865]]
* 08:48 marostegui: Upgrade db2132
* 09:18 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 08:44 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:18 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 08:42 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 09:03 XioNoX: push anycast tuning to all Telia transit links - [[phab:T288843|T288843]]
* 08:39 liw@deploy1001: Finished scap: testwikis wikis to 1.35.0-wmf.37 (duration: 59m 05s)
* 08:50 godog: point graphite.discovery.wmnet to graphite2003 - [[phab:T247963|T247963]]
* 08:19 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:40 XioNoX: push prep-work for anycast tuning to all sites - [[phab:T288843|T288843]]
* 08:19 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:33 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:19 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:33 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:19 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:32 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php hrwiki --fix
* 08:19 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:19 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift
* 08:18 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift-ro
* 08:18 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:03 XioNoX: push prep-work for anycast tuning in ulsfo (try 2) - [[phab:T288843|T288843]]
* 08:18 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:18 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 07:32 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:18 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:24 ema: A:cp start rolling varnish upgrades to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 08:18 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17517 and previous config saved to /var/cache/conftool/dbconfig/20211019-072111-root.json
* 08:09 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin2001 now on buster (take 3bis) (duration: 00m 12s)
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17516 and previous config saved to /var/cache/conftool/dbconfig/20211019-071519-root.json
* 08:09 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin2001 now on buster (take 3bis)
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17515 and previous config saved to /var/cache/conftool/dbconfig/20211019-070607-root.json
* 08:09 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin2001 now on buster (take 3) (duration: 01m 37s)
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17514 and previous config saved to /var/cache/conftool/dbconfig/20211019-070016-root.json
* 08:08 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17513 and previous config saved to /var/cache/conftool/dbconfig/20211019-065104-root.json
* 08:08 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17512 and previous config saved to /var/cache/conftool/dbconfig/20211019-064512-root.json
* 08:08 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 08:08 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17511 and previous config saved to /var/cache/conftool/dbconfig/20211019-063559-root.json
* 08:08 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17510 and previous config saved to /var/cache/conftool/dbconfig/20211019-063008-root.json
* 08:08 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17509 and previous config saved to /var/cache/conftool/dbconfig/20211019-062054-root.json
* 08:07 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin2001 now on buster (take 3)
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17508 and previous config saved to /var/cache/conftool/dbconfig/20211019-061505-root.json
* 07:59 volans@deploy1001: Finished deploy [homer/deploy@85e92b8]: Release v0.2.3 on cumin2001 now on buster (take 2) (duration: 00m 57s)
* 06:06 marostegui: Upgrade dbstore1005
* 07:58 volans@deploy1001: Started deploy [homer/deploy@85e92b8]: Release v0.2.3 on cumin2001 now on buster (take 2)
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17507 and previous config saved to /var/cache/conftool/dbconfig/20211019-060551-root.json
* 07:49 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:49 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 06:03 marostegui: Upgrade db1184, db1178
* 07:40 liw@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.37
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1178 for upgrade', diff saved to https://phabricator.wikimedia.org/P17506 and previous config saved to /var/cache/conftool/dbconfig/20211019-060123-marostegui.json
* 07:37 liw@deploy1001: Pruned MediaWiki: 1.35.0-wmf.35 (duration: 01m 47s)
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17505 and previous config saved to /var/cache/conftool/dbconfig/20211019-060001-root.json
* 07:31 liw@deploy1001: Pruned MediaWiki: 1.35.0-wmf.34 (duration: 11m 52s)
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1184 for upgrade', diff saved to https://phabricator.wikimedia.org/P17504 and previous config saved to /var/cache/conftool/dbconfig/20211019-055429-marostegui.json
* 07:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 07:08 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 05:46 marostegui: Reimage db2112 (s1 codfw master) [[phab:T290865|T290865]]
* 07:07 liw: 1.35.0-wmf.37 was branched at {{Gerrit|f856960f17b2a477640c5d848926c04f0d56196c}} for [[phab:T254174|T254174]]
* 04:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1148', diff saved to https://phabricator.wikimedia.org/P11526 and previous config saved to /var/cache/conftool/dbconfig/20200616-070651-marostegui.json
* 03:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1148', diff saved to https://phabricator.wikimedia.org/P11525 and previous config saved to /var/cache/conftool/dbconfig/20200616-070450-marostegui.json
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084', diff saved to https://phabricator.wikimedia.org/P11524 and previous config saved to /var/cache/conftool/dbconfig/20200616-070429-marostegui.json
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084', diff saved to https://phabricator.wikimedia.org/P11523 and previous config saved to /var/cache/conftool/dbconfig/20200616-070209-marostegui.json
* 02:21 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:57 marostegui: Compress InnoDB on db1134 [[phab:T254462|T254462]]
* 02:18 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 06:56 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1134 for InnoDB compression [[phab:T254462|T254462]]', diff saved to https://phabricator.wikimedia.org/P11522 and previous config saved to /var/cache/conftool/dbconfig/20200616-065600-marostegui.json
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1093', diff saved to https://phabricator.wikimedia.org/P11521 and previous config saved to /var/cache/conftool/dbconfig/20200616-065412-marostegui.json
* 00:38 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:40 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 06:25 elukey: roll restart memcached on mc-gp* (gutter pools) to pick up new slab size distribution setting - [[phab:T252391|T252391]]
* 06:04 hashar: Restarted Zuul scheduler and merger on contint2001 with a couple of hotfixes # [[phab:T252310|T252310]] [[phab:T255424|T255424]]
* 05:54 volker-e@deploy1001: Finished deploy [design/style-guide@37c67dd]: Deploy design/style-guide:  (duration: 00m 05s)
* 05:54 volker-e@deploy1001: Started deploy [design/style-guide@37c67dd]: Deploy design/style-guide:
* 05:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 04:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P11520 and previous config saved to /var/cache/conftool/dbconfig/20200616-045958-marostegui.json
* 04:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146:3314', diff saved to https://phabricator.wikimedia.org/P11519 and previous config saved to /var/cache/conftool/dbconfig/20200616-045744-marostegui.json
* 04:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1147', diff saved to https://phabricator.wikimedia.org/P11518 and previous config saved to /var/cache/conftool/dbconfig/20200616-045636-marostegui.json
* 04:55 marostegui: Deploy schema change on db1147
* 04:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1147', diff saved to https://phabricator.wikimedia.org/P11517 and previous config saved to /var/cache/conftool/dbconfig/20200616-045451-marostegui.json
* 04:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1149', diff saved to https://phabricator.wikimedia.org/P11516 and previous config saved to /var/cache/conftool/dbconfig/20200616-044612-marostegui.json
* 04:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1149', diff saved to https://phabricator.wikimedia.org/P11515 and previous config saved to /var/cache/conftool/dbconfig/20200616-044409-marostegui.json
* 04:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1143', diff saved to https://phabricator.wikimedia.org/P11514 and previous config saved to /var/cache/conftool/dbconfig/20200616-044326-marostegui.json
* 04:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1143', diff saved to https://phabricator.wikimedia.org/P11513 and previous config saved to /var/cache/conftool/dbconfig/20200616-044126-marostegui.json
* 04:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1138', diff saved to https://phabricator.wikimedia.org/P11512 and previous config saved to /var/cache/conftool/dbconfig/20200616-044036-marostegui.json
* 04:37 marostegui: Deploy schema change on db1138
* 04:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1138', diff saved to https://phabricator.wikimedia.org/P11511 and previous config saved to /var/cache/conftool/dbconfig/20200616-043748-marostegui.json
* 00:28 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: limit HTTP client timeout [[phab:T245170|T245170]] (duration: 00m 56s)
* 00:25 tstarling@deploy1001: Synchronized wmf-config/set-time-limit.php: expose excimer timeout as a global variable [[phab:T245170|T245170]] (duration: 00m 56s)
* 00:16 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@17212bb]: airflow: migrate leven-dist to edit-dist (duration: 00m 45s)
* 00:16 volker-e@deploy1001: Finished deploy [design/style-guide@37c67dd]: Deploy design/style-guide:  (duration: 00m 04s)
* 00:16 volker-e@deploy1001: Started deploy [design/style-guide@37c67dd]: Deploy design/style-guide:
* 00:16 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@17212bb]: airflow: migrate leven-dist to edit-dist


== 2020-06-15 ==
== 2021-10-18 ==
* 23:56 tstarling@deploy1001: Synchronized wmf-config/PoolCounterSettings.php: reducing connect timeout per [[phab:T105378|T105378]] (duration: 01m 00s)
* 23:40 hoo: Updated the Wikidata property suggester with data from the 2021-10-04 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 23:31 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@eb0ac12]: Ship templated table names in HivePartitionRangeSensor (duration: 00m 49s)
* 23:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b654980240d51fff3c6e9c48f7076d4609c2560f}}: Create an alias for the Draft namespace on hrwiki ([[phab:T291755|T291755]]) (duration: 00m 56s)
* 23:30 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@eb0ac12]: Ship templated table names in HivePartitionRangeSensor
* 23:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:58 krinkle@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: {{Gerrit|If7e1613cbcf8}} (duration: 00m 56s)
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:57 krinkle@deploy1001: Synchronized wmf-config/profiler.php: {{Gerrit|If7e1613cbcf8}} (duration: 00m 59s)
* 23:12 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=thwiktionary --fix # [[phab:T291761|T291761]]
* 22:02 bstorm_: downtimed puppet alerts for testing some changes on labstore1004/5
* 23:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|abe777d28594da852e49ccb1c1597b2598f3e483}}: Create Rhymes namespace for thwiktionary ([[phab:T291761|T291761]]) (duration: 00m 57s)
* 20:59 ebernhardson@deploy1001: Finished deploy [search/airflow@62a024b]: Add pydruid to airflow (duration: 00m 50s)
* 23:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:58 ebernhardson@deploy1001: Started deploy [search/airflow@62a024b]: Add pydruid to airflow
* 23:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:55 shdubsh: update mtail to 3.0.0~rc35 on the rest of the hosts - eqiad and esams
* 22:56 legoktm@deploy1002: Synchronized php-1.38.0-wmf.4/includes/http/MWHttpRequest.php: Allow using a reverse proxy for local HTTP requests ([[phab:T288848|T288848]]) (duration: 00m 56s)
* 20:44 shdubsh: update mtail to 3.0.0~rc35 on cp nodes in eqiad and esams
* 22:06 maryum: deployed security patch for [[phab:T293589|T293589]]
* 20:30 shdubsh: update mtail to 3.0.0~rc35 on wtp in eqiad
* 21:23 maryum: deployed security patch for [[phab:T293556|T293556]]
* 19:35 shdubsh: update mtail to 3.0.0~rc35 on mw in eqiad
* 21:05 mutante: mwmaint1002 - sudo -u www-data /usr/local/bin/mw-cli-wrapper /usr/local/bin/mwscript extensions/TranslationNotifications/scripts/DigestEmailer.php --wiki mediawikiwiki {{!}} Fatal error: Uncaught Error: Class 'MediaWiki\MediaWikiServices' not found
* 18:50 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@41186c8]: port glent from oozie to airflow (duration: 00m 39s)
* 20:58 mutante: mwmaint1002 - attempt to start mediawiki_job_translationnotifications-mediawikiwiki which was alerting as failed
* 18:50 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@41186c8]: port glent from oozie to airflow
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:28 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:605584]] [[phab:T254315|T254315]] test wikidata: Use the database name in the Wikibase entity source config (duration: 00m 58s)
* 20:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:56 krinkle@deploy1001: Synchronized wmf-config: {{Gerrit|I7721f4018b07dac}} (duration: 00m 58s)
* 19:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:55 krinkle@deploy1001: Synchronized wmf-config/ProductionServices.php: {{Gerrit|I7721f4018b07dac}} (duration: 00m 57s)
* 19:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:52 krinkle@deploy1001: Synchronized lib/: {{Gerrit|I7721f4018b07dac}} (duration: 00m 58s)
* 19:29 mutante: LDAP: removed non-existent user gerrit2 from group labsadminbots ([[phab:T160122|T160122]])
* 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1142', diff saved to https://phabricator.wikimedia.org/P11504 and previous config saved to /var/cache/conftool/dbconfig/20200615-153825-marostegui.json
* 19:29 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/resources/store/state.js: {{Gerrit|ac7b4fc2ccc69589e00a42f49d18a8f6d71777f2}}: Revert 727328 ([[phab:T293554|T293554]]) (duration: 00m 56s)
* 15:37 marostegui: Deploy schema change on db1142
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1142', diff saved to https://phabricator.wikimedia.org/P11503 and previous config saved to /var/cache/conftool/dbconfig/20200615-153630-marostegui.json
* 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1141', diff saved to https://phabricator.wikimedia.org/P11502 and previous config saved to /var/cache/conftool/dbconfig/20200615-153546-marostegui.json
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1141', diff saved to https://phabricator.wikimedia.org/P11501 and previous config saved to /var/cache/conftool/dbconfig/20200615-153344-marostegui.json
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:16 moritzm: upgrading wtp1025-wtp1027 to PHP 7.2.31
* 18:45 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Convert $wgEventStreams to be an associative array - [[phab:T277193|T277193]] (duration: 00m 57s)
* 15:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1121', diff saved to https://phabricator.wikimedia.org/P11499 and previous config saved to /var/cache/conftool/dbconfig/20200615-150908-marostegui.json
* 18:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:07 marostegui: Deploy schema change on db1121 (and labs)
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P11498 and previous config saved to /var/cache/conftool/dbconfig/20200615-150639-marostegui.json
* 18:07 mutante: gerrit - removed tonina from wmde-mediawiki gerrit group ([[phab:T293621|T293621]])
* 15:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P11497 and previous config saved to /var/cache/conftool/dbconfig/20200615-150148-marostegui.json
* 17:51 mutante: puppet run on all bastion hosts via cumin
* 15:00 marostegui: Deploy schema change on db1144:3314
* 15:32 mvernon@cumin2002: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 14:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3314', diff saved to https://phabricator.wikimedia.org/P11496 and previous config saved to /var/cache/conftool/dbconfig/20200615-145914-marostegui.json
* 15:32 mvernon@cumin2002: START - Cookbook sre.discovery.service-route
* 14:55 XioNoX: delete VCP from msw1-codfw
* 15:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 14:24 marostegui: Deploy schema change on db2107 (s2 codfw master) - [[phab:T250066|T250066]]
* 15:23 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 14:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:16 herron: reprepro copied anycast-healthchecker, python3-json-logger and python3-anycast-healthchecker from buster-wikimedia to bullseye-wikimedia [[phab:T292196|T292196]]
* 14:11 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 14:09 elukey@cumin2001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)
* 15:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 13:54 marostegui: Deploy schema change on db1100 (s5 master) - [[phab:T250066|T250066]]
* 14:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 13:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 13:49 marostegui: Upgrade db2133
* 14:54 herron: rebuilt and uploaded kafkatee for bullseye [[phab:T292196|T292196]]
* 13:49 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:50 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:44 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:36 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731346{{!}}[beta] Rename $wgIPInfoGeoIP2Path to $wgIPInfoGeoIP2Prefix (T289361)]] (duration: 00m 56s)
* 13:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:40 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:38 elukey@cumin2001: START - Cookbook sre.hadoop.roll-restart-workers
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:31 volans@deploy1001: Finished deploy [homer/deploy@ac7a4c6]: Release v0.2.3 on cumin2001 now on buster (duration: 01m 15s)
* 14:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:30 moritzm: rolling reboot on the ganeti cluster in esams (for kernel security updates and to pick up the network changes to provide instances with a public IP)
* 13:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:30 volans@deploy1001: Started deploy [homer/deploy@ac7a4c6]: Release v0.2.3 on cumin2001 now on buster
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:26 hashar: Started zuul-merger on contint1001 with newer virtualenv # [[phab:T255424|T255424]]
* 13:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (2/2) (duration: 00m 56s)
* 13:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:47 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (1/2) (duration: 00m 56s)
* 13:21 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=eqiad
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:20 hashar: Stopping zuul-merger on contint1001 to rebuild the virtualenv # [[phab:T255424|T255424]]
* 13:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:731014{{!}}Unconditionally enable Wikibase dispatching via jobs (T291828)]] (duration: 00m 56s)
* 13:19 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2079.codfw.wmnet with OS buster
* 12:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2091:3312, db2091:3314 - [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11495 and previous config saved to /var/cache/conftool/dbconfig/20200615-125856-marostegui.json
* 12:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:58 vgutierrez: upgrade acme-chief to version 0.26
* 12:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:57 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:55 Lucas_WMDE: UTC morning backport window done
* 12:46 vgutierrez: upload acme-chief 0.26 to apt.wm.o (buster) - [[phab:T255249|T255249]]
* 11:55 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (2/2) (duration: 00m 56s)
* 12:43 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:54 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (1/2) (duration: 00m 56s)
* 12:38 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:34 moritzm: rolling reboot on the ganeti cluster in eqsin (for security updates and to pick up the network changes to provide instances with a public IP)
* 11:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2079.codfw.wmnet with OS buster
* 12:12 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:11 marostegui: Upgrade db2134
* 11:49 marostegui: Reimage db2079 (codfw s8 master) [[phab:T290868|T290868]]
* 12:09 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730747{{!}}Set dispatchViaJobsAllowedClients to null everywhere (T291828)]] (duration: 00m 56s)
* 12:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:57 moritzm: reimaging sretest1002 to validate the reimage script on Buster
* 11:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731239{{!}}Make deduplication actually work for DispatchChangesJob (T291118)]] (duration: 00m 55s)
* 11:43 marostegui: Reimage dbproxy2003 which points to m3-master.codfw.wmnet (not in use) - [[phab:T255408|T255408]]
* 11:10 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/Hooks/RecentChangeSaveHookHandler.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (2/2) (duration: 00m 56s)
* 11:40 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:605543{{!}}GrowthExperiments: Switch on guidance feature (T239181)]] (duration: 00m 57s)
* 11:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (duration: 00m 56s)
* 11:10 sukhe@cumin1001: START - Cookbook sre.hosts.downtime
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 hnowlan: regenerated certificates for restbase2009, restbase101[678], restbase201[012]. Did not roll-restart yet
* 10:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:47 moritzm: copied wmf-certificates from buster-wikimedia to stretch-wikimedia in reprepro
* 11:03 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:38 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:731237{{!}}Don't filter by change Id when dispatching to client wikis ()]] (duration: 00m 59s)
* 11:03 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:48 moritzm: installing node-tar security updates on buster
* 10:54 moritzm: imported python-phabricator 0.7.0-2~wmf2 to apt.wikimedia.org/buster-wikimedia [[phab:T245114|T245114]]
* 09:39 vgutierrez: updating acme-chief to version 0.34 on acmechief instances - [[phab:T292619|T292619]]
* 10:39 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:605553{{!}} Bumping portals to master (605553)]] (duration: 00m 58s)
* 09:38 godog: sync metrics from graphite1004 to graphite2003 - [[phab:T247963|T247963]]
* 10:38 hnowlan: regenerated restbase2009's cassandra certificates
* 09:13 moritzm: installing apr security updates on bullseye
* 10:38 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:605553{{!}} Bumping portals to master (605553)]] (duration: 00m 58s)
* 08:57 godog: cleanup graphite metrics not modified for >= ~3yr (1024 days)
* 10:16 jmm@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97)
* 07:34 ema: cp3060 (text), cp3061 (upload): upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 10:16 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 07:34 elukey: depool + restart blazegraph on wdqs1013
* 10:12 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254820|T254820]] [enwikivoyage] Undeploy the Listings extension (duration: 01m 00s)
* 07:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:31 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 06:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:53 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:50 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 09:46 godog: run logstash benchmark on logstash1023
* 09:42 volans: deploying esams mgmt DNS records automatically generated by Netbox ( operations/dns/+/604136/ ) - [[phab:T233183|T233183]]
* 09:41 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:35 volans@cumin1001: START - Cookbook sre.dns.netbox
* 09:29 elukey: update analytics-in4/6 filters on cr1-cr2 eqiad to update the Druid term (new nodes added)
* 09:21 jbond42: offlining puppetmaster1003 and 2003 for reboot
* 09:17 XioNoX: reduce ae device-count from 10 to 3 on asw2-a/b/c-eqiad
* 09:14 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:11 jmm@cumin1001: START - Cookbook sre.hosts.downtime
* 08:55 marostegui: Deploy schema change on db2123 (s5 codfw master) - [[phab:T250066|T250066]]
* 08:50 kart_: Updated cxserver to 2020-06-10-044445-production ([[phab:T246319|T246319]], [[phab:T254959|T254959]])
* 08:46 kartik@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 08:42 kartik@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 08:39 kartik@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 08:34 moritzm: reimaging cumin2001 [[phab:T245114|T245114]]
* 08:22 marostegui: Switchover m3-master from dbproxy1008 to dbproxy1016 - [[phab:T202367|T202367]]
* 08:17 marostegui: Deploy schema change on db1131 (s6 master) - [[phab:T250066|T250066]]
* 08:09 moritzm: installing libexif security updates
* 07:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:46 XioNoX: standardize ae device-count on all routers
* 07:36 XioNoX: push new pfw firewall policies - [[phab:T255185|T255185]]
* 07:28 marostegui: Deploy schema change on db1093
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 for schema change', diff saved to https://phabricator.wikimedia.org/P11492 and previous config saved to /var/cache/conftool/dbconfig/20200615-072835-marostegui.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2092', diff saved to https://phabricator.wikimedia.org/P11491 and previous config saved to /var/cache/conftool/dbconfig/20200615-072742-marostegui.json
* 06:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime


== 2020-06-14 ==
== 2021-10-16 ==
* 13:51 qchris: Disabling puppet on gerrit1002 (test instance) to do some more upgrade testing
* 03:56 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 01:30 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)


== 2020-06-13 ==
== 2021-10-15 ==
* 21:12 qchris: Enabling puppet on gerrit1002 (test instance). Done with testing for today.
* 23:48 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 12:51 herron: restarted logstash service on logstash1007, logstash1009
* 23:27 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 12:34 qchris: Disabling puppet on gerrit1002 (test instance) to do some more upgrade testing
* 23:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 12:33 godog: bounce logstash on logstash1008, GC death
* 22:38 mutante: apt1001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 22:36 mutante: apt2001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 22:34 mutante: apt2001 - upgraded nginx
* 22:18 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:14 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:05 dpifke@deploy1002: Finished deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes (duration: 00m 05s)
* 22:05 dpifke@deploy1002: Started deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes
* 21:51 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:44 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:36 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:09 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 17:20 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 17:17 mutante: gitlab1001 - disabling puppet for debugging
* 17:05 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold - [[phab:T283076|T283076]]
* 17:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:50 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold
* 16:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:44 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:48 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:15 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:30 elukey: start topic rebalancing for kafka main-eqiad (long maintenance, it will last a couple of days)
* 13:24 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:21 vgutierrez: updating acme-chief to version 0.34 on acmechief-test instances - [[phab:T292619|T292619]]
* 13:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:14 vgutierrez: upload acme-chief 0.34 to apt.wikimedia.org (buster) - [[phab:T292619|T292619]]
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2007.codfw.wmnet
* 11:45 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:33 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2007.codfw.wmnet
* 11:14 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 09:15 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 09:06 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 08:58 jelto: jelto@gitlab1001:~$ sudo disable-puppet "disable puppet on gitlab1001 to test 728380 on GitLab replica - [[phab:T283076|T283076]]"
* 07:41 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:20 urbanecm: Start server-side upload for 1 video file
* 02:14 ryankemper: [[phab:T288231|T288231]] `wdqs2006` data transfer complete and all tests passing on the host. All of `codfw wdqs-internal` is on the new streaming updater
* 00:09 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:07 brennen: end of UTC late backport & config training window


== 2020-06-12 ==
== 2021-10-14 ==
* 17:44 herron: restarting logstash1011 elasticsearch instance
* 23:59 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 16:49 elukey: restart php-fpm and pool mw1384 - [[phab:T255282|T255282]]
* 23:58 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 16:33 elukey: (correct) depool again mw1384 - investigation will follow up in a task
* 23:56 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 56s)
* 16:32 elukey: depool again mw1348 - investigation will follow up in a task
* 23:49 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 15:49 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:48 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 15:44 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 23:46 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 56s)
* 15:40 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 23:43 ejegg: updated payments-wiki from {{Gerrit|19d18c1852}} to {{Gerrit|0f48acea49}}
* 15:40 pt1979@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:34 cjming@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikimediaEvents/includes/VectorPrefDiffInstrumentation.php: Backport: [[gerrit:730733{{!}}Change VectorPrefDiffInstrumentation stream name to `mediawiki.skin_diff` (T289622)]] (duration: 00m 56s)
* 15:37 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 23:24 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730936{{!}}allow sysops to add and remove users to other groups on ptwikivoyage (T292806)]] (duration: 00m 56s)
* 15:36 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 23:21 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 15:27 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:11 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730933{{!}}Add americanantiquarian.org to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T292918)]] (duration: 00m 57s)
* 15:25 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 23:11 mutante: mw1452 - re-pooled, scap pull
* 15:24 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:09 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:24 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:24 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:35 ryankemper: [[phab:T288231|T288231]] Ran puppet on `wdqs2006`, now back to the cookbook run
* 15:24 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:33 ryankemper: [[phab:T288231|T288231]] Forgot about running puppet-agent on `wdqs2006`; aborted cookbook run
* 15:22 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:33 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 15:22 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:33 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:32 ryankemper: [[phab:T288231|T288231]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/730795; proceeding to data-transfer on `wdqs2006`: `sudo rm -fv /srv/wdqs/data_loaded` on `wdqs2006` followed by `ryankemper@cumin1001:~$ sudo cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "streaming updater cutover for wdqs2005" --blazegraph_instance blazegraph --task-id [[phab:T288231|T288231]]`
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:31 mutante: depooling mw1452 for testing
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:28 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo pool`: transfer completed successfully; tests passing on host (used `ssh -L 9999:localhost:80 wdqs2005.codfw.wmnet` to establish tunnel)
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:23 dpifke@deploy1002: Finished deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]] (duration: 00m 05s)
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:23 dpifke@deploy1002: Started deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]]
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:17 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 15:22 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 22:07 eileen: civicrm revision changed from {{Gerrit|018d3b19fe}} to {{Gerrit|9b5e0d015b}}, config revision is {{Gerrit|781d6a1b1f}}
* 14:51 elukey: repool mw1384 as test
* 21:34 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 21:25 robh@cumin1001: START - Cookbook sre.dns.netbox
* 14:30 akosiaris: bump cpu limits for changeprop another 50%
* 21:10 robh@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:30 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 21:06 robh@cumin1001: START - Cookbook sre.dns.netbox
* 13:36 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 19:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.4 refs [[phab:T281168|T281168]]
* 13:34 akosiaris: update changeprop in eqiad+codfw for higher CPU limits
* 19:23 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 13:34 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 19:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 13:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1088 after schema change', diff saved to https://phabricator.wikimedia.org/P11483 and previous config saved to /var/cache/conftool/dbconfig/20200612-131205-marostegui.json
* 18:53 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1088 for schema change', diff saved to https://phabricator.wikimedia.org/P11482 and previous config saved to /var/cache/conftool/dbconfig/20200612-124015-marostegui.json
* 18:53 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=dagwiki --fix
* 12:18 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)
* 18:47 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=frwiktionary --logwiki=metawiki 'TURK FASTER' 'ARTHUR MORGAN'
* 11:52 filippo@cumin1001: START - Cookbook sre.hosts.upgrade-and-reboot
* 18:42 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'George Dum Fulton' 'George Fulton' # [[phab:T293403|T293403]]
* 11:23 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:41 urbanecm: UTC evening B&C done
* 11:19 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 18:40 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/extension.json: {{Gerrit|6da3523daaba85a4199721980c0a9c96b20697e7}}: Fix assessment quickview labels ([[phab:T292596|T292596]]) (duration: 01m 03s)
* 11:15 moritzm: failover ganeti master in ulsfo to ganeti4003
* 18:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c8dffefd0d095abe3709dcc962d5d24f27b55869}}: Create Salima namespace for dagwiki ([[phab:T289911|T289911]]) (duration: 01m 04s)
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2080 and db2084 into s8 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11481 and previous config saved to /var/cache/conftool/dbconfig/20200612-111422-marostegui.json
* 18:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 11:11 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bccd4bc45498db8628567574d0bb3a23f8fb378}}: Add $wgSitename and $wgMetaNamespace for kswiki and kswiktionary ([[phab:T289752|T289752]], [[phab:T289767|T289767]]) (duration: 01m 04s)
* 11:07 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 18:17 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 11:02 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|262e588b44f126fb9e1aa933a3ca59b191b42bd7}}: Enable Growth mentor dashboard backend on all wikis ([[phab:T278920|T278920]]) (duration: 01m 05s)
* 10:58 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|41baa8c41d64510986f009b9be2d70dad0915f8c}}: Add new mediawiki.skin_diff event logging stream ([[phab:T289622|T289622]]) (duration: 01m 05s)
* 10:39 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:03 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 10:36 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 18:02 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 10:33 moritzm: rolling restart of the ulsfo ganeti cluster
* 18:01 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 10:21 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)
* 17:54 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 10:02 filippo@cumin1001: START - Cookbook sre.hosts.upgrade-and-reboot
* 17:52 rzl: repooled mw1452 (with `sudo pool` so no auto log from conftool)
* 10:01 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:47 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 10:01 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 17:45 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw1452.eqiad.wmnet
* 10:01 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single
* 17:42 rzl: depool mw1452 for training
* 10:01 jmm@cumin1001: START - Cookbook sre.hosts.downtime
* 17:32 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Include db2084 in dbctl, depooled', diff saved to https://phabricator.wikimedia.org/P11480 and previous config saved to /var/cache/conftool/dbconfig/20200612-095855-marostegui.json
* 17:31 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 09:58 godog: roll-restart thanos-fe / thanos-be for microcode updates
* 17:29 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 08:51 elukey: restart gerrit on gerrit1001
* 16:44 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 08:48 elukey: update cr1/cr2 analytics filters for [[phab:T252767|T252767]] and [[phab:T252675|T252675]]
* 16:44 ryankemper: [[phab:T288231|T288231]] Manually killed dangling `pigz` / `nc` processes on `wdqs2008` (and `wdqs2005` implicitly). Should be in the right state to restart the `data-transfer` cookbook again
* 08:44 marostegui: Compress InnoDB on db2092 - [[phab:T254462|T254462]]
* 16:41 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 08:36 marostegui: Clone db2084 from db2080
* 16:37 elukey: drop kubeflow-kfserving* docker images from deneb
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2080 to clone db2084', diff saved to https://phabricator.wikimedia.org/P11478 and previous config saved to /var/cache/conftool/dbconfig/20200612-083231-marostegui.json
* 16:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 08:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:34 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 16:33 moritzm: installing node-ansi-regex security updates
* 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2084 from s4 and s5', diff saved to https://phabricator.wikimedia.org/P11477 and previous config saved to /var/cache/conftool/dbconfig/20200612-081455-marostegui.json
* 16:28 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere (duration: 02m 24s)
* 07:56 elukey: depool mw1384
* 16:25 mbsantos@deploy1002: Started deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2084 from s4 and s5', diff saved to https://phabricator.wikimedia.org/P11476 and previous config saved to /var/cache/conftool/dbconfig/20200612-075202-marostegui.json
* 16:24 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 04s)
* 07:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:16 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad (duration: 02m 41s)
* 07:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 16:14 mbsantos@deploy1002: Started deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad
* 07:08 marostegui: Reimage db2086
* 16:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 07:07 elukey: depool/scap pull/pool mw1384
* 16:07 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 07:05 moritzm: installing intel-microcode security updates (regressions have been sorted out)
* 16:07 ryankemper: [[phab:T288231|T288231]] About to ctrl+c out of ongoing data transfer because puppet run following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/730794 restarted blazegraph; we'll manually disable updater and kick off the transfer again
* 05:42 moritzm: installing stretch kernel security updates (no reboots yet)
* 16:04 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo run-puppet-agent --force`
* 05:40 moritzm: installing buster kernel security updates (no reboots yet)
* 15:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 04:54 marostegui: Deploy schema change on s6 codfw - [[phab:T250066|T250066]]
* 15:54 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2008:~$ sudo depool`
* 01:02 ejegg: updated payments-wiki from {{Gerrit|aceddff8b5}} to {{Gerrit|5fd4eb1519}}
* 15:52 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo depool`
* 00:10 Amir1: BACON is done
* 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:13 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/VisualEditor/includes/VisualEditorHooks.php: Backport: [[gerrit:730729{{!}}Fix value of 'namespacesWithSubpages' in wgVisualEditorConfig (T293310)]] (duration: 01m 04s)
* 15:02 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 03s)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:59 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:53 kormat: upgrading orchestrator.wm.o to 3.2.6-1 [[phab:T275784|T275784]]
* 14:49 jbond@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=apt
* 14:43 jbond: migrate apt.w.o to a dns active/passive discovery address (cc moritzm)
* 14:23 moritzm: installing krb5 security updates on KDCs
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:10 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 2/2) (duration: 01m 03s)
* 14:07 urbanecm: Run extensions/GrowthExperiments/initWikiConfig.php for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:07 urbanecm: Create growthexperiments DB tables for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 1/2) (duration: 01m 04s)
* 14:03 urbanecm@deploy1002: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|82d0a4bf45126ecba2cfcd1a0c2081a00f58dca3}}: Enable VE by default on 4 more wikis ([[phab:T290614|T290614]]) (duration: 01m 05s)
* 13:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 kormat: uploaded orchestrator 3.2.6-1 packages to apt.wm.o (buster) [[phab:T275784|T275784]]
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 12:44 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 12:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backups
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backups
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730746{{!}}Untangle “dispatch via jobs” settings in Wikibase.php (T291828)]] (no-op) (duration: 01m 04s)
* 12:12 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730725{{!}}Set wmgWikibaseDispatchViaJobsPruneChangesTableInJobEnabled for wikidatawiki (T291828)]] (no-op) (duration: 01m 05s)
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 10:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/: {{Gerrit|1f33fc3}}, {{Gerrit|e0ea1b8}}, {{Gerrit|cba2ac9}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 05s)
* 10:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|465b564}}, {{Gerrit|a8cc98b}}, {{Gerrit|6e95c48}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 06s)
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 09:20 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:20 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 volans@deploy1002: Finished deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1 (duration: 00m 50s)
* 09:17 volans@deploy1002: Started deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1
* 09:04 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 45s)
* 09:03 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 09:02 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 23s)
* 09:02 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 08:52 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:22 volans: rolling out debmonitor-client upgrade to 0.3.1 across the fleet
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:24 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:24 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift-ro,name=eqiad
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad
* 07:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:37 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:52 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:50 foks: changing user email for "Region of Peel Archives"
* 01:41 ejegg: updated payments-wiki from {{Gerrit|b329d2dea2}} to {{Gerrit|19d18c1852}}
* 01:35 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:31 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2020-06-11 ==
== 2021-10-13 ==
* 23:54 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/Wikibase: [[gerrit:604845{{!}}Fix entity id lookup for interwiki special page links (T255078)]] (duration: 00m 38s)
* 23:37 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:51 ladsgroup@deploy1001: scap failed: average error rate on 3/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
* 23:36 eileen: civicrm revision changed from {{Gerrit|946dfb6c5a}} to {{Gerrit|018d3b19fe}}, config revision is {{Gerrit|85277466ed}}
* 23:43 ladsgroup@deploy1001: Synchronized wmf-config/extension-list: [[gerrit:604778{{!}}Remove ContributionTracking extension]] ([[phab:T255216|T255216]]), Part III (duration: 00m 57s)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730575{{!}}Create an alias for the project namespace on kswiki (T291740)]] (duration: 01m 05s)
* 23:42 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:604778{{!}}Remove ContributionTracking extension]] ([[phab:T255216|T255216]]), Part II (duration: 00m 58s)
* 22:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:38 ladsgroup@deploy1001: Synchronized wmf-config/CommonSettings.php: [[gerrit:604778{{!}}Remove ContributionTracking extension]] ([[phab:T255216|T255216]]), Part I (duration: 00m 59s)
* 22:01 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Collection/includes/Specials/SpecialCollection.php: Backport: [[gerrit:730578{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 23:37 Reedy: create cn_notice_regions on metawiki and testwiki [[phab:T252596|T252596]]
* 21:50 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection: Backport: [[gerrit:730577{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 20:34 pt1979@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:47 foks: removing 8 files for legal compliance
* 20:31 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 21:03 foks: removing 2 files for legal compliance
* 20:15 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:00 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 20:13 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 20:50 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:00 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:49 brennen@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/Api/ApiGetBookCreatorBoxContent.php: Backport: [[gerrit:730574{{!}}Fall back to main page if given title is invalid (T293299)]] (duration: 01m 04s)
* 19:59 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.36
* 20:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:58 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 20:40 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:33 akosiaris: apply emergency sessionstore fixes in codfw as well
* 20:31 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 19:32 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 20:27 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:25 gilles@deploy1001: Finished deploy [performance/asoranking@0a096c4]: [[phab:T252424|T252424]] (duration: 00m 47s)
* 20:04 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:19 gilles@deploy1001: Started deploy [performance/asoranking@0a096c4]: [[phab:T252424|T252424]]
* 20:03 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:12 akosiaris: repool eqiad for sessionstore
* 20:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 19:12 akosiaris@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=sessionstore
* 19:18 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 19:10 akosiaris: remove the podaffinity restrictions for sessionstore in eqiad
* 19:16 mutante: gitlab2001 - status before was that "gitlab-ctl status" showed components "gitlab-workhorse" and "postgres-exporter" as "down". this was either pre-broken or caused by the restore process. after manually 'gitlab-ctl start gitlab-workhorse' all of the components are in "run" and https://gitlab-replica.wikimedia.org is up ( [[phab:T285867|T285867]])
* 19:10 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 19:08 mutante: gitlab2001 - started workhorse which was for some reason marked as down after restore command ran
* 19:07 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 19:08 mutante: [gitlab2001:~] $ sudo /usr/bin/gitlab-ctl start gitlab-workhorse
* 18:08 ppchelko@deploy1001: Synchronized wmf-config/reverse-proxy-staging.php: Beta: Switch from HTCP purging to kafka purging gerrit:603530, reverse-proxy-staging.php (duration: 01m 06s)
* 19:06 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 01m 03s)
* 18:06 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Beta: Switch from HTCP purging to kafka purging gerrit:603530, IS-labs.php (duration: 01m 06s)
* 19:05 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 17:29 mbsantos@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' .
* 19:02 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87879865c35edab3ead523027681146e00d6fc02}}: Create Translation namespace for viwikisource ([[phab:T290691|T290691]]) (duration: 01m 04s)
* 17:26 mbsantos@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 18:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|06fd0f225575448771cdba0d4e6bf36bb6715bc1}}: add extendedconfimed for autoreview group on ptwiki ([[phab:T292912|T292912]]) (duration: 01m 04s)
* 17:22 mbsantos@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' .
* 18:37 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript initSiteStats.php --wiki=ptwiki --update
* 17:19 mbsantos@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 18:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=ptwiki extendedconfirmed
* 17:12 bstorm_: reboot for stretch upgrade on labstore1004 [[phab:T224582|T224582]]
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bb2b388217aa91a39ed3684f87fdf7edb06fd81}}: Set autoconfirmedextended and confirmedextended for ptwiki ([[phab:T292915|T292915]]) (duration: 01m 04s)
* 16:49 bstorm_: doing stretch upgrade for labstore1004 [[phab:T224582|T224582]]
* 18:16 urbanecm@deploy1002: Synchronized static/images/project-logos: {{Gerrit|694bc234ab5dbb9a2387a6129998d45a53ac0ab3}}: Remove an old dawiki temporary logo (duration: 01m 04s)
* 16:36 bstorm_: rebooting labstore1004 for upgrades [[phab:T224582|T224582]]
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|224e2a374b1cc6327e9d8c2bca576091ce4efc74}}: Add NS_MAIN back to wgExtraSignatureNamespaces for mediawikiwiki ([[phab:T291630|T291630]]) (duration: 01m 05s)
* 16:12 bstorm_: downtimed labstore1005 for upgrades on 1004 since that will alert as well [[phab:T224582|T224582]]
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:10 bstorm_: downtimed labstore1004 for upgrades [[phab:T224582|T224582]]
* 18:12 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:50 cstone: SmashPig revision changed from {{Gerrit|b9de3c7aac}} to {{Gerrit|2246685626}}
* 18:11 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|1b96f54a518620b0dc6a0ab63b402d0ea2c6bf70}}: Update logo for liwiktionary ([[phab:T291479|T291479]]) (duration: 01m 14s)
* 15:34 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 18:10 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:31 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 18:10 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:25 moritzm: installing buster kernel security updates (no reboots yet)
* 18:09 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:04 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 18:09 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:04 mforns@deploy1001: Finished deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] (duration: 01m 39s)
* 18:08 volans: uploaded debmonitor-client_0.3.1 to apt.wikimedia.org stretch-wikimedia,buster-wikimedia,bullseye-wikimedia
* 15:04 root@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 17:14 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|dd7a3314602ffddc5b917cccc71c917301639388}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 04s)
* 15:04 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 17:13 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|5c27154cf434bebc37f5e98e2ad1b5cea7cde1d4}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 15s)
* 15:02 mforns@deploy1001: Started deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966]
* 16:57 mutante: stat1008 - short on disk space, mostly used in /tmp, high CPU usage by R process, sent a message about it to all shell users via wall
* 15:02 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 16:50 mutante: stat1008 - apt-get clean - freed 1.3 GB disk space - was alerting in Icinga because / was 97% full
* 14:56 herron: bounced elasticsearch on logstash1012
* 16:37 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:41 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 16:37 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:40 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 16:23 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:37 herron: enabled VO incident resolution notification in global settings
* 16:23 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:34 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 15:29 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:31 akosiaris@cumin1001: START - Cookbook sre.hosts.decommission
* 15:28 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:30 godog: bounce logstash on logstash1009, apparent GC death spiral
* 15:26 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:03 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 15:26 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:03 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 15:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:03 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1)
* 15:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:03 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single
* 15:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:35 filippo@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad
* 15:12 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 13:35 filippo@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=thanos-swift,name=eqiad
* 15:12 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 12:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
* 15:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 12:36 elukey: updated pcc facts
* 15:04 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:28 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 15:03 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 12:28 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:03 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 12:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:01 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 15:01 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 12:15 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 15:01 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 12:15 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 14:59 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:04 jforrester@deploy1001: Synchronized php-1.35.0-wmf.36/includes/title/NamespaceInfo.php: [[phab:T253098|T253098]] NamespaceInfo::makeValidNamespace: Don't throw for -1 or -2 (duration: 01m 06s)
* 14:59 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 12:03 marostegui: Reimage es2023 (es5 codfw master)
* 14:59 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2075 [[phab:T254139|T254139]]', diff saved to https://phabricator.wikimedia.org/P11469 and previous config saved to /var/cache/conftool/dbconfig/20200611-115430-marostegui.json
* 14:57 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:46 marostegui: Deploy schema change on s6 codfw - [[phab:T250066|T250066]]
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:44 volans@deploy1001: Finished deploy [homer/deploy@df83901]: Release v0.2.3 (duration: 00m 25s)
* 14:56 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:44 volans@deploy1001: Started deploy [homer/deploy@df83901]: Release v0.2.3
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:36 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:36 matthiasmullie: EU BACON done
* 14:54 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 11:35 mlitn@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/GrowthExperiments: Help panel: Update guidance behavior rules (duration: 01m 06s)
* 14:52 ema: repool cp4021, further testing can be performed on sretest1001 [[phab:T201317|T201317]]
* 11:34 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 14:51 volans: restarting ircecho.service on alert1001 to get back icinga-wm without the underscore
* 11:34 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:50 elukey: restart pybal on lvs1015 (low-traffic primary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 11:28 kartik@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/ContentTranslation/modules/tools/mw.cx.tools.IssueTrackingTool.js: Backport: [[gerrit:604587{{!}}IssueTrackingTool: Fix js error in getCurrentNodeId method (T254965)]] (duration: 01m 07s)
* 14:48 moritzm: reverted to clean package state on deneb
* 11:08 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 14:44 elukey@puppetmaster1001: conftool action : ge; selector: cluster=ml_serve,service=inference
* 11:04 mlitn@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/MachineVision: $aliases should be an array of strings, not AliasGroup objects (duration: 01m 07s)
* 14:36 elukey: restart pybal on lvs1016 (low-traffic secondary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 10:47 moritzm: repooling mw1318,mw2139,mw2145,mw2147,mw2221,mw2219,mw2250,mw2350  (these were depooled, but seem all fine in Icinga and were probably just forgotten)
* 14:27 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:41 filippo@cumin1001: conftool action : set/pooled=yes; selector: cluster=thanos,service=thanos-swift
* 14:27 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 10:40 filippo@cumin1001: conftool action : set/pooled=yes; selector: cluster=thanos,service=thanos-query
* 14:25 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:37 moritzm: installing buster kernel security updates  (no reboots yet, on hold for regression-free microcode update)
* 14:25 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 10:32 godog: roll-restart pybal in eqiad lvs low-traffic
* 14:21 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:21 mutante: restarting gerrit on gerrit-replica (gerrit2001) - java.lang.OutOfMemoryError: Java heap space
* 14:21 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 10:21 Urbanecm: Run scap pull at mwdebug1001 to revert temporary changes
* 14:20 moritzm: temporarily downgrade sphinx packages on deneb to 1.7.9-1~bpo9+1 to build a Ganeti 2.16 stretch backport with delicate toolchain needs
* 10:14 Urbanecm: Applying temporary changes on mwdebug1001
* 14:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:58 moritzm: upgrading netmon* to PHP 7.2.31
* 14:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:55 marostegui: Upgrade es2025
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:54 moritzm: upgrading mwmaint* to PHP 7.2.31
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:46 moritzm: upgrading labweb* to PHP 7.2.31
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:36 elukey: switch piwik.wikimedia.org from matomo1001 to matomo1002 (new buster node)
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:59 XioNoX: push prep-work for anycast tuning in ulsfo - [[phab:T288843|T288843]]
* 09:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 13:38 jayme: imported helm-diff_3.1.3-2 to buster-wikimedia (https://gerrit.wikimedia.org/r/c/operations/debs/helm-diff/+/730509)
* 08:48 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:37 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 08:48 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:34 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 08:42 moritzm: imported memcached 1.6.6-1~wmf10u1
* 12:13 Lucas_WMDE: UTC morning backport+config window done
* 08:39 marostegui: Reimage es2024 to buster
* 12:12 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/includes: Backport: [[gerrit:730370{{!}}Add Link: Do not log "no suggestion found" errors in production log (T291251)]] (duration: 01m 04s)
* 08:30 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:11 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]' # after applying 730512 at mwmaint1002 to workaround [[phab:T293219|T293219]] # [[phab:T255037|T255037]]
* 08:30 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 12:11 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/modules: Backport: [[gerrit:730371{{!}}Suggested Edits: Update local config.presets when topics/difficulty presets change (T292536)]] (duration: 01m 07s)
* 08:25 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:56 urbanecm@deploy1002: Synchronized wmf-config/config/itwiki.yaml: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]) (duration: 01m 04s)
* 08:25 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:55 urbanecm: mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=mediawikiwiki "Growth/Communities/How to introduce yourself as a mentor" "Growth/Communities/How to configure the mentors' list" "Martin Urbanec (WMF)" --reason '[[:phab:T293184]]' # [[phab:T293184|T293184]]
* 08:25 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:55 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 2/3) (duration: 01m 04s)
* 08:25 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 1/3) (duration: 01m 05s)
* 08:24 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]' # [[phab:T255037|T255037]]
* 08:24 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:49 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=itwiki growthexperiments # [[phab:T255037|T255037]]
* 08:24 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:730380{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 07s)
* 08:24 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 11:43 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Wikibase/repo/: Backport: [[gerrit:730385{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 18s)
* 08:23 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 11:33 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 08:23 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 11:19 ema: pool cp4021 after reimage [[phab:T201317|T201317]]
* 08:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:05 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 08:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 10:15 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 08:18 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:18 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 10:09 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:728490{{!}}Add more types of QuickSurveys on beta cluster (T292459)]] (duration: 01m 53s)
* 08:01 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 10:06 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:01 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 09:22 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 07:59 moritzm: upgrading remaining job runners in eqiad to PHP 7.2.31
* 08:35 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:59 hashar: Restarted Zuul on contint2001 for config change # [[phab:T253263|T253263]]
* 08:28 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:43 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 08:21 elukey: run kafka preferred-replica-election on kafka-main1001 to rebalance partition leaders - [[phab:T288825|T288825]]
* 07:34 moritzm: upgrading remaining app servers in eqiad to PHP 7.2.31
* 08:15 godog: bounce graphite on graphite1004 to apply new config
* 07:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:33 elukey: increase kafka topic partition size of the top 4 high traffic topics of main-eqiad as described in https://phabricator.wikimedia.org/T288825#7422726
* 07:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:13 XioNoX: provision new eqsin-ulsfo link - [[phab:T273308|T273308]]
* 07:07 marostegui: Stop MySQL on dbstore1003 for reimage - [[phab:T254870|T254870]]
* 06:26 elukey: `kafka topics --alter --topic <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change --partitions 3` on kafka-main2001 - [[phab:T288825|T288825]]
* 06:38 XioNoX: make asw2-esams interfaces Homer like - [[phab:T250429|T250429]]
* 00:38 ejegg: updated payments-wiki from {{Gerrit|030b11da1a}} to {{Gerrit|b329d2dea2}}
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1127 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11467 and previous config saved to /var/cache/conftool/dbconfig/20200611-055536-marostegui.json
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1127 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11466 and previous config saved to /var/cache/conftool/dbconfig/20200611-052535-marostegui.json
* 05:04 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1127 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11465 and previous config saved to /var/cache/conftool/dbconfig/20200611-050446-marostegui.json
* 05:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1078', diff saved to https://phabricator.wikimedia.org/P11464 and previous config saved to /var/cache/conftool/dbconfig/20200611-050200-marostegui.json
* 04:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078', diff saved to https://phabricator.wikimedia.org/P11463 and previous config saved to /var/cache/conftool/dbconfig/20200611-045426-marostegui.json
* 04:50 marostegui: Deploy schema change on testwiki - [[phab:T254371|T254371]]
* 04:47 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1084 and slowly repool db1127 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11462 and previous config saved to /var/cache/conftool/dbconfig/20200611-044725-marostegui.json
* 03:13 shdubsh: removing WDQS-Streaming-Updater-POC metrics on graphite1004 - [[phab:T255044|T255044]]
* 02:43 tstarling@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/Wikibase/lib/includes/Store/EntityLinkTargetEntityIdLookup.php: investigate UBN [[phab:T255078|T255078]] (duration: 01m 07s)


== 2020-06-10 ==
== 2021-10-12 ==
* 23:55 catrope@deploy1001: Synchronized php-1.35.0-wmf.36/includes/skins/SkinTemplate.php: [[phab:T255073|T255073]] (duration: 01m 07s)
* 23:48 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:14 eileen: civicrm revision changed from {{Gerrit|80a0d22350}} to {{Gerrit|f01b036128}}, config revision is {{Gerrit|a26d023633}}
* 23:16 urbanecm: UTC late B&C window done
* 21:23 akosiaris: increase memory/cpu limits for proton
* 23:15 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 01m 04s)
* 21:23 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 23:12 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 02m 09s)
* 21:11 mbsantos@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 23:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:08 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 22:53 urbanecm: [urbanecm@labweb1001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=labswiki Jamesmontalvo3 #
* 21:06 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 22:51 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:45 mbsantos@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 20:21 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:33 jhuneidi@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 19:31 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.1 (duration: 04m 02s)
* 20:15 mbsantos@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:04 mbsantos@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:46 herron: bouncing elasticsearch on logstash1011
* 19:02 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 19:01 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Use EventRelayerNull for wikitech, gerrit:604469 (duration: 01m 05s)
* 18:47 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 45m 36s)
* 18:54 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/VisualEditor/: {{Gerrit|8958860}}: Make VisualEditorDisableForAnons only hide the tabs, not disable the editor ([[phab:T253941|T253941]]) (duration: 01m 07s)
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 18:32 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.35/extensions/VisualEditor/: {{Gerrit|5f4c609}}: Make VisualEditorDisableForAnons only hide the tabs, not disable the editor ([[phab:T253941|T253941]]) (duration: 01m 14s)
* 18:01 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 16:40 godog: EDIT: in esams
* 17:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:39 godog: restart prometheus@ops in eqiad
* 17:56 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/CentralNotice: Backport: [[gerrit:730141]] (duration: 00m 59s)
* 16:31 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable HTCP purges everywhere, gerrit:603655 (duration: 01m 05s)
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:27 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 17:46 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 16:27 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 17:43 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 16:18 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 17:41 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 16:18 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 17:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:13 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:13 ema: correction: restart purged on all *cache_upload* hosts to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604430/ [[phab:T250781|T250781]] [[phab:T133821|T133821]]
* 17:32 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SyntaxHighlight_GeSHi/includes/ResourceLoaderPygmentsModule.php: Backport: [[gerrit:730233{{!}}Include generated styles before Mediawiki overrides (T292736)]] (duration: 00m 57s)
* 16:12 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 17:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:12 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 17:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:12 ema: restart purged on all cache hosts to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604430/ [[phab:T250781|T250781]] [[phab:T133821|T133821]]
* 17:23 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730236{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 16:11 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 17:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:06 ema: cp3051: restart purged to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604430/ [[phab:T250781|T250781]] [[phab:T133821|T133821]]
* 17:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:02 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:16 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730235{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 16:00 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 17:12 moritzm: installing rsync bugfix updates
* 15:49 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:09 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:45 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 16:56 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:38 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 16:55 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 15:37 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:53 moritzm: failed over ganeti master for test cluster to ganeti2025
* 15:36 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Send kafka purges everywhere, gerrit:603654 (duration: 01m 05s)
* 16:50 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:35 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 16:48 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:32 ema: remaining-cp (non-ulsfo): rolling ats-tls-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604305/ [[phab:T255015|T255015]]
* 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 ppchelko@deploy1001: Synchronized wmf-config/CommonSettings.php: Make kafka purges config more robust, gerrit:603649, CS.php (duration: 01m 05s)
* 16:30 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts testvm2009.codfw.wmnet
* 15:27 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Make kafka purges config more robust, gerrit:603649, IS.php (duration: 01m 08s)
* 16:30 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:21 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:19 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 16:26 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 15:08 godog: roll-restart prometheus k8s to enable thanos upload
* 16:26 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes: Backport: [[gerrit:730226{{!}}Pre-format comments for non-local files too (T292570)]] (duration: 01m 15s)
* 15:02 ema: A:cp-ulsfo: rolling ats-tls-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604305/ [[phab:T255015|T255015]]
* 16:17 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 14:43 ema: A:cp rolling systemctl restart trafficserver
* 16:16 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2009.codfw.wmnet
* 14:28 ema: systemctl restart trafficserver for instances critical in icinga
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:21 ema: cp3056: ats-backend-restart
* 16:10 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 14:09 ema: A:cp rolling ats-be/ats-tls restarts to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604305/ [[phab:T255015|T255015]]
* 16:09 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 14:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 16:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730231{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 00m 57s)
* 14:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:00 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 13:59 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:57 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1094 into s7', diff saved to https://phabricator.wikimedia.org/P11458 and previous config saved to /var/cache/conftool/dbconfig/20200610-135753-marostegui.json
* 15:58 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730230{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 02m 13s)
* 13:50 ema: cp3050: ats-tls-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604305/ [[phab:T255015|T255015]]
* 15:57 volans@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2009.codfw.wmnet
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1094 into s7', diff saved to https://phabricator.wikimedia.org/P11457 and previous config saved to /var/cache/conftool/dbconfig/20200610-135039-marostegui.json
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:40 ema: cp3050: ats-backend-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/604305/ [[phab:T255015|T255015]]
* 15:51 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 13:36 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 15:49 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 13:06 liw@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.36 (duration: 01m 04s)
* 15:48 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 13:05 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.36
* 15:48 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 12:33 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:41 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for analytics1069.eqiad.wmnet
* 12:32 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 15:41 btullis@cumin1001: START - Cookbook sre.hosts.remove-downtime for analytics1069.eqiad.wmnet
* 12:32 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
* 15:02 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:13 akosiaris: pool thumbor2002, thumbor2001. [[phab:T251570|T251570]]
* 14:50 volans@cumin2002: START - Cookbook sre.dns.netbox
* 12:12 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=thumbor2002.codfw.wmnet
* 13:49 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:12 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=thumbor2001.codfw.wmnet
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
* 11:50 marostegui: Deploy schema change on commonswiki codfw [[phab:T255003|T255003]]
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 11:41 moritzm: upgrading remaining app servers in codfw to PHP 7.2.31
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 marostegui: Deploy schema change on testcommonswiki [[phab:T255003|T255003]]
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:37 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|52091b8}}: Grant cswiki accountcreators tboverride-account and override-antispoof ([[phab:T254927|T254927]]) (duration: 01m 06s)
* 13:14 godog: add 50G to prometheus/k8s in eqiad
* 11:13 moritzm: upgrading remaining job runners in codfw to PHP 7.2.31
* 13:13 otto@deploy1002: Synchronized wmf-config/CommonSettings.php: Enable x_client_ip_forwarding_enabled for eventgate-analytics and eventgate-analytics-external - [[phab:T288853|T288853]] (duration: 00m 56s)
* 11:02 marostegui: Stop MySQL on db1094 to clone db1127
* 13:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1094 moving to clone db1127 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11453 and previous config saved to /var/cache/conftool/dbconfig/20200610-110204-marostegui.json
* 13:11 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 10:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:05 volans: upgraded spicerack to 1.0.5 on cumin hosts
* 10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 12:25 volans: uploaded spicerack_1.0.5 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127 moving it to s7 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11452 and previous config saved to /var/cache/conftool/dbconfig/20200610-103742-marostegui.json
* 12:15 elukey: `kafka topics --alter --topic codfw.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1103,db1137 into x1', diff saved to https://phabricator.wikimedia.org/P11451 and previous config saved to /var/cache/conftool/dbconfig/20200610-102805-marostegui.json
* 12:15 elukey: `kafka topics --alter --topic eqiad.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 10:24 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254036|T254036]] Undeploy CollaborationKit: IV – Drop flag to load (duration: 01m 05s)
* 12:10 elukey: `kafka topics --alter --topic codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 10:23 jayme: [[phab:T254581|T254581]] re-enabled puppet on all mw, api and jobrunner servers
* 12:09 elukey: `kafka topics --alter --topic eqiad.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 10:20 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T254036|T254036]] Undeploy CollaborationKit: III – Drop ability to load (duration: 01m 05s)
* 11:58 elukey: `kafka topics --alter --topic codfw.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 10:16 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254036|T254036]] Undeploy CollaborationKit: II – Disable on Test Wikipedia (duration: 01m 37s)
* 11:49 elukey: `kafka topics --alter --topic eqiad.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103,db1137 into x1', diff saved to https://phabricator.wikimedia.org/P11450 and previous config saved to /var/cache/conftool/dbconfig/20200610-101407-marostegui.json
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 10:12 moritzm: upgrading remaining API servers in codfw to PHP 7.2.31
* 11:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103,db1137 into x1', diff saved to https://phabricator.wikimedia.org/P11449 and previous config saved to /var/cache/conftool/dbconfig/20200610-100834-marostegui.json
* 11:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:03 jynus: cloning reviewdb into reviewdb-test at db1132 with replication enabled [[phab:T254516|T254516]]
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103 into x1', diff saved to https://phabricator.wikimedia.org/P11448 and previous config saved to /var/cache/conftool/dbconfig/20200610-100306-marostegui.json
* 11:34 urbanecm: UTC morning B&C window done
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool