You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Difference between revisions of "Server Admin Log"

From Wikitech-static
imported>Stashbot
(volker-e@deploy1001: Finished deploy [design/style-guide@4bfe647]: Deploy design/style-guide: (duration: 00m 08s))
imported>Stashbot
(legoktm: uploaded python-logstash to buster-wikimedia for T294393)
 
(510 intermediate revisions by 3 users not shown)
== 2020-04-01 ==
== 2021-10-26 ==
* 22:44 volker-e@deploy1001: Finished deploy [design/style-guide@4bfe647]: Deploy design/style-guide: (duration: 00m 08s)
* 22:59 legoktm: uploaded python-logstash to buster-wikimedia for [[phab:T294393|T294393]]
* 22:43 volker-e@deploy1001: Started deploy [design/style-guide@4bfe647]: Deploy design/style-guide:
* 21:29 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 22:02 volans: forcing logrotate on netflow2001 to compress yesterday's logs
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:53 volans: force-rebooting ms-be1023, unresponsive - [[phab:T249174|T249174]]
* 21:13 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 21:50 volans: stopped and restarted kafkatee-webrequest.service on netflow2001, was in a restart loop
* 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:48 marxarelli: rollback of 1.35.0-wmf.26 from group1 ([[phab:T247773|T247773]]). blocked by [[phab:T249162|T249162]]
* 21:04 reedy@deploy1002: Synchronized php-1.38.0-wmf.5/tests/phpunit/includes/api/query/ApiQueryImageInfoTest.php: [[phab:T293783|T293783]] (duration: 01m 02s)
* 19:30 dduvall@deploy1001: rebuilt and synchronized wikiversions files: rollback 1.35.0-wmf.26 from group1
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:21 dduvall@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.26 (duration: 01m 06s)
* 21:03 reedy@deploy1002: Synchronized php-1.38.0-wmf.6/tests/phpunit/includes/api/query/ApiQueryImageInfoTest.php: [[phab:T293783|T293783]] (duration: 01m 02s)
* 19:20 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.26
* 21:01 reedy@deploy1002: Synchronized php-1.38.0-wmf.6/includes/api/ApiQueryImageInfo.php: [[phab:T293783|T293783]] (duration: 01m 03s)
* 19:18 marxarelli: promoting group1 to 1.35.0-wmf.26
* 21:00 reedy@deploy1002: Synchronized php-1.38.0-wmf.5/includes/api/ApiQueryImageInfo.php: [[phab:T293783|T293783]] (duration: 01m 03s)
* 17:21 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕐☕ homer 'cr*eqord*' commit 'enable sampling on eqord Iac15379cc'
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:54 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕐☕ homer 'cr*eqdfw*' commit 'enable sampling on eqdfw Iac15379cc'
* 20:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:39 vgutierrez: pool cp2027 - [[phab:T248816|T248816]]
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:31 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:28 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 19:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:17 ariel@deploy1001: Finished deploy [dumps/dumps@21363c1]: page range prefetch fixup (duration: 00m 09s)
* 19:51 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 16:17 ariel@deploy1001: Started deploy [dumps/dumps@21363c1]: page range prefetch fixup
* 19:48 eileen: civicrm revision changed from {{Gerrit|733a8fceda}} to {{Gerrit|dba74c443b}}, config revision is {{Gerrit|eed79486d5}}
* 15:33 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 19:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:31 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:31 vgutierrez@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)
* 19:38 twentyafterfour@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]] (duration: 25m 28s)
* 15:31 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 19:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 19:16 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:27 vgutierrez: depool & decommission cp20[16,19,23,27] - [[phab:T249125|T249125]]
* 19:13 twentyafterfour@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 15:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3318 after schema change', diff saved to https://phabricator.wikimedia.org/P10845 and previous config saved to /var/cache/conftool/dbconfig/20200401-152258-marostegui.json
* 18:47 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:11 herron: performing kafka-main rolling restarts to pick up security updates
* 17:52 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS (duration: 01m 34s)
* 14:52 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 17:50 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 14:50 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 17:09 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 37s)
* 14:49 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 17:06 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 14:49 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 17:05 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS (duration: 1100m 51s)
* 14:46 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:25 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 14:46 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 16:25 cdanis@cumin1001: START - Cookbook sre.network.cf
* 14:43 vgutierrez: depool && decommission cp[2018,2020,2022,2024-2026].codfw.wmnet - [[phab:T249115|T249115]]
* 16:24 mutante: [mwmaint1002:~] $ sudo systemctl start mediawiki_job_wikidata_resubmit_changes_for_dispatch
* 14:32 gehel: depooling wdqs1006 to allow catching up on lag
* 16:23 mutante: mwmaint1002 - running puppet, created new mw periodic job from gerrit:732972 ([[phab:T294031|T294031]])
* 14:30 vgutierrez: pool cp2042 - [[phab:T248816|T248816]]
* 16:07 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:16 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:04 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 14:13 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 15:45 lucaswerkmeister-wmde@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:09 XioNoX: remove AS-path prepending in esams
* 15:41 lucaswerkmeister-wmde@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 13:47 XioNoX: remove AS-path prepending in eqsin
* 15:38 lucaswerkmeister-wmde@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 13:39 vgutierrez: pool cp2041 - [[phab:T248816|T248816]]
* 15:27 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 13:34 mutante: sodium (mirror): sudo -u mirror ftpsync to get Debian mirror updated (Icinga says it's old)
* 15:27 cdanis@cumin1001: START - Cookbook sre.network.cf
* 13:24 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:07 topranks: Running homer against cr3-esams to create new temp GRE tunnel to asw1-b12-drmrs
* 13:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 15:02 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 13:17 marostegui: Deploy schema change on db1099:3318
* 15:02 cdanis@cumin1001: START - Cookbook sre.network.cf
* 13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P10843 and previous config saved to /var/cache/conftool/dbconfig/20200401-131719-marostegui.json
* 14:55 topranks: Adding static route on cr3-esams to asw1-b12-drmrs Telia link IP to allow GRE to be built.
* 13:13 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:50 elukey: ran "Capirca Host Definition" script on netbox - output https://netbox.wikimedia.org/extras/scripts/results/1787315/
* 13:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase: {{Gerrit|7723cf724df9ede49129443e43336e93efcd7a41}}: RecentChangeFactory: Add missing rc_logid value ([[phab:T293885|T293885]]) (duration: 01m 02s)
* 12:19 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 13:40 elukey: ran "Capirca Host Definition" script on netbox-next to get up-to-date aqs_group host definition - result https://netbox-next.wikimedia.org/extras/scripts/results/894348/
* 12:19 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 13:24 kart_: Updated cxserver to 2021-10-25-123807-production ([[phab:T217747|T217747]], [[phab:T218217|T218217]], [[phab:T292421|T292421]])
* 12:19 tgr@deploy1001: Synchronized wmf-config/config: SWAT: [[gerrit:584579{{!}}Sync growthexperiments dblist with actual state of wmgUseGrowthExperiments (T248844)]] (duration: 01m 06s)
* 13:19 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 12:18 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:13 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 12:18 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:05 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 12:17 tgr@deploy1001: Synchronized dblists/growthexperiments.dblist: SWAT: [[gerrit:584579{{!}}Sync growthexperiments dblist with actual state of wmgUseGrowthExperiments (T248844)]] (duration: 01m 05s)
* 13:05 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.4 (duration: 31m 07s)
* 12:17 XioNoX: restart nfacct on netflow4001 for kafka tls tests - [[phab:T248980|T248980]]
* 12:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:15 vgutierrez: depool & decommission cp2013 - [[phab:T249088|T249088]]
* 12:35 hashar: scap clean --delete 1.38.0-wmf.4 # [[phab:T293947|T293947]]
* 12:14 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: re-sync (duration: 01m 06s)
* 12:32 hashar: Applied security patches to 1.38.0-wmf.6 # [[phab:T293947|T293947]]
* 12:12 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:585059{{!}}Enable password-reset-update on all other than Wikipedias (T245791)]] (duration: 01m 07s)
* 12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:09 marostegui: Deploy schema change on db1116:3318
* 12:31 hashar: scap prep 1.38.0-wmf.6 # [[phab:T293947|T293947]]
* 12:05 cparle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [SDC] Revert enabling WikibaseQualityConstraints on Commons take 2 (duration: 01m 08s)
* 12:16 jbond: upload cas_6.4.2-1+wmf10u3_amd64
* 12:04 cparle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [SDC] Revert enabling WikibaseQualityConstraints on Commons (duration: 01m 05s)
* 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:54 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|4968501}}: Restrict short URL management log to stewards ([[phab:T221073|T221073]]; take II) (duration: 01m 05s)
* 12:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:53 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|4968501}}: Restrict short URL management log to stewards ([[phab:T221073|T221073]]) (duration: 01m 07s)
* 11:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:48 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [SDC] Enable WikibaseQualityConstraints on Commons take II (duration: 01m 06s)
* 11:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 cparle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [SDC] Enable WikibaseQualityConstraints on Commons (duration: 01m 18s)
* 11:51 urbanecm@deploy1002: Finished scap: {{Gerrit|c131f32e5e0804c8f5c2ec768b334c81a1b35151}}: Add namespace translations for [ami] Amis and [pwn] Paiwan ([[phab:T292414|T292414]], [[phab:T292415|T292415]]) (duration: 02m 25s)
* 11:20 cormacparle__: created table wbqc_constraints on commonswiki
* 11:49 urbanecm@deploy1002: Started scap: {{Gerrit|c131f32e5e0804c8f5c2ec768b334c81a1b35151}}: Add namespace translations for [ami] Amis and [pwn] Paiwan ([[phab:T292414|T292414]], [[phab:T292415|T292415]])
* 11:03 jbond42: install bluez update on ganeti-canary and cloudvirt/cloudcontrol-dev
* 11:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:01 mutante: planet1001 - reinstall OS to test install_server switch, ATS switched to planet1002 earlier
* 11:13 urbanecm@deploy1002: Synchronized logos/config.yaml: {{Gerrit|575a6a66b279c3d2d8974ffcc4911cc5b927be47}}: Fix HD logo size in some wikis ([[phab:T250731|T250731]]; 2/2) (duration: 00m 55s)
* 10:47 marostegui: Deploy schema change on dbstore1005:3318
* 11:13 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|575a6a66b279c3d2d8974ffcc4911cc5b927be47}}: Fix HD logo size in some wikis ([[phab:T250731|T250731]]; 1/2) (duration: 00m 57s)
* 10:25 vgutierrez: pool cp2040 - [[phab:T248816|T248816]]
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:16 oblivian@puppetmaster1001: conftool action : set/pooled=yes:weight=1; selector: service=canary
* 10:46 jbond: upload cas_6.4.2-1+wmf10u2_amd64.deb
* 09:55 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 10:40 mvernon@cumin2002: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=swift
* 09:46 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 10:39 mvernon@cumin2002: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=swift-ro
* 09:37 marostegui: Deploy schema change on s8 codfw, this will generate lag on codfw
* 10:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:35 XioNoX: Update install servers IPs (dhcp helpers + firewall rules) - [[phab:T224576|T224576]]
* 10:07 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Switching back graphite to eqiad (duration: 00m 55s)
* 09:34 mutante: install_servers: DHCP_relay in routers and TFTP server in DHCP server config have been switched from install1002/2002 to install1003/2003 - doing a test install, but if any issues report on [[phab:T224576|T224576]]
* 10:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:26 marostegui: last entry was for db2093
* 10:06 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Switching back graphite to eqiad (duration: 01m 04s)
* 09:26 marostegui: Downgrade mariadb package from 10.4.12-2 to 10.4.12-1
* 09:49 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 09:09 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:49 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 09:07 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:47 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 09:05 mutante: planet - the backend server has been switched from planet1001 (stretch) to planet1002 (buster) - [[phab:T247651|T247651]]
* 09:40 godog: flip back write traffic to graphite1004 (all but mediawiki) - [[phab:T247963|T247963]]
* 08:46 mutante: deneb, boron: systemctl reset-failed to clear up systemd state alerts
* 09:27 godog: move read traffic back to graphite1004 - [[phab:T247963|T247963]]
* 08:43 marostegui: Stop haproxy on dbproxy1010 [[phab:T248944|T248944]]
* 08:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:37 jynus: restart bacula at backup1001
* 08:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:30 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 08:33 ema: upload varnish_6.0.8-1wm2 to component/varnish6 on apt.wm.org [[phab:T293879|T293879]]
* 08:30 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 08:31 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GrowthExperiments/maintenance: {{Gerrit|91316ed5714c4426a29fefded5c4db08dbba48bb}}: Add purgeExpiredMentorStatus.php ([[phab:T280307|T280307]]) (duration: 00m 56s)
* 08:28 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 vgutierrez: depool & decommission cp2017 - [[phab:T249084|T249084]]
* 07:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:21 vgutierrez: pool cp2039 - [[phab:T248816|T248816]]
* 07:07 effie: pool mw1319 and mw1312
* 08:09 marostegui: Deploy schema change on db1138 (s4 primary master)
* 07:05 effie: pool  wtp1026.eqiad.wmnet
* 08:06 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17606 and previous config saved to /var/cache/conftool/dbconfig/20211026-063647-root.json
* 08:04 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17605 and previous config saved to /var/cache/conftool/dbconfig/20211026-062144-root.json
* 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1121 after schema change', diff saved to https://phabricator.wikimedia.org/P10841 and previous config saved to /var/cache/conftool/dbconfig/20200401-071339-marostegui.json
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17604 and previous config saved to /var/cache/conftool/dbconfig/20211026-060640-root.json
* 07:12 vgutierrez: pool cp2038 - [[phab:T248816|T248816]]
* 05:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17603 and previous config saved to /var/cache/conftool/dbconfig/20211026-055136-root.json
* 06:38 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 05:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17602 and previous config saved to /var/cache/conftool/dbconfig/20211026-053633-root.json
* 06:38 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 05:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17601 and previous config saved to /var/cache/conftool/dbconfig/20211026-052129-root.json
* 06:36 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:36 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:36 vgutierrez: depool & decommission cp2012 - [[phab:T249080|T249080]]
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:24 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:22 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 01:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:39 marostegui: Deploy schema change on db1121 (this will create lag on s4 labs)
* 01:24 krinkle@deploy1002: Synchronized wmf-config/logging.php: {{Gerrit|I0211e1c77}} (duration: 00m 55s)
* 05:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121 for schema change', diff saved to https://phabricator.wikimedia.org/P10840 and previous config saved to /var/cache/conftool/dbconfig/20200401-053827-marostegui.json
* 01:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:39 reedy@deploy1001: Synchronized docroot/mediawiki.org/xml/: Update http and prot rel links to https, fix link to sitelist in MW Core (duration: 01m 06s)
* 00:12 reedy@deploy1001: Synchronized docroot/mediawiki.org/xml/: Add export-0.11 (duration: 01m 05s)


== 2020-03-31 ==
== 2021-10-25 ==
* 22:23 marxarelli: group0 to 1.35.0-wmf.26 ([[phab:T247773|T247773]]); no rise in error rates following redeployment
* 23:12 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Create alias for Appendix and Appendix_talk namespaces on mywiktionary ([[phab:T291146|T291146]]) (duration: 00m 55s)
* 22:13 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.26
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:07 dduvall@deploy1001: rebuilt and synchronized wikiversions files: testwiki to php-1.35.0-wmf.26 ([[phab:T247773|T247773]])
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:54 dduvall@deploy1001: sync aborted: testwiki to php-1.35.0-wmf.26 ([[phab:T247773|T247773]]) (duration: 07m 31s)
* 22:57 ryankemper: [wcqs] Downtimed `wcqs*` until roughly a week from now (while we setup oauth)
* 21:47 dduvall@deploy1001: Started scap: testwiki to php-1.35.0-wmf.26 ([[phab:T247773|T247773]])
* 22:53 legoktm: uploaded PHP 7.4.25 to apt.wm.o (DSA-4992-1)
* 21:46 jforrester@deploy1001: Synchronized php-1.35.0-wmf.26/includes/user/UserNameUtils.php: [[phab:T249045|T249045]] Use wfMessage in UserNameUtils::isUsable for now (duration: 00m 58s)
* 22:44 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 21:05 eileen: process-control config revision is {{Gerrit|f80d248113}} - (catch up dedupe now off - fyi MBeat )
* 22:30 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 03m 04s)
* 20:59 hashar: contint1001: manually reverted /lib/systemd/system/jenkins.service
* 22:27 ryankemper@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 20:51 hashar: Restarting Jenkins for new CSP rules # [[phab:T245658|T245658]]
* 21:53 mutante: new project language "pwn" added - Paiwan is a native language of Taiwan, spoken by the Paiwan, a Taiwanese indigenous people. [[phab:T292415|T292415]]
* 20:26 dduvall@deploy1001: rebuilt and synchronized wikiversions files: rolling back 1.35.0-wmf.26 testwiki deployment following significant increase in error rate (cc [[phab:T247773|T247773]])
* 21:52 mutante: new project language "ami" added - Sowal no 'Amis is the Formosan language of the 'Amis (or Ami), an indigenous people living along the east coast of Taiwan. - [[phab:T292414|T292414]]
* 20:14 marxarelli: correction: RequestContext::getLanguage errors are for testwiki deployment, pre group0
* 21:50 mutante: log authdns1001 (DNS) - sudo authdns-update, add new project language "ami" (Amis) for [[phab:T292414|T292414]] - edited langlist.tmpl which regenerates all project zones
* 20:08 marxarelli: a slew of "ErrorException from line 334 of /srv/mediawiki/php-1.35.0-wmf.26/includes/context/RequestContext.php: PHP Warning: Recursion detected in RequestContext::getLanguage" after group0 deployment (cc [[phab:T247773|T247773]])
* 21:40 mutante: authdns1001 (DNS) - sudo authdns-update, add new project language "pwn" (Paiwan) for [[phab:T292415|T292415]]
* 20:04 dduvall@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.26 and rebuild l10n cache (duration: 142m 48s)
* 19:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 19:20 ariel@deploy1001: Finished deploy [dumps/dumps@713c297]: more filelist methods cleanup, sort prefetch possible files properly (duration: 00m 04s)
* 19:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 19:20 ariel@deploy1001: Started deploy [dumps/dumps@713c297]: more filelist methods cleanup, sort prefetch possible files properly
* 19:47 mutante: mw2255 - depooled=inactive (incl "dsh groups"), shut down physically for [[phab:T283582|T283582]] - can be worked on anytime
* 18:08 ariel@deploy1001: Finished deploy [dumps/dumps@8376c62]: bring snapshot1010 up to date (duration: 00m 05s)
* 19:45 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2255.codfw.wmnet
* 18:07 ariel@deploy1001: Started deploy [dumps/dumps@8376c62]: bring snapshot1010 up to date
* 19:45 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet
* 17:42 dduvall@deploy1001: Started scap: testwiki to php-1.35.0-wmf.26 and rebuild l10n cache
* 19:42 mutante: icinga - ACKing all unhandled CRIT alerts on hosts with "dev" or "test" in their name, regardless of notifications being disabled or not. just so that we get more signal than noise in actual unhandled CRITs in web UI
* 17:40 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.23 (duration: 26m 51s)
* 19:40 mutante: cumin2002 - sudo systemctl reset-failed to clear Icinga alert about failed but (now) non-existing service database-backups-snapshots.service, assuming it's a case of "only in active DC"
* 17:38 elukey: restart elasticsearch_6@cloudelastic-chi-eqiad.service on cloudelastic1001 to see if it recovers from a thrashing/gc state - [[phab:T231517|T231517]]
* 19:12 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 16:30 marxarelli: 1.35.0-wmf.26 was branched at {{Gerrit|bec758b668aaa57fc259a1d0ecf3b35340d2661b}} for [[phab:T247773|T247773]]
* 19:12 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 16:24 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 01m 00s)
* 19:07 kormat@cumin1001: dbctl commit (dc=all): 'Temporarily move mw groups to db1123 [[phab:T294295|T294295]]', diff saved to https://phabricator.wikimedia.org/P17597 and previous config saved to /var/cache/conftool/dbconfig/20211025-190717-kormat.json
* 16:15 vgutierrez: pool cp2037 - [[phab:T248816|T248816]]
* 19:06 mutante: db1112 - powercycling
* 15:39 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 19:04 legoktm@cumin1001: dbctl commit (dc=all): 'Depool db1112 ([[phab:T294295|T294295]])', diff saved to https://phabricator.wikimedia.org/P17596 and previous config saved to /var/cache/conftool/dbconfig/20211025-190436-legoktm.json
* 15:36 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:35 mutante: decom mw1254 through mw1258 (last remaining old servers in rack D5, depooled a while ago and average response time is again under 200ms) [[phab:T247780|T247780]]
* 18:40 jforrester@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/timeline/includes/Timeline.php: Backport: [[gerrit:734312{{!}}Input may be null when rendering a self-closing tag `<timeline />` (T294020)]] (duration: 00m 55s)
* 15:33 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:28 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:24 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 55s)
* 15:27 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 18:22 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 54s)
* 15:27 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:19 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732840{{!}}Fix array declaration of NS_USER_TALK abbreviation on ruwikiquote (T197058)]] (duration: 00m 55s)
* 15:27 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:26 vgutierrez: depool & decommission cp2010 - [[phab:T249002|T249002]]
* 18:15 jforrester@deploy1002: Synchronized wmf-config/flaggedrevs.php: Config: [[gerrit:732836{{!}}flaggedrevs: Drop legacy wgFlaggedRevsStatsAge config, no longer read]] (duration: 00m 55s)
* 15:15 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 00m 58s)
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:14 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T245794|T245794]] Enable DiscussionTools as a beta feature on four wikis (duration: 01m 00s)
* 18:12 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732254{{!}}Make reply tool available as opt-out on frwiki (T293687)]] (duration: 00m 56s)
* 15:05 cdanis: cr1-eqiad: commit flex-flow-sizing [[phab:T248394|T248394]]
* 17:41 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2253.codfw.wmnet
* 15:01 cdanis: cr2-eqiad: commit flex-flow-sizing [[phab:T248394|T248394]]
* 17:40 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 14:43 vgutierrez: pool cp2036 - [[phab:T248816|T248816]]
* 17:39 mutante: mw2253 - scap pull after hw maintenance is over
* 14:21 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw125[4-8].eqiad.wmnet
* 17:32 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:20 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:20 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:24 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:19 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:23 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:17 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 17:22 XioNoX: update core routers ACLs
* 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1091 after schema change', diff saved to https://phabricator.wikimedia.org/P10834 and previous config saved to /var/cache/conftool/dbconfig/20200331-141459-marostegui.json
* 17:20 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 14:10 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:49 XioNoX: update management routers ACLs
* 14:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 16:36 XioNoX: DNS - Add eqsin-ulsfo transport v6 prefix - [[phab:T273308|T273308]]
* 14:05 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw125[4-8].eqiad.wmnet
* 16:31 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:31 vgutierrez: Enable TLS Session tickets in eqsin - [[phab:T245616|T245616]]
* 16:28 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 13:05 XioNoX: update nat on pfw3-codfw - [[phab:T248906|T248906]]
* 16:25 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:03 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:25 mmandere@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 16:21 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 12:49 _joe_: switching all appserver canaries to envoy
* 16:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:46 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:10 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2253.codfw.wmnet
* 12:45 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 16:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:45 marostegui: Deploy schema change on db1091
* 16:08 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:734298{{!}}Empty wikibase disabled access entity types on Beta (T294159)]] (beta-only) (duration: 01m 47s)
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1091 for schema change', diff saved to https://phabricator.wikimedia.org/P10833 and previous config saved to /var/cache/conftool/dbconfig/20200331-124452-marostegui.json
* 16:04 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:34 _joe_: transitioning mw1261 to envoy
* 16:01 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 12:23 vgutierrez: rolling upgrade of ATS to version 8.0.6-1wm5 - [[phab:T248938|T248938]]
* 15:57 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 52s)
* 11:30 Lucas_WMDE: EU SWAT done
* 15:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:30 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:584874{{!}}Disable TwoColConflict talk page workflow (T230231)]], take II (duration: 00m 57s)
* 15:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:29 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:584874{{!}}Disable TwoColConflict talk page workflow (T230231)]] (duration: 00m 58s)
* 15:49 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 54s)
* 11:11 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:584574{{!}}Enable ContentTranslation in Lithuanian Wikipedia as a default tool (T248179)]], take II (duration: 00m 59s)
* 15:46 jbond: upgrade cas/idp to 6.4.2
* 11:10 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:584574{{!}}Enable ContentTranslation in Lithuanian Wikipedia as a default tool (T248179)]] (duration: 01m 00s)
* 14:56 mutante: mw2253 - shut down and downtimed for 2 days
* 10:46 _joe_: disabled puppet on canary appservers, potentially dangerous change ahead
* 14:50 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 10:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084 after schema change', diff saved to https://phabricator.wikimedia.org/P10831 and previous config saved to /var/cache/conftool/dbconfig/20200331-101953-marostegui.json
* 14:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 10:03 XioNoX: add BGP to AS41327 in AMS-IX
* 14:49 mutante: depooling mw2253 for DRAC upgrade ([[phab:T283582|T283582]])
* 09:49 XioNoX: push homer diffs to mr1-eqsin
* 14:48 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 09:36 XioNoX: push homer diffs to mr1-eqiad
* 14:45 jbond: update cas package
* 09:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:31 marostegui: Deploy schema change on s3 codfw - [[phab:T291719|T291719]]
* 09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 12:04 ema: cp3062: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 09:10 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 11:57 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 09:09 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:05 vgutierrez: upload trafficserver 8.0.5-1wm6 to apt.wm.o (buster) - [[phab:T248938|T248938]]
* 11:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:00 vgutierrez: depool & decommission cp2011 - [[phab:T248950|T248950]]
* 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:44 vgutierrez: pool cp2035 - [[phab:T248816|T248816]]
* 11:24 Lucas_WMDE: UTC morning backport+config window done
* 08:31 mutante: signed puppet cert for planet1002.eqiad.wmnet
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732969{{!}}Remove dispatchLagToMaxLagFactor Wikibase setting (T292604)]] (duration: 00m 54s)
* 08:29 marostegui: Depool db1084 for schema change
* 11:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 for schema change', diff saved to https://phabricator.wikimedia.org/P10829 and previous config saved to /var/cache/conftool/dbconfig/20200331-082904-marostegui.json
* 11:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732951{{!}}Remove wikibaseDispatchRedisLockManager config (T292604)]] (duration: 00m 54s)
* 08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1081 after schema change', diff saved to https://phabricator.wikimedia.org/P10828 and previous config saved to /var/cache/conftool/dbconfig/20200331-082711-marostegui.json
* 11:14 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732950{{!}}Remove wmg variables for dispatchChanges.php Wikibase settings (T292604)]] (duration: 00m 55s)
* 08:17 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:08 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732949{{!}}Remove dispatchChanges.php-related Wikibase settings (T292604)]] (duration: 00m 55s)
* 08:01 XioNoX: delete unused ROA for ARIN v4 prefixes - [[phab:T235886|T235886]]
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:49 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 11:05 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732372{{!}}Remove dispatchViaJobs-related Wikibase settings (T291828)]] (duration: 00m 56s)
* 07:49 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:52 godog: bounce uwsgi graphite web on graphite2003 - [[phab:T294220|T294220]]
* 07:17 vgutierrez: pool cp2034 - [[phab:T248816|T248816]]
* 09:52 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:16 marostegui: Deploy schema change on db1081
* 09:48 volans@cumin1001: START - Cookbook sre.dns.netbox
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081 for schema change', diff saved to https://phabricator.wikimedia.org/P10827 and previous config saved to /var/cache/conftool/dbconfig/20200331-071547-marostegui.json
* 09:43 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:733089{{!}}[BETA CLUSTER] Enable WikibaseLexeme Scribunto access (T294159)]] (merged on Friday, syncing now to avoid outdated files even if it’s just -labs.php) (duration: 00m 55s)
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3314 after schema change', diff saved to https://phabricator.wikimedia.org/P10826 and previous config saved to /var/cache/conftool/dbconfig/20200331-071401-marostegui.json
* 09:18 godog: bounce graphite-web on graphite2003 to test timeout bump - [[phab:T294220|T294220]]
* 06:48 marostegui: Deploy schema change on db1103:3314
* 08:08 XioNoX: merge DNS changes to add drmrs
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3314 for schema change', diff saved to https://phabricator.wikimedia.org/P10825 and previous config saved to /var/cache/conftool/dbconfig/20200331-064707-marostegui.json
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3314 after schema change', diff saved to https://phabricator.wikimedia.org/P10824 and previous config saved to /var/cache/conftool/dbconfig/20200331-064627-marostegui.json
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 05:55 marostegui: Drop nova and nova_api from m5 master (db1133) - [[phab:T248313|T248313]]
* 05:47 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,name=wtp1026.*
* 05:55 kart_: Updated cxserver to 2020-03-30-145349-production ([[phab:T248578|T248578]])
* 05:43 _joe_: pooling wtp1042 [[phab:T294212|T294212]]
* 05:55 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 05:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1109.eqiad.wmnet with OS buster
* 05:54 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 05:01 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1109.eqiad.wmnet with OS buster
* 05:53 vgutierrez: depool && decommission cp2007 - [[phab:T248941|T248941]]
* 04:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 (s8) for reimage [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17590 and previous config saved to /var/cache/conftool/dbconfig/20211025-043028-marostegui.json
* 05:48 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 05:46 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 05:46 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 05:46 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 05:26 marostegui: Deploy schema change on db1097:3314
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314 for schema change', diff saved to https://phabricator.wikimedia.org/P10822 and previous config saved to /var/cache/conftool/dbconfig/20200331-051354-marostegui.json
* 00:26 eileen: civicrm revision changed from {{Gerrit|cf2e2c11c3}} to {{Gerrit|524b162174}}, config revision is {{Gerrit|708198a154}}


== 2020-03-30 ==
== 2021-10-23 ==
* 23:30 cdanis: cr3-esams: commit flex-flow-sizing [[phab:T248394|T248394]]
* 16:40 dcausse: restarting blazegraph on wdqs1004 and wdqs1006 (free allocators alert)
* 23:20 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 00m 58s)
* 15:45 urbanecm: Start server-side upload for 1 video file ([[phab:T289781|T289781]]), testing whether [[phab:T291137|T291137]] is still an issue
* 23:19 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Alphabetize wikis in each GrowthExperiments settings (duration: 00m 58s)
* 23:16 cdanis: cr2-esams: commit flex-flow-sizing [[phab:T248394|T248394]]
* 23:08 cdanis: cdanis@cr3-knams# commit comment "sensible flow table sizes [[phab:T248394|T248394]]"
* 22:56 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 00m 58s)
* 22:53 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Provide wmgSiteLogoIcon (duration: 00m 57s)
* 22:52 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Set wmgSiteLogoIcon for each project family and four special wikis (duration: 00m 58s)
* 22:50 jforrester@deploy1001: Synchronized wmf-config/mobile.php: Set wgMobileFrontendLogo from wgLogos['icon'] if set (duration: 00m 59s)
* 22:37 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 00m 57s)
* 22:36 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Split wgLogos setting into wmgSiteLogo1x etc. (duration: 00m 59s)
* 22:33 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Construct wgLogos in CommonSettings so that projects can inherit values (duration: 01m 02s)
* 19:55 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:55 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 15:36 ejegg: updated payments listener (standalone SmashPig) from {{Gerrit|dc0c6b208b}} to {{Gerrit|d80e4c5abd}}
* 15:32 vgutierrez: pool cp2033 - [[phab:T248816|T248816]]
* 15:25 jeh: add icinga 2h downtime and soft reset iDRAC on labstore1005.mgmt.eqiad.wmnet [[phab:T247965|T247965]]
* 14:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:57 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 14:57 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 14:55 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 14:53 vgutierrez: depool & decommission cp2008 - [[phab:T248864|T248864]]
* 14:23 vgutierrez: pool cp2032 - [[phab:T248816|T248816]]
* 14:17 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 14:17 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 14:01 vgutierrez: depool & decommission cp2006 - [[phab:T248856|T248856]]
* 13:57 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:45 vgutierrez: pool cp2031 - [[phab:T248816|T248816]]
* 13:09 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:07 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 13:07 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 13:06 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 12:56 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 12:56 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 12:53 vgutierrez: depool & decommission cp2005 - [[phab:T248848|T248848]]
* 12:26 cdanis: cdanis@re0.cr2-codfw# set chassis fpc 5 inline-services flex-flow-sizing    cdanis@re0.cr2-codfw# commit comment "flex-flow-sizing [[phab:T248394|T248394]]"
* 12:24 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 12:23 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 12:21 vgutierrez: depool & decommission cp2004 - [[phab:T248824|T248824]]
* 12:03 XioNoX: delete unused ROA for ARIN v6 prefixes - [[phab:T235886|T235886]]
* 11:59 XioNoX: delete unused ROAs for RIPE prefixes - [[phab:T235886|T235886]]
* 11:42 mutante: miscweb2002 - race condition with apache2 mpm and php7.3 module met - a2dismod mpm_event ; systemctl restart apache2 ; puppet agent -tv (also see [[phab:T196968|T196968]], https://gerrit.wikimedia.org/r/c/operations/puppet/+/451206) [[phab:T247887|T247887]]
* 11:37 mutante: miscweb2002 - installed OS, added to puppet, added role and  ... sed -i 's/tin.eqiad/deployment.eqiad/g' /srv/deployment/iegreview/iegreview-cache/.config ([[phab:T247648|T247648]])
* 11:30 marostegui: Deploy schema change on dbstore1004:3314
* 11:22 XioNoX: delete ARIN allocations from RIPE's IRR - [[phab:T235886|T235886]]
* 11:11 Urbanecm: EU SWAT done
* 11:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ac7e625}}: Add collections.nmnh.si.edu to $wgCopyUploadsDomains ([[phab:T248659|T248659]]; take II) (duration: 00m 58s)
* 11:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ac7e625}}: Add collections.nmnh.si.edu to $wgCopyUploadsDomains ([[phab:T248659|T248659]]) (duration: 00m 58s)
* 11:08 vgutierrez: pool cp2030 - [[phab:T248816|T248816]]
* 11:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|c8c06f9}}: Add 3 additional namespaces and associated talk pages to trwiktionary ([[phab:T248734|T248734]]; take II) (duration: 00m 59s)
* 11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|c8c06f9}}: Add 3 additional namespaces and associated talk pages to trwiktionary ([[phab:T248734|T248734]]) (duration: 00m 59s)
* 10:43 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 10:34 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 10:33 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 10:33 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 09:59 hoo: Temporarily modified dumpsgen's crontab on snapshot1008 so that the Wikidata JSON dumps start at 9:59 UTC today ([[phab:T248612|T248612]])
* 09:56 hoo@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/Wikibase/repo/maintenance/DumpEntities.php: DumpEntities: Fix DB group default override ([[phab:T248612|T248612]]) (duration: 01m 02s)
* 09:19 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:15 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:30 vgutierrez: pool cp2029 - [[phab:T248816|T248816]]
* 08:12 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:12 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 08:12 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 08:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 07:53 vgutierrez: depool & decommission cp2002 - [[phab:T248818|T248818]]
* 07:48 marostegui: Run cloudcontrol1003:~# wmcs-wikireplica-dns to promote dbproxy1018 to wikireplicas active proxy [[phab:T231520|T231520]]
* 07:40 marostegui: Replace dbproxy1010 with dbproxy1011 for wiki replicas, analytics - [[phab:T231520|T231520]]
* 07:28 marostegui: Deploy schema change on labswiki (wikitech) - [[phab:T248333|T248333]]
* 07:26 marostegui: Deploy schema change on s4 codfw, this will generate lag on codfw - [[phab:T248333|T248333]]
* 07:17 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 07:17 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
* 07:10 vgutierrez: depool and decommission cp2001 - [[phab:T248815|T248815]]
* 06:52 vgutierrez: pool cp2028 - [[phab:T247340|T247340]]
* 06:29 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1074 after schema change', diff saved to https://phabricator.wikimedia.org/P10813 and previous config saved to /var/cache/conftool/dbconfig/20200330-062858-marostegui.json
* 06:26 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 06:04 marostegui: Deploy schema change on db1074 with replication, this will generate lag on s2 labs
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 for schema change', diff saved to https://phabricator.wikimedia.org/P10812 and previous config saved to /var/cache/conftool/dbconfig/20200330-060338-marostegui.json
* 05:40 vgutierrez: pool cp2027 - [[phab:T247340|T247340]]
* 05:13 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 04:55 vgutierrez: Enable TLS Session tickets in ulsfo - [[phab:T245616|T245616]]
* 04:32 vgutierrez: upgrade ATS to version 8.0.6-1wm4 on ulsfo - [[phab:T245616|T245616]]


== 2020-03-29 ==
== 2021-10-22 ==
* 08:24 elukey: powercycle elastic1059 - mgmt/serial console stuck, no ssh - racadm getsel shows a lot of OEM errors occurred, nothing specific
* 23:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:57 bblack: re-pooling eqiad in DNS
* 20:54 legoktm: <XioNoX> I disabled the interface on cr1, going to re-enable the active one on cr2
* 20:48 legoktm: bblack has temporarily depooled eqiad https://gerrit.wikimedia.org/r/733043
* 20:41 XioNoX: disable sessions to equinix eqiad IXP
* 19:17 urbanecm: Start server-side upload of 1 video file ([[phab:T294134|T294134]])
* 15:06 jbond: upload puppetboard_3.1.0-1_all.deb to bullseye-wikimedia
* 13:42 ema: deployment-cache-upload06: restart varnish-frontend, package got upgraded to 6.0.8 [[phab:T294116|T294116]]
* 13:30 jbond: upload python3-pypuppetdb_2.4.0-1_all.deb to bullseye
* 10:46 jbond: upload cas_6.4.2-1+wmf10u1
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 10:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 09:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294029|T294029]]
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2025.codfw.wmnet with OS buster
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 08:27 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:23 ema: cp3062: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 08:00 ema: deployment-cache-text06: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17580 and previous config saved to /var/cache/conftool/dbconfig/20211022-055403-root.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17579 and previous config saved to /var/cache/conftool/dbconfig/20211022-053900-root.json
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17578 and previous config saved to /var/cache/conftool/dbconfig/20211022-052356-root.json
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17577 and previous config saved to /var/cache/conftool/dbconfig/20211022-050852-root.json
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17576 and previous config saved to /var/cache/conftool/dbconfig/20211022-045349-root.json
* 04:46 marostegui_: Deploy schema change on s8 codfw - [[phab:T291719|T291719]]
* 04:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17575 and previous config saved to /var/cache/conftool/dbconfig/20211022-043845-root.json
* 02:59 ejegg: updated payments-wiki from {{Gerrit|088a8cda1e}} to {{Gerrit|6e810fb401}}


== 2020-03-28 ==
== 2021-10-21 ==
* 16:54 elukey: restart yarn on analytics1071
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:05 vgutierrez: preemptive restart of ats-tls on cp1081 and cp3062 - [[phab:T248736|T248736]]
* 23:38 jforrester@deploy1002: Synchronized w/fatal-error.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 11:32 vgutierrez: restart ats-tls on cp1077 - [[phab:T248736|T248736]]
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:34 vgutierrez: pool cp1089
* 23:37 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 08:30 vgutierrez: restarting ats-tls on cp1089
* 23:36 jforrester@deploy1002: Synchronized multiversion/: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 23:34 jforrester@deploy1002: Synchronized docroot/noc/conf/index.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:33 jforrester@deploy1002: Synchronized wmf-config: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 23:32 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:25 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:730946{{!}}CommonSettings: Drop legacy CentralAuth config flag, never read (T277932)]] (duration: 00m 55s)
* 23:18 thcipriani@deploy1002: Synchronized tests/multiversion/StaticSettingsTest.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 23:15 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:42 mutante: [[phab:T294038|T294038]] [krb1001:~] $ sudo manage_principals.py create effeietsanders ... Principal successfully created.  . .Successfully sent email
* 21:44 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 47s)
* 21:41 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 20:54 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 13s)
* 20:53 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 20:53 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 35s)
* 20:52 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:42 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Update $wgTimelineFonts for new path to unifont in Shellbox container ([[phab:T293050|T293050]]) (duration: 00m 55s)
* 19:38 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 19:35 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 19:31 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 19:23 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 19:10 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs (duration: 00m 23s)
* 19:09 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs
* 19:07 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: (no justification provided) (duration: 00m 08s)
* 19:07 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: (no justification provided)
* 18:53 urbanecm: Deploy security patch for [[phab:T285116|T285116]] (wmf.4, wmf.5)
* 18:53 mutante: dumpsdata1003 - sudo systemctl reset-failed to clear Icinga alert about failed cleanup_tmpdumps.service
* 17:55 mutante: that's a key for https://www.worldcat.org/whatis/default.jsp btw for those wondering
* 17:53 mutante: citoid - replaced "wskey" for worldcat in private repo as requested on [[phab:T294010|T294010]] (is in 4 places, 3 for deployment_server/k8s and one remnant for scb)
* 17:53 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 17:52 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 17:50 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 16:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 16:12 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:07 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 16:06 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (2/3) (duration: 00m 54s)
* 16:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:04 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 16:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 15:59 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (2/3) (duration: 00m 55s)
* 15:58 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (1/3) (duration: 00m 57s)
* 15:43 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 robh@cumin1001: START - Cookbook sre.dns.netbox
* 15:14 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:13 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 54s)
* 15:12 Lucas_WMDE: my next message accidentally says 1/3 again but it’s 2/3, sorry
* 15:11 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 15:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:56 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 14:42 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/Wikibase.default.php: Backport: [[gerrit:732666{{!}}Enable dispatching via jobs by default (T291828)]] (duration: 00m 55s)
* 14:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:39 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client/: Backport: [[gerrit:732674{{!}}Fix ExternalUserNames service wiring for local database]] (duration: 00m 57s)
* 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:56 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:55 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:34 volans: uploaded spicerack_1.0.6 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:04 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 12:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 12:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 12:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 12:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 12:48 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 12:48 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 12:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 12:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 11:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:13 Lucas_WMDE: UTC morning backport+config window done
* 11:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294008|T294008]]
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 jgiannelos@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730848{{!}}Configure event stream for map tiles state change (T289771)]] (duration: 01m 04s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:48 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:47 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:14 jbond: merging refactor of P:base Gerrit:714975
* 09:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:49 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:56 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 01m 03s)
* 08:33 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:26 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:25 ema: cp3062: revert vsl_space experiment [[phab:T293879|T293879]]
* 08:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1004.eqiad.wmnet with OS bullseye
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17563 and previous config saved to /var/cache/conftool/dbconfig/20211021-080330-root.json
* 07:56 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite1004.eqiad.wmnet with OS bullseye
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17562 and previous config saved to /var/cache/conftool/dbconfig/20211021-074826-root.json
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17561 and previous config saved to /var/cache/conftool/dbconfig/20211021-073323-root.json
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17560 and previous config saved to /var/cache/conftool/dbconfig/20211021-071819-root.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17559 and previous config saved to /var/cache/conftool/dbconfig/20211021-070315-root.json
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17558 and previous config saved to /var/cache/conftool/dbconfig/20211021-064812-root.json
* 06:35 elukey: `systemctl reload nginx` on cloudelastic100[5,6] to pick up the new TLS certificate and clear alerts - [[phab:T293826|T293826]]
* 04:47 marostegui: Deploy schema change on s5 codfw - [[phab:T291719|T291719]]
* 04:37 marostegui: Deploy schema change on s6 codfw - [[phab:T291719|T291719]]
* 04:04 legoktm: restarted apache on lists1001 so it only uses new TLS cert ([[phab:T293826|T293826]])
* 03:29 eileen: civicrm revision changed from {{Gerrit|e889831012}} to {{Gerrit|733a8fceda}}, config revision is {{Gerrit|eed79486d5}}
* 00:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2020-03-27 ==
== 2021-10-20 ==
* 20:51 ejegg: updated payments-wiki from {{Gerrit|db618f429d}} to {{Gerrit|1640f5e21e}}
* 23:56 thcipriani@deploy1002: Finished scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]] (duration: 11m 41s)
* 15:15 andrew@deploy1001: Finished deploy [horizon/deploy@33e67f9]: fix Identity->Projects with keystone Queens (duration: 03m 35s)
* 23:44 thcipriani@deploy1002: Started scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]]
* 15:12 andrew@deploy1001: Started deploy [horizon/deploy@33e67f9]: fix Identity->Projects with keystone Queens
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1129 after schema change', diff saved to https://phabricator.wikimedia.org/P10807 and previous config saved to /var/cache/conftool/dbconfig/20200327-144125-marostegui.json
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129 for schema change', diff saved to https://phabricator.wikimedia.org/P10806 and previous config saved to /var/cache/conftool/dbconfig/20200327-142240-marostegui.json
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:19 moritzm: updating linux-image-4.9.0-11-amd64 where applicable
* 23:29 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: fawiki require login for creation of pages in the draft namespace [[phab:T291018|T291018]] (duration: 01m 02s)
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1076 after schema change', diff saved to https://phabricator.wikimedia.org/P10805 and previous config saved to /var/cache/conftool/dbconfig/20200327-133022-marostegui.json
* 23:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076 for schema change', diff saved to https://phabricator.wikimedia.org/P10804 and previous config saved to /var/cache/conftool/dbconfig/20200327-130706-marostegui.json
* 23:27 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: fawiki require login to edit main namespace [[phab:T291018|T291018]] (duration: 01m 04s)
* 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after schema change', diff saved to https://phabricator.wikimedia.org/P10803 and previous config saved to /var/cache/conftool/dbconfig/20200327-130542-marostegui.json
* 22:13 dancy@deploy1002: Synchronized README: testing (4/4) (duration: 02m 52s)
* 12:49 Amir1: ladsgroup@mwmaint1002:~$ mwscript createAndPromote.php --wiki=labswiki --force "Ladsgroup" --interface-admin
* 22:00 dancy@deploy1002: Synchronized README: testing (3/4) (duration: 02m 57s)
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 for schema change', diff saved to https://phabricator.wikimedia.org/P10802 and previous config saved to /var/cache/conftool/dbconfig/20200327-122144-marostegui.json
* 21:54 dancy@deploy1002: Synchronized README: testing (2) (duration: 01m 02s)
* 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3312 after schema change', diff saved to https://phabricator.wikimedia.org/P10801 and previous config saved to /var/cache/conftool/dbconfig/20200327-122058-marostegui.json
* 21:52 dancy@deploy1002: Synchronized README: (no justification provided) (duration: 01m 03s)
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312 for schema change', diff saved to https://phabricator.wikimedia.org/P10800 and previous config saved to /var/cache/conftool/dbconfig/20200327-120234-marostegui.json
* 21:50 dancy: Testing a series of one-file scap sync-file runs
* 11:54 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: dc=codfw,cluster=restbase,service=restbase-backend,name=restbase202[123].codfw.wmnet
* 21:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:51 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: dc=codfw,cluster=restbase,service=restbase-ssl,name=restbase202[123].codfw.wmnet
* 21:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:45 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: dc=codfw,cluster=restbase,service=restbase,name=restbase2023.codfw.wmnet
* 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: dc=codfw,cluster=restbase,service=restbase,name=restbase2022.codfw.wmnet
* 21:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b9cf996a38d82fdd67e600a5a951e88423957e8d}}: Promote Growth features out of darkmode on several wikis  ([[phab:T291826|T291826]], [[phab:T255037|T255037]], [[phab:T287878|T287878]]) (duration: 01m 04s)
* 11:44 oblivian@puppetmaster1001: conftool action : edit; selector: dc=codfw,cluster=restbase,service=restbase-ssl,name=restbase202[1].codfw.wmnet
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: dc=codfw,cluster=restbase,service=restbase,name=restbase2021.codfw.wmnet
* 20:38 eileen: civicrm revision changed from {{Gerrit|9b5e0d015b}} to {{Gerrit|e889831012}}, config revision is {{Gerrit|eed79486d5}}
* 10:55 mutante: revoke puppet cert webserver-misc-apps.discovery.wmnet and recreate with additional SANs for new VMs
* 20:25 legoktm: uploaded php7.4 on buster to apt.wm.o ([[phab:T293449|T293449]])
* 10:45 mutante: miscweb1002 - upload and unpack RackTables-0.21.4 ([[phab:T247646|T247646]] [[phab:T247648|T247648]])
* 19:24 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluster config from old and new locations (duration: 00m 46s)
* 10:28 marostegui: Alter db2125 s2 to set page_restrictions to default NULL - [[phab:T248333|T248333]]
* 19:24 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluster config from old and new locations
* 10:12 mutante: miscweb1002 - sed -i 's/tin.eqiad/deployment.eqiad/g' /srv/deployment/iegreview/iegreview-cache/.config  [[phab:T247648|T247648]]
* 19:09 mutante: disabling puppet on mw* for a minute to deploy a change
* 10:04 vgutierrez: upload trafficserver 8.0.6-1wm4 to apt.wm.o (buster) - [[phab:T245616|T245616]] [[phab:T170567|T170567]]
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 10:03 mutante: sodium - find /srv/mirrors/debian/ -user root -exec chown -h mirror:mirror <nowiki>{</nowiki><nowiki>}</nowiki> \;  (-h to also fix symbolic links); sudo -u mirror ftpsync ([[phab:T248660|T248660]])
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 10:02 marostegui: Alter db2084:3315 enwikivoyage.page to set page_restrictions to default NULL - [[phab:T248333|T248333]]
* 18:31 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 10:01 marostegui: Alter db1096:3315 enwikivoyage.page to set page_restrictions to default NULL - [[phab:T248333|T248333]]
* 18:30 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 09:37 mutante: sodium - running ftpsync as user mirror ([[phab:T248660|T248660]])
* 18:24 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 09:36 mutante: sodium fixing root owned files in /srv/mirrors/debian to be owned by mirror:mirror ([[phab:T248660|T248660]])
* 17:28 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org -  [[phab:T293810|T293810]]
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P10799 and previous config saved to /var/cache/conftool/dbconfig/20200327-093214-marostegui.json
* 17:27 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P10798 and previous config saved to /var/cache/conftool/dbconfig/20200327-093106-marostegui.json
* 17:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:58 marostegui: Deploy schema change on s2 codfw - this will generate lag on s2 codfw - [[phab:T248333|T248333]]
* 17:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:36 elukey: execute 'rm /etc/logrotate.d/ceph-common' on cloudvirt[1,2]* and cloudcontrol* to stop daily cronspam (file not in the puppet catalog anymore)
* 17:01 razzi@deploy1002: Finished deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f] (duration: 23m 42s)
* 07:32 moritzm: installing grub2 updates from Stretch point release
* 17:00 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client: Update deprecated calls to ParserOutput in ShortDescHandler - [[phab:T293860|T293860]] (duration: 01m 03s)
* 07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P10796 and previous config saved to /var/cache/conftool/dbconfig/20200327-072334-marostegui.json
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 for schema change', diff saved to https://phabricator.wikimedia.org/P10795 and previous config saved to /var/cache/conftool/dbconfig/20200327-070224-marostegui.json
* 16:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1082 after schema change', diff saved to https://phabricator.wikimedia.org/P10794 and previous config saved to /var/cache/conftool/dbconfig/20200327-070014-marostegui.json
* 16:53 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/LiquidThreads/pages/LqtDiscussionPager.php: Remove deprecated usage of setProperty - [[phab:T293895|T293895]] (duration: 01m 03s)
* 06:31 marostegui: Deploy schema change on db1082, this will generate lag on s5 labs
* 16:49 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GeoCrumbs: Replace use of deprecated ParserOutput:getProperty() - [[phab:T293894|T293894]] (duration: 01m 09s)
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082 for schema change', diff saved to https://phabricator.wikimedia.org/P10793 and previous config saved to /var/cache/conftool/dbconfig/20200327-063042-marostegui.json
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:37 razzi@deploy1002: Started deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f]
* 16:36 razzi: deploy refinery change for [[phab:T287084|T287084]]
* 16:13 jbond: upload cas_6.4.2-1_amd64.deb
* 15:42 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:39 volans@cumin2002: START - Cookbook sre.dns.netbox
* 14:57 moritzm: installing modsecurity-crs security updates on Buster
* 14:48 moritzm: installing xmlgraphics-commons security updates on Buster
* 14:46 moritzm: installing irssi security updates on Buster
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:35 moritzm: installing commons-io security updates on Buster
* 14:27 ema: cp3062: test higher vsl_space values [[phab:T293879|T293879]]
* 14:27 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:12 moritzm: installing ruby2.3 security updates
* 13:40 moritzm: installing apache2 security updates on buster
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:21 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 01m 02s)
* 13:20 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=ats-tls
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=varnish-fe
* 12:51 ema: cp3062: bump vsl_space from 80M (default) to 512M [[phab:T293879|T293879]] - varnish restart needed
* 12:37 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 12:36 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:02 urbanecm@deploy1002: Finished scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]]) (duration: 25m 19s)
* 11:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2007.codfw.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2007.codfw.wmnet
* 11:37 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:37 urbanecm@deploy1002: Started scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]])
* 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2005.codfw.wmnet
* 11:21 moritzm: installing ffmpeg security updates
* 11:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e520fc57411bb19123766192cd636396ea6fc59d}}: GrowthExperiments: Add campaign pattern for enwiki ([[phab:T293699|T293699]]) (duration: 01m 22s)
* 11:11 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2005.codfw.wmnet
* 10:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 10:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 09:04 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS buster
* 07:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:49 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS buster
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 (s1) for reimage [[phab:T290865|T290865]]', diff saved to https://phabricator.wikimedia.org/P17552 and previous config saved to /var/cache/conftool/dbconfig/20211020-064529-marostegui.json
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS buster
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 (s1) after upgrade', diff saved to https://phabricator.wikimedia.org/P17551 and previous config saved to /var/cache/conftool/dbconfig/20211020-063926-marostegui.json
* 06:35 marostegui: Upgrade db1106
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17550 and previous config saved to /var/cache/conftool/dbconfig/20211020-063431-marostegui.json
* 06:31 dcausse: restarting blazegraph on wdqs1012
* 06:28 elukey: reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage
* 06:21 marostegui: Depool clouddb1013 for upgrade
* 06:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS buster
* 06:12 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17549 and previous config saved to /var/cache/conftool/dbconfig/20211020-061202-marostegui.json
* 06:06 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:05 XioNoX: put transport link between ulsfo and eqsin in service - [[phab:T273308|T273308]]
* 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS buster
* 05:26 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 04:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:40 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable $wgLocalHTTPProxy on group0 wikis ([[phab:T288848|T288848]]) (duration: 01m 05s)
* 01:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 tgr: west coast evening deploys done


== 2020-03-26 ==
== 2021-10-19 ==
* 23:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ce63a4e}}: Enable wmgUseFooterContactLink for cswiki ([[phab:T248584|T248584]]; take II) (duration: 00m 57s)
* 23:59 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732103{{!}}Reorder some wikis at wgExtraNamespaces and wmgVisualEditorAvailableNamespaces (T293846)]] (duration: 01m 02s)
* 23:05 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ce63a4e}}: Enable wmgUseFooterContactLink for cswiki ([[phab:T248584|T248584]]) (duration: 00m 58s)
* 23:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:51 krinkle@deploy1001: Synchronized php-1.35.0-wmf.25/includes/user/UserRightsProxy.php: {{Gerrit|I9121f5aae}} (4/4) (duration: 00m 58s)
* 23:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:50 krinkle@deploy1001: Synchronized php-1.35.0-wmf.25/includes/search/SearchMySQL.php: {{Gerrit|I9121f5aae}} (3/4) (duration: 00m 58s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732053{{!}}ruwikiversity: Add 'portal' and 'faculty' namespaces (T293545)]] (duration: 01m 03s)
* 22:48 krinkle@deploy1001: Synchronized php-1.35.0-wmf.25/includes/objectcache/SqlBagOStuff.php: {{Gerrit|I9121f5aae}} (2/4) (duration: 00m 58s)
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:44 krinkle@deploy1001: Synchronized php-1.35.0-wmf.25/includes/jobqueue/jobs/RecentChangesUpdateJob.php: {{Gerrit|I9121f5aae}} (1/4) (duration: 01m 00s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:05 ejegg: updated fundraising CiviCRM from {{Gerrit|f1cb23e809}} to {{Gerrit|cf2e2c11c3}}
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:710565{{!}}Set the project namespace and sitename for Javanese Wikipedia and Wiktionary (T287437)]] (duration: 01m 02s)
* 21:43 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/MachineVision: Fix: Stop sorting label suggestions by Wikidata ID in ApiQueryImageLabels (duration: 01m 00s)
* 23:23 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731953{{!}}Create Portal and Portal talk namespace for shiwiki (T288909)]] (duration: 01m 03s)
* 21:34 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:32 cdanis: cdanis@re0.cr1-eqsin# set chassis afeb slot 0 inline-services flex-flow-sizing    cdanis@re0.cr1-eqsin# commit comment "flex-flow-sizing [[phab:T248394|T248394]]"
* 23:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:31 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 23:13 tgr@deploy1002: Synchronized static: Config: [[gerrit:731231{{!}}Repair the size of the logo of Kashmiri Wikipedia (T293342)]] (duration: 02m 14s)
* 21:30 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:34 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete {{!}} fixed Icinga alert:  RECOVERY - Check for large files in client bucket on mwmaint1002 is OK: OK: [[phab:T165885|T165885]]
* 21:27 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 21:32 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete
* 21:27 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@f34260c]: Update mobileapps to {{Gerrit|3f30f20c}} (duration: 03m 07s)
* 20:56 ejegg: updated payments-wiki from {{Gerrit|0f48acea49}} to {{Gerrit|30e596903d}}
* 21:24 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@f34260c]: Update mobileapps to {{Gerrit|3f30f20c}}
* 19:03 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 21:15 cdanis: repool ulsfo
* 18:46 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/MediaSearch/: {{Gerrit|a84a675}}: {{Gerrit|3231578}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 21:12 cdanis: applied flow-table-size configuration to cr4-ulsfo which did not need a reboot to apply it [[phab:T248394|T248394]]
* 18:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/: {{Gerrit|694580a}}: {{Gerrit|c02e301}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 20:51 cdanis: cdanis@cr3-ulsfo> request system reboot
* 18:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 20:36 cdanis: depool ulsfo
* 18:30 foks: deleting 1 more email with deleteUserEmail.php
* 16:52 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1476a2d93}}: {{Gerrit|dd8393c1a0}}: foundationwiki: Restrict sensitive namespaces to editor group ([[phab:T205350|T205350]]) (duration: 01m 03s)
* 16:50 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 18:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:43 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:12 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9a2893c7190e615a247674dbf7f87348bf43b91c}}: Enable topic subscriptions as a beta feature on all remaining projects ([[phab:T287802|T287802]]) (duration: 01m 04s)
* 16:40 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 18:00 legoktm@deploy1002: Synchronized wmf-config/: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (2/2) (duration: 01m 06s)
* 16:34 XioNoX: stop exchanging full BGP view between eqiad and codfw - [[phab:T246721|T246721]]
* 17:59 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (1/2) (duration: 01m 05s)
* 16:19 pt1979@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:57 foks: removing six email addresses on request (with deleteUserEmail.php)
* 16:18 XioNoX: stop advertising 208.80.152.0/22 from eqiad - [[phab:T246721|T246721]]
* 17:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 16:15 mutante: signing puppet cert for miscweb1002, installed buster, added insetup role ([[phab:T247887|T247887]])
* 17:25 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:15 ebernhardson: set cloudelastic-chi wikidatawiki_content to 0 replicas while reindexing
* 17:11 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 16:14 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 17:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:14 moritzm: rebooting mw2150 for some tests
* 16:48 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:12 XioNoX: stop advertising 2620:0:860::/46 from eqiad - [[phab:T246721|T246721]]
* 16:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:12 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:11 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 16:11 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 16:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 16:10 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 16:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 15:58 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 15:53 volans@cumin1001: START - Cookbook sre.dns.netbox
* 16:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 15:51 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 15:51 moritzm: installing grub2 updates from Stretch point release
* 16:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 15:49 XioNoX: start advertising 208.80.154.0/23 from eqiad - [[phab:T246721|T246721]]
* 16:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 15:49 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 15:48 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 15:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 15:46 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 15:40 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - remove now redundant stream setting - [[phab:T277193|T277193]] (duration: 01m 04s)
* 15:40 XioNoX: start advertising 2620:0:861::/48 from eqiad - [[phab:T246721|T246721]]
* 15:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 15:20 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 15:17 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:15 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:12 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 15:32 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:10 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:32 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:30 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 15:28 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 15:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 15:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 14:34 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:29 jbond: disable puppet on lvs, cp, authdns, mc, mw-be and wcqs while I merge G:662699
* 15:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:01 mutante: [[phab:T247887|T247887]] - create Ganeti VM miscweb1002.eqiad.wmnet in the ganeti01.svc.eqiad.wmnet cluster on row C with 1 vCPUs, 2GB of RAM, 20GB of disk in the private network.
* 14:11 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 45m 13s)
* 15:01 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:52 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 15:01 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 13:45 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:59 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 13:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:59 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:59 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 13:26 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 14:50 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17547 and previous config saved to /var/cache/conftool/dbconfig/20211019-131927-root.json
* 14:47 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17546 and previous config saved to /var/cache/conftool/dbconfig/20211019-131651-root.json
* 14:26 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17545 and previous config saved to /var/cache/conftool/dbconfig/20211019-130424-root.json
* 14:23 pt1979@cumin2001: START - Cookbook sre.hosts.downtime
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17544 and previous config saved to /var/cache/conftool/dbconfig/20211019-130147-root.json
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P10787 and previous config saved to /var/cache/conftool/dbconfig/20200326-135625-marostegui.json
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17543 and previous config saved to /var/cache/conftool/dbconfig/20211019-124920-root.json
* 13:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 for schema change', diff saved to https://phabricator.wikimedia.org/P10786 and previous config saved to /var/cache/conftool/dbconfig/20200326-132940-marostegui.json
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17542 and previous config saved to /var/cache/conftool/dbconfig/20211019-124644-root.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P10785 and previous config saved to /var/cache/conftool/dbconfig/20200326-130122-marostegui.json
* 12:40 moritzm: installing aftpd security updates
* 12:57 oblivian@deploy1001: Synchronized wmf-config/ProductionServices.php: eventgate-main to use envoy [[phab:T244843|T244843]] (duration: 01m 07s)
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17541 and previous config saved to /var/cache/conftool/dbconfig/20211019-123416-root.json
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315 for schema change', diff saved to https://phabricator.wikimedia.org/P10784 and previous config saved to /var/cache/conftool/dbconfig/20200326-123302-marostegui.json
* 12:34 marostegui: Upgrade dbstore1003
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P10783 and previous config saved to /var/cache/conftool/dbconfig/20200326-123157-marostegui.json
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17540 and previous config saved to /var/cache/conftool/dbconfig/20211019-123140-root.json
* 12:25 mutante: analytics1028 - performing a puppet change on every run (all other hosts doing this were fixed just recently)
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17539 and previous config saved to /var/cache/conftool/dbconfig/20211019-121913-root.json
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315 for schema change', diff saved to https://phabricator.wikimedia.org/P10782 and previous config saved to /var/cache/conftool/dbconfig/20200326-121859-marostegui.json
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17538 and previous config saved to /var/cache/conftool/dbconfig/20211019-121636-root.json
* 11:38 awight: EU SWAT done
* 12:12 XioNoX: push anycast tuning to all Lumen and NTT transit links - [[phab:T288843|T288843]]
* 11:37 awight@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/TwoColConflict: SWAT: [[gerrit:583576{{!}}Two hotfixes for guided tour (T248465)]] (duration: 01m 07s)
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1167 (s8) after upgrade', diff saved to https://phabricator.wikimedia.org/P17537 and previous config saved to /var/cache/conftool/dbconfig/20211019-120918-marostegui.json
* 11:25 mutante: sodium - running ftpsync to get Debian mirror in sync
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17536 and previous config saved to /var/cache/conftool/dbconfig/20211019-120458-marostegui.json
* 11:23 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T231517|T231517]]: [cirrus] force cloudelastic replica count to 1 (duration: 01m 05s)
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17535 and previous config saved to /var/cache/conftool/dbconfig/20211019-120409-root.json
* 11:21 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T231517|T231517]]: [cirrus] force cloudelastic replica count to 1 (duration: 01m 06s)
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17534 and previous config saved to /var/cache/conftool/dbconfig/20211019-120348-root.json
* 11:12 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/ContentTranslation/modules/ui/mw.cx.ui.Categories.js: SWAT: {{Gerrit|1ea6bad}}: Allow publishing to continue even with broken categories ([[phab:T248302|T248302]]) (duration: 01m 07s)
* 12:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/WikibaseMediaInfo/: {{Gerrit|ec0125770775c1a1a54c3b592d86d287fd9e3ad6}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 55s)
* 11:06 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: SWAT: {{Gerrit|d1bb0b1}}: Removed expired throttle.php entries (duration: 01m 09s)
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17533 and previous config saved to /var/cache/conftool/dbconfig/20211019-120132-root.json
* 11:00 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 12:00 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikibaseMediaInfo/: {{Gerrit|79808a90a95dd5dac2b532b87fb7ec1a490ea0f0}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 56s)
* 10:58 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17532 and previous config saved to /var/cache/conftool/dbconfig/20211019-120024-root.json
* 10:54 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:16 XioNoX: esams/knams: advertise 185.15.58.0/23 instead of 185.15.56.0/22 - [[phab:T207753|T207753]]
* 11:56 XioNoX: push anycast tuning to Tele2, Init7, DT transit links - [[phab:T288843|T288843]]
* 09:50 elukey: reboot stat1008 - gpu + drivers in a weird state after multiple tests
* 11:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:00 XioNoX: push v4 conditional advertising on cr3-knams - [[phab:T236785|T236785]]
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17531 and previous config saved to /var/cache/conftool/dbconfig/20211019-114844-root.json
* 08:44 marostegui: Deploy schema change on s5 codfw, lag will show up on codfw - [[phab:T248333|T248333]]
* 11:46 marostegui: Upgrade db1105 (s1,s2)
* 08:27 XioNoX: troubleshot v6 conditional advertisement from cr3-knams - [[phab:T236785|T236785]]
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105 (s1,s2) for upgrade', diff saved to https://phabricator.wikimedia.org/P17530 and previous config saved to /var/cache/conftool/dbconfig/20211019-114649-marostegui.json
* 07:58 XioNoX: remove BGP session to AS8001 in eqiad (down and not replying to email)
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17529 and previous config saved to /var/cache/conftool/dbconfig/20211019-114520-root.json
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1085 after schema change', diff saved to https://phabricator.wikimedia.org/P10781 and previous config saved to /var/cache/conftool/dbconfig/20200326-074033-marostegui.json
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17527 and previous config saved to /var/cache/conftool/dbconfig/20211019-113340-root.json
* 07:31 marostegui: Deploy schema change on db1085, lag will appear on s6 on labs
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17526 and previous config saved to /var/cache/conftool/dbconfig/20211019-113017-root.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085 for schema change', diff saved to https://phabricator.wikimedia.org/P10780 and previous config saved to /var/cache/conftool/dbconfig/20200326-073048-marostegui.json
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17525 and previous config saved to /var/cache/conftool/dbconfig/20211019-111837-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1093 after schema change', diff saved to https://phabricator.wikimedia.org/P10779 and previous config saved to /var/cache/conftool/dbconfig/20200326-070746-marostegui.json
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17524 and previous config saved to /var/cache/conftool/dbconfig/20211019-111513-root.json
* 06:59 marostegui: Deploy schema change on db1093
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 for schema change', diff saved to https://phabricator.wikimedia.org/P10778 and previous config saved to /var/cache/conftool/dbconfig/20200326-065929-marostegui.json
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1088 after schema change', diff saved to https://phabricator.wikimedia.org/P10777 and previous config saved to /var/cache/conftool/dbconfig/20200326-065814-marostegui.json
* 11:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7c31b04e50101a60db7ae8acae64bc031f5e1007}}: DPL: Explicitly note it is not possible to enable DPL on any more wikis (duration: 00m 55s)
* 06:48 marostegui: Deploy schema change on db1088
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17523 and previous config saved to /var/cache/conftool/dbconfig/20211019-110333-root.json
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1088 for schema change', diff saved to https://phabricator.wikimedia.org/P10776 and previous config saved to /var/cache/conftool/dbconfig/20200326-064748-marostegui.json
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P10775 and previous config saved to /var/cache/conftool/dbconfig/20200326-064648-marostegui.json
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17522 and previous config saved to /var/cache/conftool/dbconfig/20211019-110009-root.json
* 06:39 marostegui: Deploy schema change on db1098:3316
* 10:56 marostegui: Upgrade clouddb1021
* 06:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P10774 and previous config saved to /var/cache/conftool/dbconfig/20200326-063844-marostegui.json
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P10773 and previous config saved to /var/cache/conftool/dbconfig/20200326-063633-marostegui.json
* 10:51 moritzm: failover master in ganeti-test to ganeti2026
* 06:26 marostegui: Deploy schema change on db1096:3316
* 10:50 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P10772 and previous config saved to /var/cache/conftool/dbconfig/20200326-062631-marostegui.json
* 10:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS stretch
* 06:22 marostegui: Rename nova and nova_api tables on db1117:3325 - [[phab:T248313|T248313]]
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17521 and previous config saved to /var/cache/conftool/dbconfig/20211019-104829-root.json
* 00:06 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable Special:Investigate on testwiki ([[phab:T247645|T247645]]) (duration: 03m 14s)
* 10:45 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17520 and previous config saved to /var/cache/conftool/dbconfig/20211019-104506-root.json
* 10:38 oblivian@deploy1002: Synchronized w/static.php: Config: [[gerrit:730182{{!}}static.php: Add support for /static/current rewrites (take 2) (T285232)]] (duration: 00m 55s)
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 10:37 marostegui: Upgrade db1101 (s7,s8)
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101 (s7,s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17519 and previous config saved to /var/cache/conftool/dbconfig/20211019-103634-marostegui.json
* 10:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:29 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 10:28 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 10:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:22 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:22 godog: flip mw statsd traffic with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/731918 - [[phab:T247963|T247963]]
* 10:21 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS stretch
* 10:16 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:52 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:44 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.3 (duration: 01m 39s)
* 09:42 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.2 (duration: 16m 06s)
* 09:37 godog: move graphite/statsd writes to graphite2003 - [[phab:T247963|T247963]]
* 09:34 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:27 hashar: scap clean --delete 1.38.0-wmf.2 && scap clean --delete 1.38.0-wmf.3  # [[phab:T281169|T281169]]
* 09:27 hashar: Cloned and applied security patches for 1.38.0-wmf.5 # [[phab:T281169|T281169]]
* 09:19 marostegui: Stop slave on db2112 [[phab:T290865|T290865]]
* 09:18 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:18 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:03 XioNoX: push anycast tuning to all Telia transit links - [[phab:T288843|T288843]]
* 08:50 godog: point graphite.discovery.wmnet to graphite2003 - [[phab:T247963|T247963]]
* 08:40 XioNoX: push prep-work for anycast tuning to all sites - [[phab:T288843|T288843]]
* 08:33 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:33 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:32 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php hrwiki --fix
* 08:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift-ro
* 08:03 XioNoX: push prep-work for anycast tuning in ulsfo (try 2) - [[phab:T288843|T288843]]
* 08:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:32 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:24 ema: A:cp start rolling varnish upgrades to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17517 and previous config saved to /var/cache/conftool/dbconfig/20211019-072111-root.json
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17516 and previous config saved to /var/cache/conftool/dbconfig/20211019-071519-root.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17515 and previous config saved to /var/cache/conftool/dbconfig/20211019-070607-root.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17514 and previous config saved to /var/cache/conftool/dbconfig/20211019-070016-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17513 and previous config saved to /var/cache/conftool/dbconfig/20211019-065104-root.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17512 and previous config saved to /var/cache/conftool/dbconfig/20211019-064512-root.json
* 06:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17511 and previous config saved to /var/cache/conftool/dbconfig/20211019-063559-root.json
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17510 and previous config saved to /var/cache/conftool/dbconfig/20211019-063008-root.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17509 and previous config saved to /var/cache/conftool/dbconfig/20211019-062054-root.json
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17508 and previous config saved to /var/cache/conftool/dbconfig/20211019-061505-root.json
* 06:06 marostegui: Upgrade dbstore1005
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17507 and previous config saved to /var/cache/conftool/dbconfig/20211019-060551-root.json
* 06:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 06:03 marostegui: Upgrade db1184, db1178
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1178 for upgrade', diff saved to https://phabricator.wikimedia.org/P17506 and previous config saved to /var/cache/conftool/dbconfig/20211019-060123-marostegui.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17505 and previous config saved to /var/cache/conftool/dbconfig/20211019-060001-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1184 for upgrade', diff saved to https://phabricator.wikimedia.org/P17504 and previous config saved to /var/cache/conftool/dbconfig/20211019-055429-marostegui.json
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 05:46 marostegui: Reimage db2112 (s1 codfw master) [[phab:T290865|T290865]]
* 04:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 03:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:21 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:18 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:38 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer


== 2020-03-25 ==
== 2021-10-18 ==
* 23:49 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: Add investigate to $wgAvailableRights ([[phab:T247645|T247645]]) (duration: 03m 16s)
* 23:40 hoo: Updated the Wikidata property suggester with data from the 2021-10-04 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 23:42 catrope@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/CheckUser/: Retry because mw1251 timed out, and it is a proxy (duration: 03m 15s)
* 23:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b654980240d51fff3c6e9c48f7076d4609c2560f}}: Create an alias for the Draft namespace on hrwiki ([[phab:T291755|T291755]]) (duration: 00m 56s)
* 23:38 catrope@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/CheckUser/: Add new investigate right ([[phab:T247645|T247645]]) (duration: 03m 17s)
* 23:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:21 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:21 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 23:12 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=thwiktionary --fix # [[phab:T291761|T291761]]
* 22:16 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 23:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|abe777d28594da852e49ccb1c1597b2598f3e483}}: Create Rhymes namespace for thwiktionary ([[phab:T291761|T291761]]) (duration: 00m 57s)
* 22:16 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 23:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:10 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 23:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:10 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 22:56 legoktm@deploy1002: Synchronized php-1.38.0-wmf.4/includes/http/MWHttpRequest.php: Allow using a reverse proxy for local HTTP requests ([[phab:T288848|T288848]]) (duration: 00m 56s)
* 22:05 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 22:06 maryum: deployed security patch for [[phab:T293589|T293589]]
* 22:05 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 21:23 maryum: deployed security patch for [[phab:T293556|T293556]]
* 22:05 rlazarus: updating eventgate-logging-external to envoy 1.13.1 [[phab:T246868|T246868]]
* 21:05 mutante: mwmaint1002 - sudo -u www-data /usr/local/bin/mw-cli-wrapper /usr/local/bin/mwscript extensions/TranslationNotifications/scripts/DigestEmailer.php --wiki mediawikiwiki {{!}} Fatal error: Uncaught Error: Class 'MediaWiki\MediaWikiServices' not found
* 22:00 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 20:58 mutante: mwmaint1002 - attempt to start mediawiki_job_translationnotifications-mediawikiwiki which was alerting as failed
* 22:00 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' .
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:59 ppchelko@deploy1001: Finished deploy [restbase/deploy@a1c3be4] (dev-cluster): Remove experimental PCS endpoints (duration: 02m 57s)
* 20:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:56 ppchelko@deploy1001: Started deploy [restbase/deploy@a1c3be4] (dev-cluster): Remove experimental PCS endpoints
* 19:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:54 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 19:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 21:54 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' .
* 19:29 mutante: LDAP: removed non-existent user gerrit2 from group labsadminbots ([[phab:T160122|T160122]])
* 21:46 urandom: dropping unused Cassandra keyspaces -- [[phab:T248018|T248018]]
* 19:29 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/resources/store/state.js: {{Gerrit|ac7b4fc2ccc69589e00a42f49d18a8f6d71777f2}}: Revert 727328 ([[phab:T293554|T293554]]) (duration: 00m 56s)
* 21:45 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:44 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' .
* 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:44 rlazarus: updating eventgate-analytics-external to envoy 1.13.1 [[phab:T246868|T246868]]
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:39 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' .
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:39 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' .
* 18:45 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Convert $wgEventStreams to be an associative array - [[phab:T277193|T277193]] (duration: 00m 57s)
* 21:27 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' .
* 18:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:27 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' .
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:16 rlazarus: holding off on updating eventgate-analytics until EU time, to check on unexpected helmfile diffs [[phab:T246868|T246868]]
* 18:07 mutante: gerrit - removed tonina from wmde-mediawiki gerrit group ([[phab:T293621|T293621]])
* 21:11 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' .
* 17:51 mutante: puppet run on all bastion hosts via cumin
* 21:11 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' .
* 15:32 mvernon@cumin2002: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 21:10 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' .
* 15:32 mvernon@cumin2002: START - Cookbook sre.discovery.service-route
* 21:10 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' .
* 15:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 21:07 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' .
* 15:23 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 21:07 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' .
* 15:16 herron: reprepro copied anycast-healthchecker, python3-json-logger and python3-anycast-healthchecker from buster-wikimedia to bullseye-wikimedia [[phab:T292196|T292196]]
* 21:07 rlazarus: updating eventgate-analytics to envoy 1.13.1 [[phab:T246868|T246868]]
* 15:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 20:36 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 15:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 20:32 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 14:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 20:22 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 14:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 20:22 rlazarus: updating cxserver to envoy 1.13.1 [[phab:T246868|T246868]]
* 14:54 herron: rebuilt and uploaded kafkatee for bullseye [[phab:T292196|T292196]]
* 20:19 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'citoid' for release 'staging' .
* 14:50 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:19 rlazarus: updating citoid to envoy 1.13.1 [[phab:T246868|T246868]]
* 14:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 20:16 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:36 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731346{{!}}[beta] Rename $wgIPInfoGeoIP2Path to $wgIPInfoGeoIP2Prefix (T289361)]] (duration: 00m 56s)
* 20:16 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:01 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:01 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:36 hasharDinner: Jenkins restarted on all machines
* 14:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:30 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 13:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:30 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:29 rlazarus: updating eventstreams to envoy 1.13.1 [[phab:T246868|T246868]]
* 13:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (2/2) (duration: 00m 56s)
* 19:28 twentyafterfour: group1 looks good after deploying wmf.25 refs [[phab:T233873|T233873]]
* 13:47 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (1/2) (duration: 00m 56s)
* 19:27 hashar: upgrading Jenkins # [[phab:T248122|T248122]]
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:26 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.25 refs [[phab:T233873|T233873]]
* 13:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:731014{{!}}Unconditionally enable Wikibase dispatching via jobs (T291828)]] (duration: 00m 56s)
* 19:26 twentyafterfour: scap sync-proxies failed on mw1251
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:53 ppchelko@deploy1001: Finished deploy [restbase/deploy@a1c3be4]: Add restbase202[123] [[phab:T244178|T244178]] (duration: 14m 00s)
* 12:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2079.codfw.wmnet with OS buster
* 18:39 ppchelko@deploy1001: Started deploy [restbase/deploy@a1c3be4]: Add restbase202[123] [[phab:T244178|T244178]]
* 12:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:39 ppchelko@deploy1001: Finished deploy [restbase/deploy@777b881]: Remove experimental PCS endpoints (duration: 14m 28s)
* 12:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:24 ppchelko@deploy1001: Started deploy [restbase/deploy@777b881]: Remove experimental PCS endpoints
* 11:55 Lucas_WMDE: UTC morning backport window done
* 18:21 tgr@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/GrowthExperiments/modules/homepage/: re-sync, mw1251 failed (duration: 03m 18s)
* 11:55 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (2/2) (duration: 00m 56s)
* 18:13 tgr@deploy1001: Synchronized php-1.35.0-wmf.25/extensions/GrowthExperiments/modules/homepage/: SWAT: [[gerrit:583393{{!}}Mentorship module: Update for root screen refactor (T248422)]] (duration: 03m 23s)
* 11:54 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (1/2) (duration: 00m 56s)
* 18:06 ppchelko@deploy1001: Finished deploy [changeprop/deploy@4bdf55b]: Stop rerendering experimental PCS endpoints (duration: 01m 40s)
* 11:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:05 ppchelko@deploy1001: Started deploy [changeprop/deploy@4bdf55b]: Stop rerendering experimental PCS endpoints
* 11:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2079.codfw.wmnet with OS buster
* 17:43 mvolz@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'citoid' for release 'production' .
* 11:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:38 mvolz@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'citoid' for release 'production' .
* 11:49 marostegui: Reimage db2079 (codfw s8 master) [[phab:T290868|T290868]]
* 17:33 mvolz@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'citoid' for release 'staging' .
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730747{{!}}Set dispatchViaJobsAllowedClients to null everywhere (T291828)]] (duration: 00m 56s)
* 16:50 moritzm: installing python-bleach security updates
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:47 moritzm: updated jenkins packages on apt.wikimedia.org to 2.222.1
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:33 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 11:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731239{{!}}Make deduplication actually work for DispatchChangesJob (T291118)]] (duration: 00m 55s)
* 16:32 sukhe: upload cescout 0.1.0-1 to apt.wm.o (buster) - [[phab:T247273|T247273]]
* 11:10 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/Hooks/RecentChangeSaveHookHandler.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (2/2) (duration: 00m 56s)
* 16:17 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 11:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:15 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (duration: 00m 56s)
* 16:07 rlazarus: updating blubberoid to envoy 1.13.1 [[phab:T246868|T246868]]
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2115 after reimage to Buster', diff saved to https://phabricator.wikimedia.org/P10767 and previous config saved to /var/cache/conftool/dbconfig/20200325-152148-marostegui.json
* 10:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:14 moritzm: installing deneb.codfw.wmnet [[phab:T248165|T248165]]
* 10:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:51 cdanis: repool codfw [[phab:T248394|T248394]]
* 10:47 moritzm: copied wmf-certificates from buster-wikimedia to stretch-wikimedia in reprepro
* 14:46 mutante: closed port 80 for caching servers on misc backends https://gerrit.wikimedia.org/r/q/topic:%22applayer-tls%22+(status:open%20OR%20status:merged) as final step per service on [[phab:T210411|T210411]]
* 10:38 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:731237{{!}}Don't filter by change Id when dispatching to client wikis ()]] (duration: 00m 59s)
* 14:39 mutante: static microsites (annual.wikimedia.org, research.wikimedia.org, static-bugzilla etc). closed port 80 for caching servers, finalizing switch to https behind caching servers
* 09:48 moritzm: installing node-tar security updates on buster
* 14:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:39 vgutierrez: updating acme-chief to version 0.34 on acmechief instances - [[phab:T292619|T292619]]
* 14:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 09:38 godog: sync metrics from graphite1004 to graphite2003 - [[phab:T247963|T247963]]
* 13:53 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 09:13 moritzm: installing apr security updates on bullseye
* 13:48 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 08:57 godog: cleanup graphite metrics not modified for >= ~3yr (1024 days)
* 13:26 _joe_: cumin A:puppetmaster 'apt-get -y install puppet-common'
* 07:34 ema: cp3060 (text), cp3061 (upload): upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 13:03 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 07:34 elukey: depool + restart blazegraph on wdqs1013
* 13:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:56 marostegui: Deploy schema change on db1139:3316
* 06:31 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:45 marostegui: Stop MySQL on db2115 for reimage to buster
* 06:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:50 cdanis: cr1-codfw: `set chassis fpc 5 inline-services flex-flow-sizing` and `request chassis fpc restart slot 5` [[phab:T248394|T248394]]
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2115 for upgrade', diff saved to https://phabricator.wikimedia.org/P10763 and previous config saved to /var/cache/conftool/dbconfig/20200325-114655-marostegui.json
* 11:39 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 11:37 mutante: decom mw1250 - mw1253
* 11:37 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 11:35 cdanis: depool codfw for router maintenance [[phab:T248394|T248394]]
* 11:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 11:32 mutante: decom mw1232 - mw1235
* 11:31 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 11:27 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw125[0-3].eqiad.wmnet
* 11:26 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw123[2-5].eqiad.wmnet
* 11:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:22 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 11:21 Urbanecm: EU SWAT done
* 11:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:21 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 11:20 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw123[2-5].eqiad.wmnet
* 11:20 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw125[0-3].eqiad.wmnet
* 11:19 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|59412db}}: Add gwtoolset to available rights to allow granting to global groups (duration: 01m 07s)
* 11:12 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|7b8d7c5}}: TwoColConflict: Limited default deployment CommonSettings.php ([[phab:T244863|T244863]]) (duration: 01m 06s)
* 11:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|81cda0f}}: TwoColConflict: Limited default deployment InitialiseSettings.php ([[phab:T244863|T244863]]; take II) (duration: 01m 06s)
* 11:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|81cda0f}}: TwoColConflict: Limited default deployment InitialiseSettings.php ([[phab:T244863|T244863]]) (duration: 01m 17s)
* 11:08 jynus@cumin1001: dbctl commit (dc=all): 'Reduce db1091 load, increase main traffic on all other s4 instances', diff saved to https://phabricator.wikimedia.org/P10762 and previous config saved to /var/cache/conftool/dbconfig/20200325-110821-jynus.json
* 10:55 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1137', diff saved to https://phabricator.wikimedia.org/P10761 and previous config saved to /var/cache/conftool/dbconfig/20200325-105503-marostegui.json
* 10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10760 and previous config saved to /var/cache/conftool/dbconfig/20200325-103938-marostegui.json
* 10:37 XioNoX: change aggregate policy for 2620:0:862::/48 on cr3-knams - [[phab:T236785|T236785]]
* 10:19 XioNoX: change aggregate policy for v4 prefixes on cr2-eqdfw - [[phab:T236785|T236785]]
* 10:04 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' .
* 10:04 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'production' .
* 09:56 XioNoX: change aggregate policy for 2620:0:860::/46 on cr2-eqdfw - [[phab:T236785|T236785]]
* 09:54 vgutierrez: Enable inbound TLSv1.3 on upload@eqsin - [[phab:T170567|T170567]]
* 09:27 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 09:23 vgutierrez: upgrade ATS to 8.0.6-1wm3 on upload@eqsin - [[phab:T170567|T170567]]
* 09:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10759 and previous config saved to /var/cache/conftool/dbconfig/20200325-091421-marostegui.json
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10758 and previous config saved to /var/cache/conftool/dbconfig/20200325-090227-marostegui.json
* 08:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:38 marostegui: Reimage db1137
* 08:18 marostegui: Reboot db1117 for full-upgrade
* 08:15 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' .
* 08:15 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'production' .
* 08:14 _joe_: upgrading all eventgate-main to envoy 1.13.1 [[phab:T246868|T246868]]
* 08:12 marostegui: Stop all mysql daemons on db1117
* 07:50 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' .
* 07:50 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'production' .
* 07:42 XioNoX: reboot scs-eqsin for CPU usage
* 07:20 jmm@cumin2001: START - Cookbook sre.ganeti.makevm
* 07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1137 for upgrade', diff saved to https://phabricator.wikimedia.org/P10757 and previous config saved to /var/cache/conftool/dbconfig/20200325-070946-marostegui.json
* 06:57 marostegui: Deploy schema change on db2129 (s6 codfw master)
* 06:15 marostegui: Rename tables on db1133 (m5 master) nova_api database - [[phab:T248313|T248313]]
* 06:13 marostegui: Remove grants 'nova'@'208.80.154.23' on nova.* - [[phab:T248313|T248313]]


== 2020-03-24 ==
== 2021-10-16 ==
* 20:53 cdanis: repool eqsin
* 03:56 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:52 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Don't hard-set wgTmhUseBetaFeatures to true, let it vary by wiki (duration: 01m 07s)
* 02:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 20:50 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 01m 07s)
* 01:30 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:49 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wgTmhUseBetaFeatures to vary by wiki (duration: 01m 06s)
* 20:35 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: Attempt #2: group0 wikis to 1.35.0-wmf.25 refs [[phab:T233873|T233873]]
* 20:32 twentyafterfour@deploy1001: Synchronized wmf-config: Now touch and sync again because of settings cache race condition. refs [[phab:T248409|T248409]] (duration: 00m 59s)
* 20:31 cdanis: rebooting cr2-eqsin [[phab:T248394|T248394]]
* 20:30 twentyafterfour@deploy1001: Synchronized wmf-config: Now sync InitializeSettings* refs [[phab:T248409|T248409]] (duration: 00m 59s)
* 20:28 twentyafterfour@deploy1001: Synchronized wmf-config/CommonSettings.php: sync CommonSettings before InitialiseSettings refs [[phab:T248409|T248409]] (duration: 00m 58s)
* 20:27 volans: force rebooting analytics1044 from console, host down and unreachable (ping, ssh, console)
* 20:26 cdanis: commit flow-table-size on cr2-eqsin [[phab:T248394|T248394]]
* 20:19 cdanis: eqsin depooled for router maintenance at 16:15
* 19:29 twentyafterfour@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 19:29 twentyafterfour: rolling back to wmf.24 due to high error rate refs [[phab:T233873|T233873]]
* 19:28 twentyafterfour@deploy1001: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 18:49 gehel: repooling wdqs1006, caught up on lag
* 17:12 hashar@deploy1001: Finished scap: testwiki to 1.35.0-wmf.25 and rebuild l10n cache # [[phab:T233873|T233873]] (duration: 77m 52s)
* 17:10 ebernhardson: update cloudelastic-chi replica counts from 2 to 1 [[phab:T231517|T231517]]
* 16:41 moritzm: installing linux-perf updates on stretch
* 16:31 moritzm: installing linux-perf-4.19 updates on buster
* 15:58 mutante: installing OS on otrs1001.eqiad.wmnet ([[phab:T248028|T248028]])
* 15:55 hashar@deploy1001: Started scap: testwiki to 1.35.0-wmf.25 and rebuild l10n cache # [[phab:T233873|T233873]]
* 15:35 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 15:31 hashar@deploy1001: Pruned MediaWiki: 1.35.0-wmf.22 (duration: 02m 02s)
* 15:29 hashar@deploy1001: Pruned MediaWiki: 1.35.0-wmf.21 (duration: 24m 00s)
* 15:17 hashar: Cleaning old MediaWiki deployments # [[phab:T233873|T233873]]
* 15:03 hashar: Applied patches to 1.35.0-wmf.25 # [[phab:T233873|T233873]]
* 14:59 hashar: scap prep 1.35.0-wmf.25 # [[phab:T233873|T233873]]
* 14:55 gehel: depooling wdqs1006 to catch up on lag
* 14:28 marostegui: Deploy schema change on db2117 (s6)
* 14:26 hashar: Branching wmf/1.35.0-wmf.25 # [[phab:T233873|T233873]]
* 13:22 moritzm: installing glib2.0 updates from Stretch point release
* 13:04 moritzm: installing mariadb-10.1 updates from Stretch point release (client/tools/libraries as packaged by Debian, different from wmf-mariadb)
* 12:16 Urbanecm: mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'Toroid~huwiki' 'Toroidt' ([[phab:T248371|T248371]])
* 12:10 Urbanecm: mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwiki --logwiki=metawiki 'Erika Greenberg' 'Copperqueen' ([[phab:T248371|T248371]])
* 11:57 Urbanecm: mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'Romy merdeka' 'Romy_Dwi_Laksono' ([[phab:T248371|T248371]])
* 11:55 marostegui: Deploy schema change on db2087 db2089 db2097
* 11:34 Urbanecm: EU SWAT done
* 11:29 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|e28c819}}: Enable visualeditor on hewiktionary by default ([[phab:T248311|T248311]]; take II) (duration: 00m 59s)
* 11:28 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|e28c819}}: Enable visualeditor on hewiktionary by default ([[phab:T248311|T248311]]) (duration: 00m 59s)
* 11:25 urbanecm@deploy1001: Synchronized dblists/visualeditor-nondefault.dblist: SWAT: {{Gerrit|e28c819}}: Enable visualeditor on hewiktionary by default ([[phab:T248311|T248311]]) (duration: 01m 03s)
* 10:08 gehel: restart blazegraph and updater on wdqs1004
* 09:41 marostegui: Deploy schema change on db2076 (s6)
* 08:39 marostegui: Rename nova database tables on db1133 (m5 master) - [[phab:T248313|T248313]]
* 08:25 marostegui: Rename wikidatawiki.wb_terms on db1104 - [[phab:T248086|T248086]]
* 07:33 elukey: restart update-openstack-mirror.service on sodium
* 06:55 marostegui: Reboot dbproxy1018
* 06:42 marostegui: Reboot dbproxy1019
* 06:16 marostegui: Create empty database testreduce on m5 master [[phab:T245408|T245408]]
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1087, vslow s8, with weight 1 as it originally had', diff saved to https://phabricator.wikimedia.org/P10753 and previous config saved to /var/cache/conftool/dbconfig/20200324-060133-marostegui.json


== 2020-03-23 ==
== 2021-10-15 ==
* 21:50 krinkle@deploy1001: Synchronized docroot/noc/css/vector.css: {{Gerrit|I627a0ddba5}} (duration: 01m 02s)
* 23:48 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:39 mholloway-shell@deploy1001: Finished deploy [recommendation-api/deploy@26aa5c3]: Update recommendation-api to {{Gerrit|3141cb6}} (duration: 03m 21s)
* 23:27 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:45 Urbanecm: Morning SWAT done
* 23:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:41 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|0e535b1}}: InitialiseSettings - clean up groupOverrides layout / spacing ([[phab:T231178|T231178]]; take II) (duration: 00m 59s)
* 22:38 mutante: apt1001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 18:39 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|0e535b1}}: InitialiseSettings - clean up groupOverrides layout / spacing ([[phab:T231178|T231178]]) (duration: 01m 00s)
* 22:36 mutante: apt2001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 18:35 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|6ca1593}}: wgCopyUploadsDomains: Fix supremecourt.gov ([[phab:T248146|T248146]]; take II) (duration: 00m 59s)
* 22:34 mutante: apt2001 - upgraded nginx
* 18:33 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|6ca1593}}: wgCopyUploadsDomains: Fix supremecourt.gov ([[phab:T248146|T248146]]) (duration: 01m 00s)
* 22:18 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:32 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: SWAT: {{Gerrit|cbda0e5}}: ApiVisualEditorEdit: Fix handling of minor parameter ([[phab:T248257|T248257]]) (duration: 01m 00s)
* 22:14 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|212114e}}: Don't try to grant `oathauth-enable` to `*` ([[phab:T248282|T248282]]) (duration: 00m 59s)
* 22:05 dpifke@deploy1002: Finished deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes (duration: 00m 05s)
* 18:19 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|0c12fc2}}: wgCopyUploadsDomains: Add supremecourt.gov ([[phab:T248146|T248146]], take II) (duration: 00m 59s)
* 22:05 dpifke@deploy1002: Started deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes
* 18:18 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|0c12fc2}}: wgCopyUploadsDomains: Add supremecourt.gov ([[phab:T248146|T248146]]) (duration: 01m 00s)
* 21:51 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:18 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 21:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:18 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 21:44 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:15 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|5eb70ac}}: Add configuration variable $wgRestAPIAdditionalRouteFiles ([[phab:T247997|T247997]]; take II) (duration: 00m 59s)
* 21:36 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|5eb70ac}}: Add configuration variable $wgRestAPIAdditionalRouteFiles ([[phab:T247997|T247997]]) (duration: 01m 00s)
* 20:09 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:09 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 18:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:09 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 17:20 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:08 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 17:17 mutante: gitlab1001 - disabling puppet for debugging
* 18:05 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 17:05 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold - [[phab:T283076|T283076]]
* 18:05 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 17:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 17:57 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 16:50 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold
* 17:57 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 16:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:31 ema: upload atskafka 0.5 to buster-wikimedia [[phab:T237993|T237993]]
* 16:44 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 15:59 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-enable client side error logging for group0 and hawwike - [[phab:T226986|T226986]] (take 2) (duration: 00m 59s)
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:56 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-enable client side error logging for group0 and hawwike - [[phab:T226986|T226986]] (duration: 01m 00s)
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:32 moritzm: installing mariadb-10.1 updates from Stretch point release (client/tools/libraries as packaged by Debian, different from wmf-mariadb)
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:24 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:13 moritzm: installing freetype updates from Stretch point release
* 14:48 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:04 otto@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/WikimediaEvents/modules/ext.wikimediaEvents/clientError.js: [[gerrit:578951{{!}}clientError: Changes event fields (T226986)]] (take 2) (duration: 00m 59s)
* 14:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:00 jynus@cumin1001: dbctl commit (dc=all): 'Remove db1089 for special groups (rc)', diff saved to https://phabricator.wikimedia.org/P10749 and previous config saved to /var/cache/conftool/dbconfig/20200323-150046-jynus.json
* 14:15 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:00 otto@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/WikimediaEvents/modules/ext.wikimediaEvents/clientError.js: [[gerrit:578951{{!}}clientError: Changes event fields (T226986)]] (duration: 01m 01s)
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 14:46 jynus@cumin1001: dbctl commit (dc=all): 'Finish doubling db1107 main s1 traffic', diff saved to https://phabricator.wikimedia.org/P10748 and previous config saved to /var/cache/conftool/dbconfig/20200323-144612-jynus.json
* 13:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 14:40 jynus@cumin1001: dbctl commit (dc=all): 'Increase db1107 main s1 traffic a 50%', diff saved to https://phabricator.wikimedia.org/P10747 and previous config saved to /var/cache/conftool/dbconfig/20200323-144005-jynus.json
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 14:35 jynus@cumin1001: dbctl commit (dc=all): 'remove db1107 from special groups', diff saved to https://phabricator.wikimedia.org/P10746 and previous config saved to /var/cache/conftool/dbconfig/20200323-143536-jynus.json
* 13:30 elukey: start topic rebalancing for kafka main-eqiad (long maintenance, it will last a couple of days)
* 14:28 elukey@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 13:24 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 14:28 elukey@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:21 vgutierrez: updating acme-chief to version 0.34 on acmechief-test instances - [[phab:T292619|T292619]]
* 14:25 elukey@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 13:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:25 elukey@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:14 vgutierrez: upload acme-chief 0.34 to apt.wikimedia.org (buster) - [[phab:T292619|T292619]]
* 14:13 elukey@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'canary' .
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:13 elukey@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'production' .
* 11:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:54 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2007.codfw.wmnet
* 13:40 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Temporarily disable client side error logging for a deploy - [[phab:T226986|T226986]] (duration: 01m 01s)
* 11:45 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:33 moritzm: installing python-cryptography updates from Stretch point release
* 11:33 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2007.codfw.wmnet
* 12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 11:14 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:41 tgr@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/OAuth/includes/frontend/specialpages/SpecialMWOAuthManageMyGrants.php: SWAT: [[gerrit:582768{{!}}Get consumerKey from consumerId not from acceptanceId (T247531)]] (duration: 01m 01s)
* 10:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 11:32 ema: cp1081: restart prometheus-trafficserver-tls-exporter.service
* 09:15 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:27 elukey: upload oozie 4.3.0-3 to thirparty/bigtop14 on wikimedia-stretch - [[phab:T244499|T244499]]
* 09:06 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 10:37 jbond42: switch idp1001 to tlsproxy::envoy profile
* 08:58 jelto: jelto@gitlab1001:~$ sudo disable-puppet "disable puppet on gitlab1001 to test 728380 on GitLab replica - [[phab:T283076|T283076]]"
* 08:07 marostegui: Start m1 and m2 on db1117
* 07:41 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 08:04 marostegui: Stop m1 and m2 on db1117 to transfer them to db1077 - this will trigger dbproxies IRC alert
* 06:20 urbanecm: Start server-side upload for 1 video file
* 08:03 moritzm: installing python-cryptography bug fix updates from Stretch point release
* 02:14 ryankemper: [[phab:T288231|T288231]] `wdqs2006` data transfer complete and all tests passing on the host. All of `codfw wdqs-internal` is on the new streaming updater
* 07:46 marostegui: Stop MySQL on db1077 (non used) for 10.4 upgrade and gtid_domain_id on multisource [[phab:T149418|T149418]]
* 00:09 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:07 brennen: end of UTC late backport & config training window


== 2020-03-22 ==
== 2021-10-14 ==
* 23:19 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: [[phab:T248274|T248274]] (duration: 01m 19s)
* 23:59 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 04:37 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 23:58 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 23:56 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 56s)
* 23:49 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 23:48 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 23:46 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 56s)
* 23:43 ejegg: updated payments-wiki from {{Gerrit|19d18c1852}} to {{Gerrit|0f48acea49}}
* 23:34 cjming@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikimediaEvents/includes/VectorPrefDiffInstrumentation.php: Backport: [[gerrit:730733{{!}}Change VectorPrefDiffInstrumentation stream name to `mediawiki.skin_diff` (T289622)]] (duration: 00m 56s)
* 23:24 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730936{{!}}allow sysops to add and remove users to other groups on ptwikivoyage (T292806)]] (duration: 00m 56s)
* 23:21 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 23:11 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730933{{!}}Add americanantiquarian.org to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T292918)]] (duration: 00m 57s)
* 23:11 mutante: mw1452 - re-pooled, scap pull
* 23:09 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:35 ryankemper: [[phab:T288231|T288231]] Ran puppet on `wdqs2006`, now back to the cookbook run
* 22:33 ryankemper: [[phab:T288231|T288231]] Forgot about running puppet-agent on `wdqs2006`; aborted cookbook run
* 22:33 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 22:33 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:32 ryankemper: [[phab:T288231|T288231]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/730795; proceeding to data-transfer on `wdqs2006`: `sudo rm -fv /srv/wdqs/data_loaded` on `wdqs2006` followed by `ryankemper@cumin1001:~$ sudo cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "streaming updater cutover for wdqs2005" --blazegraph_instance blazegraph --task-id [[phab:T288231|T288231]]`
* 22:31 mutante: depooling mw1452 for testing
* 22:28 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo pool`: transfer completed successfully; tests passing on host (used `ssh -L 9999:localhost:80 wdqs2005.codfw.wmnet` to establish tunnel)
* 22:23 dpifke@deploy1002: Finished deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]] (duration: 00m 05s)
* 22:23 dpifke@deploy1002: Started deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]]
* 22:17 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 22:07 eileen: civicrm revision changed from {{Gerrit|018d3b19fe}} to {{Gerrit|9b5e0d015b}}, config revision is {{Gerrit|781d6a1b1f}}
* 21:34 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:25 robh@cumin1001: START - Cookbook sre.dns.netbox
* 21:10 robh@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 21:06 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 19:23 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 19:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:53 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:53 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=dagwiki --fix
* 18:47 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=frwiktionary --logwiki=metawiki 'TURK FASTER' 'ARTHUR MORGAN'
* 18:42 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'George Dum Fulton' 'George Fulton' # [[phab:T293403|T293403]]
* 18:41 urbanecm: UTC evening B&C done
* 18:40 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/extension.json: {{Gerrit|6da3523daaba85a4199721980c0a9c96b20697e7}}: Fix assessment quickview labels ([[phab:T292596|T292596]]) (duration: 01m 03s)
* 18:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c8dffefd0d095abe3709dcc962d5d24f27b55869}}: Create Salima namespace for dagwiki ([[phab:T289911|T289911]]) (duration: 01m 04s)
* 18:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bccd4bc45498db8628567574d0bb3a23f8fb378}}: Add $wgSitename and $wgMetaNamespace for kswiki and kswiktionary ([[phab:T289752|T289752]], [[phab:T289767|T289767]]) (duration: 01m 04s)
* 18:17 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|262e588b44f126fb9e1aa933a3ca59b191b42bd7}}: Enable Growth mentor dashboard backend on all wikis ([[phab:T278920|T278920]]) (duration: 01m 05s)
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|41baa8c41d64510986f009b9be2d70dad0915f8c}}: Add new mediawiki.skin_diff event logging stream ([[phab:T289622|T289622]]) (duration: 01m 05s)
* 18:03 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:02 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:01 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 17:54 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:52 rzl: repooled mw1452 (with `sudo pool` so no auto log from conftool)
* 17:47 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:45 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw1452.eqiad.wmnet
* 17:42 rzl: depool mw1452 for training
* 17:32 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:31 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:29 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 16:44 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:44 ryankemper: [[phab:T288231|T288231]] Manually killed dangling `pigz` / `nc` processes on `wdqs2008` (and `wdqs2005` implicitly). Should be in the right state to re-start the `data-transfer` cookbook again
* 16:41 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:37 elukey: drop kubeflow-kfserving* docker images from deneb
* 16:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:34 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 16:33 moritzm: installing node-ansi-regex security updates
* 16:28 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere (duration: 02m 24s)
* 16:25 mbsantos@deploy1002: Started deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere
* 16:24 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 04s)
* 16:16 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad (duration: 02m 41s)
* 16:14 mbsantos@deploy1002: Started deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad
* 16:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:07 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:07 ryankemper: [[phab:T288231|T288231]] About to ctrl+c out of ongoing data transfer because puppet run following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/730794 restarted blazegraph; we'll manually disable updater and kick off the transfer again
* 16:04 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo run-puppet-agent --force`
* 15:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:54 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2008:~$ sudo depool`
* 15:52 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo depool`
* 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:13 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/VisualEditor/includes/VisualEditorHooks.php: Backport: [[gerrit:730729{{!}}Fix value of 'namespacesWithSubpages' in wgVisualEditorConfig (T293310)]] (duration: 01m 04s)
* 15:02 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 03s)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:59 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:53 kormat: upgrading orchestrator.wm.o to 3.2.6-1 [[phab:T275784|T275784]]
* 14:49 jbond@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=apt
* 14:43 jbond: migrate apt.w.o to a dns active/passive discovery address (cc moritzm)
* 14:23 moritzm: installing krb5 security updates on KDCs
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:10 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 2/2) (duration: 01m 03s)
* 14:07 urbanecm: Run extensions/GrowthExperiments/initWikiConfig.php for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:07 urbanecm: Create growthexperiments DB tables for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 1/2) (duration: 01m 04s)
* 14:03 urbanecm@deploy1002: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|82d0a4bf45126ecba2cfcd1a0c2081a00f58dca3}}: Enable VE by default on 4 more wikis ([[phab:T290614|T290614]]) (duration: 01m 05s)
* 13:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 kormat: uploaded orchestrator 3.2.6-1 packages to apt.wm.o (buster) [[phab:T275784|T275784]]
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 12:44 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 12:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backups
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backups
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730746{{!}}Untangle “dispatch via jobs” settings in Wikibase.php (T291828)]] (no-op) (duration: 01m 04s)
* 12:12 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730725{{!}}Set wmgWikibaseDispatchViaJobsPruneChangesTableInJobEnabled for wikidatawiki (T291828)]] (no-op) (duration: 01m 05s)
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 10:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/: {{Gerrit|1f33fc3}}, {{Gerrit|e0ea1b8}}, {{Gerrit|cba2ac9}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 05s)
* 10:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|465b564}}, {{Gerrit|a8cc98b}}, {{Gerrit|6e95c48}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 06s)
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 09:20 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:20 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 volans@deploy1002: Finished deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1 (duration: 00m 50s)
* 09:17 volans@deploy1002: Started deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1
* 09:04 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 45s)
* 09:03 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 09:02 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 23s)
* 09:02 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 08:52 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:22 volans: rolling out debmonitor-client upgrade to 0.3.1 across the fleet
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:24 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:24 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift-ro,name=eqiad
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad
* 07:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:37 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:52 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:50 foks: changing user email for "Region of Peel Archives"
* 01:41 ejegg: updated payments-wiki from {{Gerrit|b329d2dea2}} to {{Gerrit|19d18c1852}}
* 01:35 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:31 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2020-03-20 ==
== 2021-10-13 ==
* 23:16 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 23:37 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:04 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 23:36 eileen: civicrm revision changed from {{Gerrit|946dfb6c5a}} to {{Gerrit|018d3b19fe}}, config revision is {{Gerrit|85277466ed}}
* 21:06 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730575{{!}}Create an alias for the project namespace on kswiki (T291740)]] (duration: 01m 05s)
* 21:04 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 22:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:01 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 22:01 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Collection/includes/Specials/SpecialCollection.php: Backport: [[gerrit:730578{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 20:59 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 21:50 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection: Backport: [[gerrit:730577{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 20:59 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 21:47 foks: removing 8 files for legal compliance
* 20:59 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 21:03 foks: removing 2 files for legal compliance
* 20:57 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 21:00 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 20:56 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:50 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:55 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 20:49 brennen@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/Api/ApiGetBookCreatorBoxContent.php: Backport: [[gerrit:730574{{!}}Fall back to main page if given title is invalid (T293299)]] (duration: 01m 04s)
* 20:53 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:41 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw124[4-9].eqiad.wmnet
* 20:40 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:40 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:31 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 20:40 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:27 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1021.eqiad.wmnet with OS stretch
* 20:40 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:04 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 20:40 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:03 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubernetes1021.eqiad.wmnet with OS stretch
* 20:40 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 20:39 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:18 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:37 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw123[0-1].eqiad.wmnet
* 19:16 mutante: gitlab2001 - status before was that "gitlab-ctl status" showed components "gitlab-workhorse" and "postgres-exporter" as "down". this was either pre-broken or caused by the restore process. after manually 'gitlab-ctl start gitlab-workhorse' all of the components are in "run" and https://gitlab-replica.wikimedia.org is up ([[phab:T285867|T285867]])
* 20:32 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw122[7-9].eqiad.wmnet
* 19:08 mutante: gitlab2001 - started workhorse which was for some reason marked as down after restore command ran
* 20:18 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw124[4-9].eqiad.wmnet
* 19:08 mutante: [gitlab2001:~] $ sudo /usr/bin/gitlab-ctl start gitlab-workhorse
* 20:18 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw123[0-1].eqiad.wmnet
* 19:06 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 01m 03s)
* 20:18 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw122[7-9].eqiad.wmnet
* 19:05 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 15:44 hashar@deploy1001: Synchronized php-1.35.0-wmf.24/includes/ActorMigration.php: Avoid upsert() log warning spam in ActorMigration due to unique key array format - [[phab:T248147|T248147]] (duration: 01m 01s)
* 19:02 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87879865c35edab3ead523027681146e00d6fc02}}: Create Translation namespace for viwikisource ([[phab:T290691|T290691]]) (duration: 01m 04s)
* 13:34 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 18:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|06fd0f225575448771cdba0d4e6bf36bb6715bc1}}: add extendedconfimed for autoreview group on ptwiki ([[phab:T292912|T292912]]) (duration: 01m 04s)
* 13:33 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 18:37 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript initSiteStats.php --wiki=ptwiki --update
* 13:33 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 18:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=ptwiki extendedconfirmed
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Decrease db1087, vslow host weight in main, given that the CPU across s8 is now doing a lot better', diff saved to https://phabricator.wikimedia.org/P10741 and previous config saved to /var/cache/conftool/dbconfig/20200320-121628-marostegui.json
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bb2b388217aa91a39ed3684f87fdf7edb06fd81}}: Set autoconfirmedextended and confirmedextended for ptwiki ([[phab:T292915|T292915]]) (duration: 01m 04s)
* 11:52 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 18:16 urbanecm@deploy1002: Synchronized static/images/project-logos: {{Gerrit|694bc234ab5dbb9a2387a6129998d45a53ac0ab3}}: Remove an old dawiki temporary logo (duration: 01m 04s)
* 11:10 elukey: upload oozie 4.3.0-2 packages to thirdparty/bigtop14 on wikimedia-stretch
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|224e2a374b1cc6327e9d8c2bca576091ce4efc74}}: Add NS_MAIN back to wgExtraSignatureNamespaces for mediawikiwiki ([[phab:T291630|T291630]]) (duration: 01m 05s)
* 10:56 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:56 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 18:12 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 10:34 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 18:11 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|1b96f54a518620b0dc6a0ab63b402d0ea2c6bf70}}: Update logo for liwiktionary ([[phab:T291479|T291479]]) (duration: 01m 14s)
* 10:29 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 18:10 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 10:13 dcausse: repooling wdqs1006
* 18:10 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:28 moritzm: rolling restart of FPM on mw1261-mw1265 for freetype update
* 18:09 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:59 moritzm: installing freetype bugfix updates from stretch point release
* 18:09 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1017', diff saved to https://phabricator.wikimedia.org/P10739 and previous config saved to /var/cache/conftool/dbconfig/20200320-084730-marostegui.json
* 18:08 volans: uploaded debmonitor-client_0.3.1 to apt.wikimedia.org stretch-wikimedia,buster-wikimedia,bullseye-wikimedia
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1017', diff saved to https://phabricator.wikimedia.org/P10738 and previous config saved to /var/cache/conftool/dbconfig/20200320-083334-marostegui.json
* 17:14 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|dd7a3314602ffddc5b917cccc71c917301639388}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 04s)
* 07:59 XioNoX: reorder LVS BGP neighbors and add descriptions - https://gerrit.wikimedia.org/r/576320
* 17:13 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|5c27154cf434bebc37f5e98e2ad1b5cea7cde1d4}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 15s)
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1017', diff saved to https://phabricator.wikimedia.org/P10737 and previous config saved to /var/cache/conftool/dbconfig/20200320-074816-marostegui.json
* 16:57 mutante: stat1008 - short on disk space, mostly used in /tmp, high CPU usage by R process, sent a message about it to all shell users via wall
* 07:46 elukey: upload hadoop_2.8.5-2 (and related debs) to thirdparty/bigtop14 on wikimedia-stretch (manually rebuilt via docker after patch backports from upstream)
* 16:50 mutante: stat1008 - apt-get clean - freed 1.3 GB disk space - was alerting in Icinga because / was 97% full
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1017', diff saved to https://phabricator.wikimedia.org/P10736 and previous config saved to /var/cache/conftool/dbconfig/20200320-073205-marostegui.json
* 16:37 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 07:26 marostegui: Restart mysql on es1017 for upgrade - [[phab:T239791|T239791]]
* 16:37 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1017 for update [[phab:T239791|T239791]]', diff saved to https://phabricator.wikimedia.org/P10735 and previous config saved to /var/cache/conftool/dbconfig/20200320-070945-marostegui.json
* 16:23 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1014 to es3 master, this is a NOOP [[phab:T239791|T239791]]', diff saved to https://phabricator.wikimedia.org/P10734 and previous config saved to /var/cache/conftool/dbconfig/20200320-070922-marostegui.json
* 16:23 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:29 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:28 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:26 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:26 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 15:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:12 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:12 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 15:04 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:03 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:03 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:01 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:01 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:01 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:59 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 14:59 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:59 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:57 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:54 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:52 ema: repool cp4021, further testing can be performed on sretest1001 [[phab:T201317|T201317]]
* 14:51 volans: restarting ircecho.service on alert1001 to get back icinga-wm without the underscore
* 14:50 elukey: restart pybal on lvs1015 (low-traffic primary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 14:48 moritzm: reverted to clean package state on deneb
* 14:44 elukey@puppetmaster1001: conftool action : ge; selector: cluster=ml_serve,service=inference
* 14:36 elukey: restart pybal on lvs1016 (low-traffic secondary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 14:27 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:27 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:25 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:25 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:21 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:21 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:20 moritzm: temporarily downgrade sphinx packages on deneb to 1.7.9-1~bpo9+1 to build a Ganeti 2.16 stretch backport with delicate toolchain needs
* 14:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:59 XioNoX: push prep-work for anycast tuning in ulsfo - [[phab:T288843|T288843]]
* 13:38 jayme: imported helm-diff_3.1.3-2 to buster-wikimedia (https://gerrit.wikimedia.org/r/c/operations/debs/helm-diff/+/730509)
* 13:37 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 13:34 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 12:13 Lucas_WMDE: UTC morning backport+config window done
* 12:12 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/includes: Backport: [[gerrit:730370{{!}}Add Link: Do not log "no suggestion found" errors in production log (T291251)]] (duration: 01m 04s)
* 12:11 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]'  # after applying 730512 at mwmaint1002 to workaround [[phab:T293219|T293219]] # [[phab:T255037|T255037]]
* 12:11 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/modules: Backport: [[gerrit:730371{{!}}Suggested Edits: Update local config.presets when topics/difficulty presets change (T292536)]] (duration: 01m 07s)
* 11:56 urbanecm@deploy1002: Synchronized wmf-config/config/itwiki.yaml: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]) (duration: 01m 04s)
* 11:55 urbanecm: mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=mediawikiwiki "Growth/Communities/How to introduce yourself as a mentor" "Growth/Communities/How to configure the mentors' list" "Martin Urbanec (WMF)" --reason '[[:phab:T293184]]' # [[phab:T293184|T293184]]
* 11:55 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 2/3) (duration: 01m 04s)
* 11:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 1/3) (duration: 01m 05s)
* 11:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]' # [[phab:T255037|T255037]]
* 11:49 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=itwiki growthexperiments # [[phab:T255037|T255037]]
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:730380{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 07s)
* 11:43 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Wikibase/repo/: Backport: [[gerrit:730385{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 18s)
* 11:33 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 11:19 ema: pool cp4021 after reimage [[phab:T201317|T201317]]
* 11:05 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 10:15 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:09 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:728490{{!}}Add more types of QuickSurveys on beta cluster (T292459)]] (duration: 01m 53s)
* 10:06 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 09:22 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 08:35 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:21 elukey: run kafka preferred-replica-election on kafka-main1001 to rebalance partition leaders - [[phab:T288825|T288825]]
* 08:15 godog: bounce graphite on graphite1004 to apply new config
* 07:33 elukey: increase kafka topic partition size of the top 4 high traffic topics of main-eqiad as described in https://phabricator.wikimedia.org/T288825#7422726
* 07:13 XioNoX: provision new eqsin-ulsfo link - [[phab:T273308|T273308]]
* 06:26 elukey: `kafka topics --alter --topic <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change --partitions 3` on kafka-main2001 - [[phab:T288825|T288825]]
* 00:38 ejegg: updated payments-wiki from {{Gerrit|030b11da1a}} to {{Gerrit|b329d2dea2}}


== 2020-03-19 ==
== 2021-10-12 ==
* 22:15 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@794f099]: Update mobileapps to {{Gerrit|99869f45}} (duration: 05m 13s)
* 23:48 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 22:10 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@794f099]: Update mobileapps to {{Gerrit|99869f45}}
* 23:16 urbanecm: UTC late B&C window done
* 19:14 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.24
* 23:15 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 01m 04s)
* 18:30 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/Wikibase/lib/includes/Store/ByIdDispatchingEntityInfoBuilder.php: [[gerrit:581674{{!}}Fix 'max' to Int32EntityId::MAX conversion (T247985)]], part II (duration: 01m 07s)
* 23:12 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 02m 09s)
* 18:24 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/Wikibase/data-access/src/SingleEntitySourceServices.php: [[gerrit:581674{{!}}Fix 'max' to Int32EntityId::MAX conversion (T247985)]], part I (duration: 01m 08s)
* 23:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 17:47 mutante: releases/releases-jenkins - closed firewall hole to port 80 for caching servers - kept it open just for envoy from the backends - ATS speaks https to them meanwhile
* 22:53 urbanecm: [urbanecm@labweb1001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=labswiki Jamesmontalvo3 #
* 16:54 hashar@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/RelatedArticles: Do not register "" as a style path, that breaks ResourceLoader - [[phab:T248090|T248090]] (duration: 01m 07s)
* 22:51 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 16:01 jeh@deploy1001: Finished deploy [horizon/deploy@ad60c2b]: update horizon designate-dashboard submodule (duration: 03m 31s)
* 20:21 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:57 jeh@deploy1001: Started deploy [horizon/deploy@ad60c2b]: update horizon designate-dashboard submodule
* 19:31 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.1 (duration: 04m 02s)
* 15:19 andrew@deploy1001: deploy aborted: modest css change for the hiera editing dialog (take two -- I consistently forget to rebase before doing this) (duration: 00m 00s)
* 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:52 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 19:02 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 14:48 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 18:47 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 45m 36s)
* 14:48 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' .
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 13:32 hashar@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.24 (duration: 01m 07s)
* 18:01 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 13:31 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.24
* 17:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:11 marostegui: Rename testwikidatawiki.wb_terms on db1078 - [[phab:T248086|T248086]]
* 17:56 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/CentralNotice: Backport: [[gerrit:730141]] (duration: 00m 59s)
* 12:33 XioNoX: push frack fw policies [[phab:T248004|T248004]]
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:43 Lucas_WMDE: EU SWAT done
* 17:46 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 11:40 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.24/includes/OutputPage.php: SWAT: [[gerrit:581245{{!}}OutputPage: Fix warning when setting wgUserNewMsgRevisionId (T248049)]] (duration: 01m 08s)
* 17:43 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:15 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|e277d29}}: trwiki: Grant interface editors editprotected & editsemiprotected ([[phab:T247672|T247672]]; take II) (duration: 01m 08s)
* 17:41 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:13 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|e277d29}}: trwiki: Grant interface editors editprotected & editsemiprotected ([[phab:T247672|T247672]]) (duration: 01m 07s)
* 17:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:47 ema: upload atskafka 0.4 to buster-wikimedia [[phab:T237993|T237993]]
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:24 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.24/skins/Vector/skin.json: [[gerrit:581248{{!}}skins.vector.styles.legacy needs to define legacy feature (T247566)]] (duration: 01m 08s)
* 17:32 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SyntaxHighlight_GeSHi/includes/ResourceLoaderPygmentsModule.php: Backport: [[gerrit:730233{{!}}Include generated styles before Mediawiki overrides (T292736)]] (duration: 00m 57s)
* 10:01 ema: cp: rolling ats-tls-restart to apply log format changes [[phab:T248067|T248067]] [[phab:T237993|T237993]]
* 17:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:26 marostegui: m2 maintenance window done [[phab:T246098|T246098]]
* 17:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:03 akosiaris: restart gerrit on gerrit1001 [[phab:T246098|T246098]]
* 17:23 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730236{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 09:02 akosiaris: restart otrs-daemon, apache on mendelevium [[phab:T246098|T246098]]
* 17:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:01 akosiaris: restart recommendation-api on scb [[phab:T246098|T246098]]
* 17:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:00 marostegui: Restart m2 primary database master - [[phab:T246098|T246098]]
* 17:16 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730235{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 08:48 dcausse: depooling wdqs1006 to help catching up lag
* 17:12 moritzm: installing rsync bugfix updates
* 08:43 dcausse: restarting blazegraph on wdqs1006 ([[phab:T242453|T242453]])
* 17:09 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 07:54 moritzm: installing cups updates from Stretch point release
* 16:56 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 07:48 moritzm: installing libjaxen-java security updates from Stretch point release
* 16:55 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 07:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Update pc1008 spare situation [[phab:T247787|T247787]] (duration: 01m 09s)
* 16:53 moritzm: failed over ganeti master for test cluster to ganeti2025
* 06:49 elukey: execute 'sudo rm /etc/logrotate.d/ceph-common' on cloudvirt-dev and cloudcontrol-dev to stop daily cronspam
* 16:50 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 06:46 marostegui: Deploy schema change on testcommonswiki.globalimagelinks (empty table) on the s4 master [[phab:T243987|T243987]]
* 16:48 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 06:33 marostegui: Upgrade db1132 without restarting [[phab:T246098|T246098]]
* 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:39 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.24 refs [[phab:T233872|T233872]]
* 16:30 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts testvm2009.codfw.wmnet
* 00:31 twentyafterfour@deploy1001: Synchronized php-1.35.0-wmf.24/skins/Vector/includes/templates/index.mustache: deploy https://gerrit.wikimedia.org/r/c/mediawiki/skins/Vector/+/581116 which reverts https://gerrit.wikimedia.org/r/c/mediawiki/skins/Vector/+/581054 refs  [[phab:T248010|T248010]] (duration: 01m 07s)
* 16:30 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 00:18 eileen: civicrm revision changed from {{Gerrit|a1b2cbeac1}} to {{Gerrit|1c477ff07f}}, config revision is {{Gerrit|37232d8460}}
* 16:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:26 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 16:26 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes: Backport: [[gerrit:730226{{!}}Pre-format comments for non-local files too (T292570)]] (duration: 01m 15s)
* 16:17 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 16:16 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2009.codfw.wmnet
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:10 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 16:09 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730231{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 00m 57s)
* 16:00 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:58 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730230{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 02m 13s)
* 15:57 volans@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2009.codfw.wmnet
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:51 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:49 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:48 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 15:48 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:41 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for analytics1069.eqiad.wmnet
* 15:41 btullis@cumin1001: START - Cookbook sre.hosts.remove-downtime for analytics1069.eqiad.wmnet
* 15:02 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:50 volans@cumin2002: START - Cookbook sre.dns.netbox
* 13:49 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:14 godog: add 50G to prometheus/k8s in eqiad
* 13:13 otto@deploy1002: Synchronized wmf-config/CommonSettings.php: Enable x_client_ip_forwarding_enabled for eventgate-analytics and eventgate-analytics-external - [[phab:T288853|T288853]] (duration: 00m 56s)
* 13:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:11 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:05 volans: upgraded spicerack to 1.0.5 on cumin hosts
* 12:25 volans: uploaded spicerack_1.0.5 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:15 elukey: `kafka topics --alter --topic codfw.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:15 elukey: `kafka topics --alter --topic eqiad.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:10 elukey: `kafka topics --alter --topic codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:09 elukey: `kafka topics --alter --topic eqiad.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 11:58 elukey: `kafka topics --alter --topic codfw.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:49 elukey: `kafka topics --alter --topic eqiad.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 11:34 urbanecm: UTC morning B&C window done
* 11:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|860ea0944d6dc1e6b5061eb84eec378eb5ac8441}}: Remove NS_MAIN from wgExtraSignatureNamespaces on most special wikis ([[phab:T291630|T291630]]) (duration: 00m 57s)
* 11:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:14 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 urbanecm@deploy1002: Synchronized w/static.php: {{Gerrit|e77ae17efb34723598fc69e87109944384df442a}}: static.php: correctly report a bad request (duration: 00m 57s)
* 11:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2003.codfw.wmnet
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 10:30 ema: apply https://gerrit.wikimedia.org/r/726912 to all A:cp nodes [[phab:T288106|T288106]]
* 10:24 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 10:23 ema: depool/repool ats-be on cp4028 to verify updates to /etc/varnish/directors.frontend.vcl on cp4027 keep on working fine [[phab:T288106|T288106]]
* 10:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:22 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 10:16 ema: cp4027: enable and run puppet to test https://gerrit.wikimedia.org/r/726912 [[phab:T288106|T288106]]
* 10:12 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti2025.codfw.wmnet with OS buster
* 09:16 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17456 and previous config saved to /var/cache/conftool/dbconfig/20211012-091614-kormat.json
* 09:01 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17455 and previous config saved to /var/cache/conftool/dbconfig/20211012-090111-kormat.json
* 08:46 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17454 and previous config saved to /var/cache/conftool/dbconfig/20211012-084607-kormat.json
* 08:31 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17453 and previous config saved to /var/cache/conftool/dbconfig/20211012-083103-kormat.json
* 08:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:58 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|17dc3aa}}, {{Gerrit|e0ca905}}, {{Gerrit|c0f4f4e}}: GrowthExperiments backports ([[phab:T292224|T292224]], [[phab:T290609|T290609]], [[phab:T290609|T290609]]) (duration: 00m 59s)
* 07:40 elukey: run kafka preferred-replica-election on kafka-main2001 to rebalance partition leaders after the last topic moves - [[phab:T288825|T288825]]
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 07:22 moritzm: installing RT security updates
* 04:43 eileen: civicrm revision changed from {{Gerrit|96090e4bd2}} to {{Gerrit|946dfb6c5a}}, config revision is {{Gerrit|85277466ed}}
* 03:56 kart_: cxserver: Remove Matxin Key from Production ([[phab:T292635|T292635]])
* 03:54 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:48 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:45 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 02:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 eileen: civicrm revision changed from {{Gerrit|598b59b0ee}} to {{Gerrit|96090e4bd2}}, config revision is {{Gerrit|85277466ed}}


== 2020-03-18 ==
== 2021-10-11 ==
* 23:31 twentyafterfour@deploy1001: Synchronized php-1.35.0-wmf.23/includes/TemplateParser.php: sync https://gerrit.wikimedia.org/r/c/mediawiki/core/+/581114/ refs [[phab:T248010|T248010]] (duration: 01m 07s)
* 21:25 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 23:26 twentyafterfour@deploy1001: Synchronized php-1.35.0-wmf.24/includes/TemplateParser.php: sync https://gerrit.wikimedia.org/r/c/mediawiki/core/+/581115/ (duration: 01m 08s)
* 20:58 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 22:22 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 22:18 volans@cumin1001: START - Cookbook sre.dns.netbox
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 21:56 Krinkle: krinkle@mw1385: scap pull # clean up AdHoc debugging for [[phab:T248010|T248010]]
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 21:16 brennen@deploy1001: Synchronized php-1.35.0-wmf.24/skins/Vector/includes/templates/index.mustache: [[gerrit:581054{{!}}Change master template to force cache invalidation of partials]] (duration: 01m 06s)
* 15:31 jgleeson: smashpig updated from {{Gerrit|3607b16f83}} to {{Gerrit|dd3a81c7c2}}
* 21:11 brennen@deploy1001: Synchronized php-1.35.0-wmf.23/skins/Vector/includes/templates/index.mustache: [[gerrit:581054{{!}}Change master template to force cache invalidation of partials]] (duration: 01m 15s)
* 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 20:04 volans@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 19:58 volans@cumin1001: START - Cookbook sre.dns.netbox
* 14:36 Emperor: start restoring weight to ms-be2045 [[phab:T290881|T290881]]
* 19:49 hashar@deploy1001: rebuilt and synchronized wikiversions files: Ensure fleet wide consistency
* 13:42 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 19:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 12:53 moritzm: install apache security updates on buster
* 19:21 mutante: shutting down (decom cookbook) elnath.codfw.wmnet ([[phab:T188544|T188544]])
* 12:49 topranks: Setting up BGP peering to AS12552 (GlobalConnect Group) at AMS-IX on cr2-esams
* 19:20 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 12:45 ema: cp4027: upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 19:15 fdans@deploy1001: Finished deploy [analytics/refinery@549f6a4]: deploying analytics refinery (duration: 15m 02s)
* 12:04 moritzm: install apache security updates on bullseye
* 19:11 hashar: 1.35.0-wmf.24 is on hold: too many blockers
* 10:23 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 19:00 fdans@deploy1001: Started deploy [analytics/refinery@549f6a4]: deploying analytics refinery
* 09:50 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 18:32 Lucas_WMDE: Morning SWAT done
* 09:45 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 18:30 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 09:37 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 18:27 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: [[gerrit:579018{{!}}Update linter whitelist w/ parsoid11's IP address (T246833)]] (beta-only) (duration: 01m 04s)
* 09:13 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 18:20 Lucas_WMDE: scap pull on mwdebug1001, attempting to fix mismatched wikiversions alert
* 09:09 elukey: force kafka preferred-replica-election on kafka-main2001 after the first 50 topic partitions moves - [[phab:T288825|T288825]]
* 18:14 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: [[gerrit:580373{{!}}Add beta configuration for Wikibase reference formatting (T247416)]] (duration: 01m 08s)
* 09:05 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 18:13 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 09:01 godog: bounce swift-object-replicator on ms-be2036
* 18:13 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: [[gerrit:580373{{!}}Add beta configuration for Wikibase reference formatting (T247416)]], take II (duration: 01m 07s)
* 08:52 godog: bounce statsite on graphite1004 to apply unit config changes
* 18:11 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 08:48 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 18:11 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: [[gerrit:580373{{!}}Add beta configuration for Wikibase reference formatting (T247416)]] (duration: 01m 07s)
* 08:41 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet
* 16:43 mutante: wtp1025 - Icinga alerted it's running out of disk - 'apt-get clean' lowered disk usage from 97% to 91%
* 08:38 moritzm: updated buster d-i image for Bullseye 11.1 point release [[phab:T292844|T292844]]
* 16:00 hashar@deploy1001: Finished scap: testwiki to 1.35.0-wmf.24 and rebuild l10n cache - [[phab:T233872|T233872]] (duration: 61m 23s)
* 08:38 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 14:58 hashar@deploy1001: Started scap: testwiki to 1.35.0-wmf.24 and rebuild l10n cache - [[phab:T233872|T233872]]
* 08:26 godog: swift eqiad-prod: final weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 14:41 vgutierrez: disable TLS session tickets in ulsfo - [[phab:T245616|T245616]] [[phab:T170567|T170567]]
* 08:25 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 14:29 godog: add debug to icinga2001 - [[phab:T247538|T247538]]
* 08:24 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet
* 14:28 _joe_: restarted php-fpm on mw1283, was throwing SIGILL
* 08:06 godog: bounce uwsgi on graphite hosts to bump request size limit - [[phab:T292877|T292877]]
* 14:17 marostegui: Rename wb_terms on codfw hosts: s8 (wikidatawiki - db2081), s3 (testwikidatawiki - db2109), s4 (commonswiki, testcommonswiki - db2106) [[phab:T208425|T208425]]
* 07:58 volans: migrating physical hosts DHCP to the new reimage process - [[phab:T269855|T269855]]
* 14:06 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.23
* 07:57 elukey: start kafka topics rebalancing for main-codfw (long running maintenance) - [[phab:T288825|T288825]]
* 11:59 hashar@deploy1001: Synchronized php-1.35.0-wmf.24/includes/objectcache/ObjectCache.php: objectcache: Restore keyspace for LocalServerCache service - [[phab:T247562|T247562]] (duration: 01m 07s)
* 11:57 hashar@deploy1001: Synchronized php-1.35.0-wmf.23/includes/objectcache/ObjectCache.php: objectcache: Restore keyspace for LocalServerCache service - [[phab:T247562|T247562]] (duration: 01m 10s)
* 11:42 marostegui@cumin1001: dbctl commit (dc=all): 'Decrease db1087, vslow host weight in main, given that the CPU across s8 is now doing a lot better', diff saved to https://phabricator.wikimedia.org/P10715 and previous config saved to /var/cache/conftool/dbconfig/20200318-114259-marostegui.json
* 11:17 ema: upload atskafka 0.3 to buster-wikimedia [[phab:T237993|T237993]]
* 11:16 kart_: EU Mid-day SWAT done
* 11:11 kartik@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:579893{{!}}Enable ContentTranslation as a default tool in Malay, Azerbaijani and Estonian WPs (T246622, T246628, T246629)]], take II (duration: 01m 07s)
* 11:10 kartik@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:579893{{!}}Enable ContentTranslation as a default tool in Malay, Azerbaijani and Estonian WPs (T246622, T246628, T246629)]] (duration: 01m 07s)
* 10:58 _joe_: setting num_retries=0 on mw2224 for eventgate-analytics in envoy ([[phab:T247484|T247484]])
* 10:58 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Stop writing to old term store (wb_terms table) in wikidata (T208425)]], take II (duration: 01m 06s)
* 10:55 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Stop writing to old term store (wb_terms table) in wikidata (T208425)]] (duration: 01m 08s)
* 10:52 _joe_: setting num_retries=0, idle_timeout=5s on mw2223 for eventgate-analytics in envoy ([[phab:T247484|T247484]])
* 10:48 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Stop writing to old term store in testwikidatawiki (T208425)]], take II (duration: 01m 07s)
* 10:45 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Stop writing to old term store in testwikidatawiki (T208425)]] (duration: 01m 07s)
* 10:33 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Read from the new term store everywhere (T219123)]], take II (duration: 01m 07s)
* 10:31 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Read from the new term store everywhere (T219123)]] (duration: 01m 07s)
* 10:14 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Read from the new term store everywhere (T219123)]], take II (duration: 01m 07s)
* 10:12 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Read from the new term store everywhere (T219123)]] (duration: 01m 08s)
* 09:43 vgutierrez: enabling inbound TLSv1.3 in upload@ulsfo - [[phab:T170567|T170567]]
* 09:18 vgutierrez: enabling inbound TLSv1.3 in cp4026 - [[phab:T170567|T170567]]
* 08:44 marostegui: Start replication pc1008 from pc1010 to get some of the new keys so it is not fully empty - [[phab:T247787|T247787]]
* 08:14 vgutierrez: upgrade ATS to 8.0.6-1wm3 in ulsfo - [[phab:T170567|T170567]]
* 07:55 moritzm: installing remaining libxslt security updates
* 07:40 oblivian@deploy1001: Synchronized wmf-config/ProductionServices.php: eventgate-analytics to use envoy everywhere (duration: 01m 10s)
* 07:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 06:31 marostegui: Reboot pc1008 to try to get its RAID redone - [[phab:T247787|T247787]]
* 00:31 Amir1: foreachwikiindblist medium deleteEqualMessages.php --delete ([[phab:T247562|T247562]])
* 00:10 crusnov@deploy1001: Finished deploy [netbox/deploy@14256f9]: netbox 2.7.10 upgrade (duration: 02m 29s)
* 00:08 crusnov@deploy1001: Started deploy [netbox/deploy@14256f9]: netbox 2.7.10 upgrade
* 00:07 crusnov@deploy1001: Finished deploy [netbox/deploy@14256f9]: netbox 2.7.10 upgrade (duration: 01m 17s)
* 00:06 crusnov@deploy1001: Started deploy [netbox/deploy@14256f9]: netbox 2.7.10 upgrade


== 2020-03-17 ==
== 2021-10-09 ==
* 22:49 Amir1: warming up cache for Q80M to Q88M for new term store on db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 05:01 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:17 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@0adead4]: Update mobileapps to {{Gerrit|ec6fd6e}} (duration: 06m 08s)
* 04:28 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:11 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@0adead4]: Update mobileapps to {{Gerrit|ec6fd6e}}
* 01:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 21:54 Krinkle: krinkle@mw2170$ disable-puppet (Testing for [[phab:T99740|T99740]])
* 00:46 mutante: ms-be2045 - started systemd-timedated which had been killed by something
* 21:15 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: WikimediaEditorTasks: Enable Depicts counting (again) ([[phab:T247874|T247874]]) (duration: 01m 07s)
* 00:28 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 21:10 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: WikimediaEditorTasks: Enable Depicts counting ([[phab:T247874|T247874]]) (duration: 01m 07s)
* 00:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.force-unfreeze (exit_code=99)
* 20:50 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.23/extensions/WikimediaEditorTasks: Fix revert counting for non-language-specific counters, take 2 ([[phab:T244974|T244974]]) (duration: 01m 12s)
* 00:23 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.force-unfreeze
* 20:33 mutante: boron - systemctl start docker-reporter-k8s-images ; systemctl start docker-reporter-releng-images
* 00:13 ryankemper: [[phab:T292814|T292814]] Write queue stuck at 133 events in partition 1 of topic `codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite`, will try again at another time
* 20:31 mutante: boron - had degraded systemd state in Icinga - systemctl start docker-reporter-base-images
* 00:12 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 19:54 mutante: miscweb1001 - restarted ferm, reverted live hack
* 19:53 ppchelko@deploy1001: Finished deploy [restbase/deploy@8db09ed]: Various PCS endpoints additions and fixes [[phab:T247295|T247295]] [[phab:T247096|T247096]] [[phab:T244175|T244175]] (duration: 14m 31s)
* 19:51 mutante: miscweb1001 - testing if ferm 80 firewall hole is needed for envoy, temp. disabled puppet, restarted ferm
* 19:38 ppchelko@deploy1001: Started deploy [restbase/deploy@8db09ed]: Various PCS endpoints additions and fixes [[phab:T247295|T247295]] [[phab:T247096|T247096]] [[phab:T244175|T244175]]
* 19:01 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q80M (T219123)]], take II (duration: 01m 06s)
* 19:00 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q80M (T219123)]] (duration: 01m 07s)
* 18:53 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.24/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseItemTermStoreWriter.php: [[gerrit:580390{{!}}Do not lock rows when there's no term returned (T247553 T246898)]], To catch the train (duration: 01m 08s)
* 18:50 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 18:45 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:45 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:41 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 18:39 mutante: removing mw1238 through mw1243 - decom with cookbook ([[phab:T247780|T247780]] [[phab:T245099|T245099]])
* 18:38 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:37 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 18:35 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw123[8-9].eqiad.wmnet
* 18:35 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw124[0-3].eqiad.wmnet
* 18:29 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 18:01 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@b6bff94]: Update mobileapps to {{Gerrit|3c73ca3}} (duration: 06m 06s)
* 18:00 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 17:58 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 17:56 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.23/languages/LanguageConverter.php: [[gerrit:580361{{!}}languages: Don't assume  in LanguageConverter (T235360)]] (duration: 01m 07s)
* 17:55 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@b6bff94]: Update mobileapps to {{Gerrit|3c73ca3}}
* 17:55 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 17:53 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw124[0-3].eqiad.wmnet
* 17:53 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw123[89].eqiad.wmnet
* 17:52 Amir1: warming up cache for Q70M to Q80M for new term store on db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 17:46 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.23/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseItemTermStoreWriter.php: [[gerrit:580352{{!}}Do not lock rows when there's no term returned (T247553 T246898)]] (duration: 01m 07s)
* 17:42 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 17:40 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 17:37 ejegg: updated payments-wiki from {{Gerrit|86ce0361f9}} to {{Gerrit|72856949a1}}
* 17:30 bearND: mobileapps deploy failed on canary, rolled back
* 17:29 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@266e6da]: Update mobileapps to {{Gerrit|6370784}} (duration: 04m 00s)
* 17:25 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@266e6da]: Update mobileapps to {{Gerrit|6370784}}
* 17:24 elukey@deploy1001: Finished deploy [analytics/superset/deploy@3f3ddcb]: Upgrade PyHive to 0.6.2 (duration: 00m 43s)
* 17:24 elukey@deploy1001: Started deploy [analytics/superset/deploy@3f3ddcb]: Upgrade PyHive to 0.6.2
* 17:18 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1280.eqiad.wmnet
* 17:17 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw1280.eqiad.wmnet
* 17:10 jynus: purging some old rows on pc1010 on a screen to earn some time [[phab:T247788|T247788]]
* 16:56 mutante: mw1280 - scap pull - had ancient mw version due to downtime
* 16:46 mutante: mw1280 back after long downtime due to broken RAM, added back into puppet ([[phab:T240187|T240187]])
* 16:36 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:36 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 15:56 brennen@deploy1001: rebuilt and synchronized wikiversions files: Reverting All wikis to 1.35.0-wmf.23
* 15:52 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 15:52 brennen@deploy1001: sync-wikiversions aborted: All wikis to 1.35.0-wmf.23 (duration: 05m 16s)
* 15:51 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 15:50 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 15:44 brennen@deploy1001: sync-wikiversions aborted: All wikis to 1.35.0-wmf.23 (duration: 03m 49s)
* 15:36 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 15:36 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 15:23 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 15:11 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 15:01 hashar: scap prep 1.35.0-wmf.24 and applying security patches # [[phab:T233872|T233872]]
* 15:00 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 14:57 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:44 dcausse: wdqs1010 (test server) is running a data-reload cookbook (and is probably taking longer than the expected downtime)
* 14:38 hashar: mediawiki/core git push {{Gerrit|68bc9300dc}}:wmf/1.35.0-wmf.24 to catch up with a change that got merged while the branch was being cut # [[phab:T233872|T233872]]
* 14:29 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q70M (T219123)]], take II (duration: 01m 04s)
* 14:28 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q70M (T219123)]] (duration: 01m 10s)
* 14:24 marostegui: Stop mysql and restart pc1008 [[phab:T247787|T247787]]
* 14:23 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:21 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:14 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.23/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseItemTermStoreWriter.php: [[gerrit:580328{{!}}Store item terms as late as possible to avoid deadlocks (T247553 T246898)]] (duration: 01m 07s)
* 14:13 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:12 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 14:09 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:07 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 14:07 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 14:06 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 14:03 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 13:41 hashar: Branching 1.35.0-wmf.24 # [[phab:T233872|T233872]]
* 13:30 godog: stop puppet and turn on debug on icinga2001 - [[phab:T247538|T247538]]
* 12:06 cdanis@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 12:06 cdanis@cumin1001: START - Cookbook sre.network.cf
* 11:46 godog: test pinning icinga to a subset of cpu on icinga1001
* 11:16 akosiaris: [[phab:T242461|T242461]] undeploy restrouter. Unused service and, per the task, not to be used after all
* 11:16 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'restrouter' for release 'staging' .
* 11:15 akosiaris@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'restrouter' for release 'production' .
* 11:15 akosiaris@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'restrouter' for release 'production' .
* 10:56 XioNoX: add extra prepend to LG export filter
* 10:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:41 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:40 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:40 jbond42: sec update for libgraphicsmagick on maps
* 10:20 godog: bounce squid on install1003 [[phab:T247759|T247759]]
* 10:07 _joe_: sudo cumin -b2 -s 50 'A:mw-jobrunner' 'restart-php7.2-fpm' [[phab:T247622|T247622]]
* 10:03 Amir1: warming up cache for Q60M to Q70M for new term store on db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 10:02 ema: create kafka topic atskafka_test_webrequest_text [[phab:T247497|T247497]]
* 09:57 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)
* 09:55 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q60M (T219123)]], take II (duration: 01m 05s)
* 09:54 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q60M (T219123)]] (duration: 01m 09s)
* 09:27 elukey@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers
* 09:21 ema: cp: rolling varnish-frontend-restart to decrease memory usage and apply transient storage limits [[phab:T185968|T185968]]
* 09:09 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)
* 08:39 elukey@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers
* 00:57 krinkle@deploy1001: Synchronized php-1.35.0-wmf.23/extensions/Wikibase/lib/includes/Formatters/: {{Gerrit|Ic77b2c6b33a}}, [[phab:T247458|T247458]] (duration: 01m 12s)


== 2020-03-16 ==
== 2021-10-08 ==
* 23:14 tzatziki: reset email for "MNadrofsky (WMF)" on SUL and officewiki
* 23:16 legoktm: sudo cumin -b 10 C:mediawiki::packages 'apt-get purge lilypond-data -y'
* 20:58 mutante: mw1223 power down
* 23:10 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:54 mutante: powercycling mw1223
* 21:38 mutante: mwmaint2002 - disable-puppet, stop bacula-fd, recovery in progress
* 20:52 mutante: 5 old API appservers in eqiad removed
* 21:34 mutante: disabling puppet on bacula - going through a restore https://wikitech.wikimedia.org/wiki/Bacula#Restore_from_a_non-existent_host_(missing_private_key)
* 20:45 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 21:30 legoktm: running puppet across C:mediawiki::packages to uninstall lilypond and ploticus: legoktm@cumin1001:~$ sudo cumin -b 4 C:mediawiki::packages 'run-puppet-agent'
* 20:43 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:42 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw122[1-6].eqiad.wmnet
* 20:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 20:37 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:35 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 20:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 20:04 mutante: depool (yes->no) mw1221 - mw1226 ([[phab:T247780|T247780]])
* 20:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 20:04 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw122[1-6].eqiad.wmnet
* 20:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 19:28 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@f5600d6]: Update mobileapps to {{Gerrit|8a6e403}} (duration: 06m 48s)
* 19:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 19:26 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 19:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 19:24 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventstreams' for release 'canary' .
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 19:23 jynus: stop replication at pc1010 at pos pc1007-bin.080617:259138670
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 19:21 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@f5600d6]: Update mobileapps to {{Gerrit|8a6e403}}
* 19:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 19:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool pc1010 instead of pc1008 as pc1008 is overloaded (duration: 01m 06s)
* 19:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 18:38 krinkle@deploy1001: Synchronized wmf-config/: {{Gerrit|I2c3217fb3da8bb65}} (duration: 01m 07s)
* 18:15 cstone: civicrm revision changed from {{Gerrit|5cb7d487cb}} to {{Gerrit|598b59b0ee}}
* 18:36 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: no-op, courtesy of opcache (duration: 01m 06s)
* 16:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=enwiki --force # to measure performance on a large wiki
* 18:34 krinkle@deploy1001: Synchronized docroot/noc/: {{Gerrit|I2c3217fb3}} (duration: 01m 07s)
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 18:18 mforns@deploy1001: Finished deploy [analytics/refinery@1681b92]: deploying refinery to add forgotten artifacts for v0.0.118 (duration: 13m 01s)
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 18:05 mforns@deploy1001: Started deploy [analytics/refinery@1681b92]: deploying refinery to add forgotten artifacts for v0.0.118
* 15:29 jelto: enable puppet on gitlab1001 again for [[phab:T283076|T283076]]
* 17:08 Amir1: warming up cache for Q50M to Q60M for new term store on db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 14:05 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:06 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q50M (T219123)]], take II (duration: 01m 08s)
* 14:01 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:03 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q50M (T219123)]] (duration: 01m 06s)
* 09:49 Amir1: wikiadmin@10.64.16.85(wikidatawiki)> delete from wb_changes_subscription where cs_subscriber_id in ('testcommonswiki', 'mowiki');
* 16:54 gehel: repooling wdqs1005
* 09:39 Emperor: installing stress on ms-be2045 given recent h/w issues [[phab:T290881|T290881]]
* 16:52 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Enforce Content Security Policy if wmgUseCSP is set [[phab:T244124|T244124]] (duration: 01m 06s)
* 08:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:50 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 01m 07s)
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wmgUseCSP false everywhere [[phab:T244124|T244124]] (duration: 01m 07s)
* 08:04 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force
* 16:34 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|I498e2ebd8c9}} (duration: 01m 07s)
* 07:43 Emperor: reboot ms-be2045 [[phab:T290881|T290881]]
* 16:33 krinkle@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: {{Gerrit|I498e2ebd8c9}} (no-op) (duration: 01m 07s)
* 07:41 gehel: manually resuming the data reloads on wdqs1009 and wdqs2008
* 16:30 krinkle@deploy1001: Synchronized wmf-config/wgConf.php: {{Gerrit|I870122f946d}} (duration: 01m 07s)
* 06:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 16:22 rlazarus: copied envoyproxy_1.13.1-1 from buster-wikimedia to stretch-wikimedia
* 06:42 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 16:21 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|I08af45e2e47}} (duration: 01m 07s)
* 06:28 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 16:14 krinkle@deploy1001: Synchronized wmf-config/wgConf.php: {{Gerrit|Ie9002d9095ee}} (duration: 01m 08s)
* 06:28 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 15:04 akosiaris: [[phab:T234181|T234181]] upload apertium-recursive_0.0.1-1+wmf1 to apt.wikimedia.org jessie-wikimedia/main
* 05:35 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 15:04 akosiaris: [[phab:T234181|T234181]] upload apertium-anaphora_0.0.4-1+wmf1 to apt.wikimedia.org jessie-wikimedia/main
* 04:56 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 15:02 moritzm: rolling restart of FPM/apache on netmon* to pick up libxslt security updates
* 04:32 ryankemper: [[phab:T292814|T292814]] Beginning rolling restart of `cloudelastic`: `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic restart" --nodes-per-run 1 --start-datetime 2021-10-08T03:53:49 --task-id [[phab:T292814|T292814]]` on `ryankemper@cumin1001` tmux `elastic`
* 14:22 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q40M (T219123)]], take II (duration: 01m 06s)
* 04:31 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 14:22 Amir1: warming up cache for Q40M to Q50M for new term store on db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 04:29 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 14:18 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579925{{!}}Set up read new term store up to Q40M (T219123)]] (duration: 01m 07s)
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 14:16 moritzm: rolling restart of FPM on mw1261-mw1265 to pick up libxslt security updates
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 14:15 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=wikidatawiki --from-id 87500000 --to-id 87767570 --batch-size=10 --sleep=5 ([[phab:T219123|T219123]])
* 04:23 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@8f57a56]: 0.3.89 (duration: 08m 22s)
* 14:05 moritzm: installing libxslt security updates
* 04:20 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 13:49 ema: upload atskafka 0.1 to buster-wikimedia [[phab:T237993|T237993]]
* 04:20 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 13:42 gehel: restarting blazegraph on wdqs1007
* 04:18 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 13:30 gehel: depooling wdqs1005 to catch up on lag
* 04:17 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1015', diff saved to https://phabricator.wikimedia.org/P10706 and previous config saved to /var/cache/conftool/dbconfig/20200316-124309-marostegui.json
* 04:15 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.89` on canary `wdqs1003`; proceeding to rest of fleet
* 12:09 Amir1: warming up cache for Q35M to Q40M for new term store on db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 04:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@8f57a56]: 0.3.89
* 12:09 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579913{{!}}Set up read new term store up to Q35M (T219123)]], take II (duration: 01m 07s)
* 04:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.89`. Pre-deploy tests passing on canary `wdqs1003`
* 12:05 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:579913{{!}}Set up read new term store up to Q35M (T219123)]] (duration: 01m 08s)
* 03:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 11:52 XioNoX: manually fix prometheus squid exporter on install1003
* 03:58 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 11:04 Amir1: ... for Q30M-Q35M of the new term store
* 02:04 Krinkle: krinkle@deploy1002$ echo 'https://en.wikipedia.org/static/images/project-logos/jvwiktionary.png' {{!}} mwscript purgeList.php , ref [[phab:T287425|T287425]], [[phab:T292810|T292810]]
* 11:04 Amir1: Warming up InnoDB buffer pool cache in db1111, db1126, db1104, db1092 ([[phab:T219123|T219123]])
* 00:07 tgr_: deploy window over
* 10:55 Amir1: warming up db1026 for up to Q35M for the new term store ([[phab:T219123|T219123]])
* 00:05 tgr@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments: Backport: [[gerrit:727498{{!}}Mentee overview: Make UncachedMenteeOverviewDataProvider::getBlocksForUsers faster (T290609)]] (duration: 00m 56s)
* 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1015', diff saved to https://phabricator.wikimedia.org/P10705 and previous config saved to /var/cache/conftool/dbconfig/20200316-104723-marostegui.json
* 10:45 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: "Set term store to WRITE_BOTH for all of Wikidata" ([[phab:T219123|T219123]]), take II (duration: 01m 07s)
* 10:43 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: "Set term store to WRITE_BOTH for all of Wikidata" ([[phab:T219123|T219123]]) (duration: 01m 13s)
* 10:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1015', diff saved to https://phabricator.wikimedia.org/P10704 and previous config saved to /var/cache/conftool/dbconfig/20200316-104002-marostegui.json
* 10:36 elukey: roll restart of recommendation service on scb* as attempt to fix the flapping alerts - [[phab:T247732|T247732]]
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1015', diff saved to https://phabricator.wikimedia.org/P10703 and previous config saved to /var/cache/conftool/dbconfig/20200316-102829-marostegui.json
* 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1015', diff saved to https://phabricator.wikimedia.org/P10702 and previous config saved to /var/cache/conftool/dbconfig/20200316-101707-marostegui.json
* 10:10 marostegui: Stop mysql for upgrade on es1015 [[phab:T239791|T239791]]
* 10:02 Amir1: start of ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=wikidatawiki --batch-size=50 --sleep=0 --file=15march2217-holes-nulls.list on screen ([[phab:T219123|T219123]])
* 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1015 for upgrade and restart [[phab:T239791|T239791]]', diff saved to https://phabricator.wikimedia.org/P10701 and previous config saved to /var/cache/conftool/dbconfig/20200316-093228-marostegui.json
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1011 to es2 master, this is a NOOP [[phab:T239791|T239791]]', diff saved to https://phabricator.wikimedia.org/P10700 and previous config saved to /var/cache/conftool/dbconfig/20200316-093048-marostegui.json
* 08:16 marostegui: Review and enable events on recently migrated 10.4 hosts - [[phab:T247728|T247728]]
* 08:02 ema: cp4025 restart trafficserver-tls to clear 'tls process restarted' alert [[phab:T241593|T241593]] [[phab:T185968|T185968]]
* 07:57 moritzm: installing libxslt security updates
* 07:52 ema: cp4025: restart varnish-fe to clear 'child restarted' alert [[phab:T185968|T185968]]
* 07:47 moritzm: installing lxml security updates
* 07:14 moritzm: installing libgd2 security updates on jessie
* 06:54 moritzm: removing some library packages from jessie/stretch after labstore1006/1007 dist-upgrade to buster
* 06:38 _joe_: restart envoy with 10 requests per connection on mw2231, [[phab:T247484|T247484]]


== 2020-03-15 ==
== 2021-10-07 ==
* 23:20 jynus: removed oldest snapshots on dbprov1001
* 23:43 thcipriani@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 3/3 (duration: 00m 55s)
* 13:27 dcausse: restarting blazegraph on wdqs1005 [[phab:T242453|T242453]]
* 23:41 thcipriani@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 2/3 (duration: 00m 55s)
* 07:01 marostegui: Restart logrotate on db1107
* 23:40 thcipriani@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 1/3 (duration: 00m 56s)
* 23:30 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 2/2 (duration: 00m 56s)
* 23:28 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikiquote-wordmark-tr.svg: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 1/2 (duration: 00m 57s)
* 21:35 urbanecm: Password reset for SUL User:LA2-bot ([[phab:T292793|T292793]])
* 20:43 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3
* 20:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281167|T281167]]
* 20:35 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 20:35 cmooney@cumin1001: START - Cookbook sre.network.cf
* 20:23 krinkle@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Gadgets/: {{Gerrit|I7c858b8c4bc}} (duration: 00m 56s)
* 20:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Echo/: {{Gerrit|8a7ff05ba28f302adb581bf430a868bb815b4ffd}}: Revert "Use namespaced CentralAuthSessionProvider" (duration: 00m 57s)
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/CentralAuth/: {{Gerrit|c01c2e4983bad8582ddd62aeb35ac9be852d493b}}: Revert "Namespace session providers" (duration: 00m 57s)
* 19:44 urbanecm: Backporting https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/727489, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/727487 in an unsafe way -- exceptions at testwikis expected, wmf.3 is not deployed elsewhere, so this should be ok
* 19:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert all wikis to 1.38.0-wmf.2 ([[phab:T281167|T281167]])
* 19:33 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): variously blocked, rolling back to testwikis for safe deploy of backports
* 19:14 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.38.0-wmf.2
* 19:07 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 19:03 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to all wikis
* 18:50 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=test2wiki
* 18:46 sukhe: running authdns-update for [[phab:T292537|T292537]]
* 18:29 urbanecm: Morning B&C window done
* 18:28 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a946c046ae17a520f8d3463a16b1435ceb4856c}}: Deploy Growth mentor dashboard to pilot wikis ([[phab:T278920|T278920]]) (duration: 01m 04s)
* 18:23 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 03s)
* 18:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 04s)
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|31770f2b3660e7d7490c0a9ab66285c1f069732d}}: shwiki: Deploy Growth features to newcomers ([[phab:T278240|T278240]]) (duration: 01m 04s)
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|33526dfed148068585289f5ac501feda72068fd9}}: Stream config changes for android_daily_stats schema ([[phab:T286000|T286000]]) (duration: 01m 06s)
* 18:10 ejegg: updated payments-wiki from {{Gerrit|6d3560d083}} to {{Gerrit|030b11da1a}}
* 18:07 arnoldokoth: gitlab2001 re-image complete ([[phab:T283076|T283076]])
* 17:30 mutante: rebooting gitlab2001.wikimedia.org
* 16:56 arnoldokoth: downtiming gitlab2001 for re-imaging ([[phab:T283076|T283076]])
* 16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:32 hnowlan: roll restarting maps cassandra instances for java updates
* 16:19 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 16:19 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 15:07 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001 (duration: 00m 08s)
* 15:07 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001
* 14:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:49 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001 (duration: 00m 10s)
* 14:49 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001
* 14:48 hashar: Upgrading Gerrit replica to 3.3.6 # [[phab:T290236|T290236]]
* 14:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:56 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:46 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:30 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:29 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:29 hashar: restarting CI Jenkins for git plugin update
* 13:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 hashar: Upgraded CI Jenkins on contint2001
* 13:14 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:13 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:10 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:09 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:06 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 13:05 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:05 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:40 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 12:16 moritzm: installing testvm2005
* 11:59 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 11:52 Lucas_WMDE: EU backport+config window (aka UTC morning) done
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725858{{!}}Enable Content and Section Translation to Kurdish WP (T290238)]] (duration: 01m 04s)
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/WikidataPageBanner/includes/WikidataPageBannerFunctions.php: Backport: [[gerrit:727188{{!}}Change PropertyId to NumericPropertyId (T289125, T292667)]] (duration: 01m 05s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 jbond: update puppet stdlib gerrit:726872
* 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
* 09:26 mvernon@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host ms-be2045.codfw.wmnet
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2005.codfw.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
* 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2005.codfw.wmnet
* 08:49 mvernon@cumin2002: START - Cookbook sre.experimental.reimage for host ms-be2045.codfw.wmnet
* 08:36 moritzm: imported jenkins 2.303.2 to thirdparty/ci component for buster-wikimedia
* 07:57 Emperor: re-enabling puppet on ms-be2045 after hw work [[phab:T290881|T290881]]
* 07:39 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 07:39 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 07:38 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 07:37 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 07:34 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 07:33 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 07:32 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:31 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 06:21 ryankemper: [Elastic] Restart of `relforge` complete
* 06:05 ryankemper: [Elastic] Cluster in green status, proceeding to next and final node => `ryankemper@relforge1003:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:53 ryankemper: [Elastic] `ryankemper@relforge1004:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:48 ryankemper: [Elastic] Performing rolling restarts of `relforge`. `relforge1003` is the master so I'll restart `relforge1004` first to minimize disruption
* 03:00 ejegg: updated payments-wiki from {{Gerrit|23d0ffac66}} to {{Gerrit|6d3560d083}}
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: enable Parsoid API everywhere (duration: 01m 04s)
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 mutante: [grafana2001:~] $ sudo systemctl start rsync-var-lib-grafana  because of "PROBLEM - Check systemd state on grafana2001 is CRITICAL: CRITICAL - degraded" because of some race condition where a file vanished during sync


== 2020-03-14 ==
== 2021-10-06 ==
* 08:33 elukey: run kafka preferred-replica-election on kafka-jumbo1001 - [[phab:T247561|T247561]]
* 23:57 mutante: releases2002 - rm /srv/org/wikimedia/reprepro/conf/distributions - contains only jessie-mediawiki - see 725670 and EOL of MediaWiki 1.31
* 08:32 elukey: run systemctl restart systemd-timedated.service on stat1008
* 23:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:06 mutante: planet1001 - copying /etc/apt/sources.list from planet2001 to planet1001 - apt-get update - apt-get install openssh-server [[phab:T247592|T247592]]
* 23:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:21 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 23:20 jforrester@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ckb.svg: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 23:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:16 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726603{{!}}Enable NewUserMessage for ptwikivoyage (T290820)]] (duration: 01m 05s)
* 22:30 mutante: re-enabling puppet on mw*, an-worker* after deploying gerrit:726954. no issue this time
* 22:23 mutante: temp. disabling puppet on an-worker*, mw*
* 20:50 mutante: global puppet failure - revert is merged, puppet run will recover on next run everywhere. partially forcing with cumin, partially letting it recover naturally
* 20:43 mutante: [cumin1001:~] $ sudo cumin -b 5 -p 95 'mw2*' 'run-puppet-agent -q --failed-only'
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:05 brennen@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 01m 03s)
* 19:04 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 19:01 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): still unblocked after triage meeting, rolling to group1
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:44 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Revert disabling static mapframes on eswiki (duration: 01m 14s)
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:31 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: eswiki: Disable static mapframes ([[phab:T291736|T291736]]) (duration: 01m 17s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: viwikibooks: Set $wgRestrictDisplayTitle to false ([[phab:T289837|T289837]]) (duration: 01m 21s)
* 17:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:47 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:43 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to group0
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:35 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726596{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 04s)
* 16:35 jynus: stopping db1127 for hw maintenance [[phab:T292366|T292366]]
* 16:31 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:31 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:28 brennen@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726597{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 10s)
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 15:45 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): proceeding to deploy backports for [[phab:T292589|T292589]]
* 15:37 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 15:35 volans: installed spicerack 1.0.4 on cumin2002
* 12:50 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:48 volans: uploaded spicerack_1.0.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2004.codfw.wmnet
* 12:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:18 effie: pool mw1455 mw1422
* 12:17 urbanecm: wikiadmin@10.64.0.164(viwiki)> delete from growthexperiments_mentee_data; # cleanup after disabling mentor dashboard backend
* 12:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2004.codfw.wmnet
* 12:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1aa67d4846f39f59127a835cb7a8ed2974506025}}: viwiki: Disable mentor dashboard backend ([[phab:T278920|T278920]]) (duration: 01m 06s)
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:56 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2003.codfw.wmnet
* 11:55 XioNoX: esams - Advertise 185.15.59.0/24 instead of 185.15.58.0/23 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:50 jelto: disable puppet on gitlab1001 to test puppetized code on GitLab replica - [[phab:T283076|T283076]]
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1003.wikimedia.org
* 10:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1003.wikimedia.org
* 10:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 urbanecm@deploy1002: Synchronized wmf-config/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 2/2) (duration: 01m 05s)
* 10:01 urbanecm@deploy1002: Synchronized dblists/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 1/2) (duration: 01m 04s)
* 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host copernicium.wikimedia.org
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host copernicium.wikimedia.org
* 09:19 jbond: update ipaddress6 fact - https://gerrit.wikimedia.org/r/c/operations/puppet/+/726625
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:725923{{!}}Don't fail job if subscribed wiki is unknown (T292446 T292440)]] (duration: 01m 15s)
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 08:21 XioNoX: add ROAs for 185.15.58.0/24 and 185.15.59.0/24 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 08:04 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews --fix # [[phab:T291344|T291344]]
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews # [[phab:T291344|T291344]]
* 07:55 urbanecm: mwdebug1001: scap pull ([[phab:T291344|T291344]] fix done)
* 07:51 urbanecm: Staging at mwdebug1001 for [[phab:T291344|T291344]]
* 05:53 kart_: Updated cxserver to use nodejs12 ([[phab:T290754|T290754]])
* 05:47 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:39 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:36 Amir1: start of mwscript extensions/Wikibase/repo/maintenance/pruneChanges.php --wiki wikidatawiki --number-of-days=2
* 05:31 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 04:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:29 ryankemper: [WDQS] `wdqs1012` is back up after restarting blazegraph (blazegraph was locked up)
* 04:27 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph` (attempting to bring downed `wdqs1012` back into health)
* 04:25 ryankemper: [WDQS] Repooling eqiad hosts following the brief outage from earlier: `wdqs1004`, `wdqs1006`, `wdqs1007`
* 03:19 eileen: civicrm revision changed from {{Gerrit|b6f5f71c18}} to {{Gerrit|82efd2e195}}, config revision is {{Gerrit|f4c57d4733}}
* 03:11 tstarling@deploy1002: Synchronized php-1.38.0-wmf.3/includes/CommentFormatter/RowCommentIterator.php: fix UBN [[phab:T292590|T292590]] (duration: 01m 04s)
* 01:39 legoktm: legoktm@mwmaint1002:~$ echo "https://en.wikiversity.org/static/images/mobile/copyright/wikiversity.svg" {{!}}mwscript purgeList.php
* 01:17 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 03s)
* 01:12 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 17s)
* 00:59 arlolra@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable legacy media dom on metawiki (duration: 01m 05s)
* 00:37 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:35 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 03s)
* 00:32 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:29 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 04s)
* 00:16 mutante: puppetmasters: rm /etc/logrotate.d/geoipupdate && systemctl start logrotate && puppet agent -tv
* 00:14 mutante: puppetmaster2002 - rm /etc/logrotate.d/geoipupdate (not managed by puppet anymore but not removed, caused duplicate logrotate config, made logrotate service fail), start logrotate
* 00:08 cstone: civicrm revision changed from {{Gerrit|34d3c3aae8}} to {{Gerrit|b6f5f71c18}}
* 00:01 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725132{{!}}Add WN as an alias to project namespace in Polish Wikinews (T291344)]] (duration: 01m 04s)


== 2020-03-13 ==
== 2021-10-05 ==
* 23:12 bstorm_: rebooting labstore1006 for upgrade to stretch [[phab:T224583|T224583]]
* 23:54 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikiversity.svg: Config: [[gerrit:725413{{!}}Wikiversity Logo Update for 2017 Logo Version (T292109)]] (duration: 01m 03s)
* 22:49 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 04s)
* 22:45 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 23:44 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-az.svg: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 23s)
* 22:27 bstorm_: rebooting labstore1006 [[phab:T224583|T224583]]
* 23:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725386{{!}}Add image_suggestion_interaction event stream]] (duration: 01m 12s)
* 22:21 bstorm_: downtimed labstore1006 for upgrades [[phab:T224583|T224583]]
* 23:02 legoktm: deleting old stretch docker images from the registry for [[phab:T292485|T292485]]
* 20:02 mutante: stat1005 - ip link set en01 down ; ip link set en01 up ([[phab:T247561|T247561]])
* 22:24 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 19:30 bstorm_: rebooting labstore1007 for upgrade to buster [[phab:T224583|T224583]]
* 22:20 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]) rolling back to testwikis for the day; will revisit in US-morning
* 18:51 shdubsh: test increase fs.inotify.max_user_watches on prometheus2004
* 20:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 17:58 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 20:44 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/includes/page: Backport: [[gerrit:726594{{!}}Pre-format comments for non-local files too]] ([[phab:T292570|T292570]]) (duration: 01m 04s)
* 17:21 mutante: removed squid from install1002/install2002 (formerly webproxy.(eqiad{{!}}codfw).wmnet until 2 days ago, replaced by install1003/install2003) [[phab:T224576|T224576]]
* 20:18 mutante: puppetmaster1003 et al - converting maxmind geoip database fetching from cron to timers
* 17:20 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0)
* 20:06 mutante: cumin 'puppetmaster*' "disable-puppet '[[phab:T288844|T288844]] - [[phab:T273673|T273673]] - gerrit:721595 - $<nowiki>{</nowiki>USER<nowiki>}</nowiki>'"
* 17:09 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 19:30 mutante: restoring /home/amire80 from and to mwmaint2002 via Bacula bconsole ([[phab:T292573|T292573]])
* 17:08 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker
* 19:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 17:00 krinkle@deploy1001: Synchronized dblists/: {{Gerrit|If4d17082f}}, {{Gerrit|Iadba5b01b}}, {{Gerrit|Ibe16d5f09}} (duration: 01m 07s)
* 19:03 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:58 krinkle@deploy1001: Synchronized wmf-config/config/: {{Gerrit|Ibe16d5f09}} (duration: 01m 10s)
* 18:26 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.23 (duration: 01m 57s)
* 16:51 bstorm_: rebooting labstore1007 for stretch upgrade [[phab:T224583|T224583]]
* 18:23 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.21 (duration: 04m 20s)
* 16:37 krinkle@deploy1001: Synchronized wmf-config/config/: {{Gerrit|If4d17082f}}, {{Gerrit|Iadba5b01b}} (duration: 01m 11s)
* 18:21 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]): pruning old branches, starting with 1.37.0-wmf.21, proceeding to 1.37.0-wmf.23 if time allows
* 16:18 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:11 ppchelko@deploy1002: Synchronized wmf-config: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] Php72ToUpper.php removal (duration: 01m 06s)
* 16:15 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 18: