
Difference between revisions of "Server Admin Log"

== 2020-07-31 ==
== 2020-08-01 ==
* 23:48 ejegg: updated payments-wiki from {{Gerrit|c365c136d2}} to {{Gerrit|cd012f37f1}}
* 16:30 Amir1: wikiadmin@10.64.32.197(avkwiki)> delete from site_identifiers; ([[phab:T259122|T259122]])
* 22:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:27 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T259122|T259122]])
* 22:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:03 mutante: wtp2019 - parsoid could not start after reimaging - was missing /etc/parsoid/config.yaml, which is a symbolic link deep into /srv/deployment/parsoid/deploy-cache/.. ; as in some earlier cases, manually deleted the deploy-cache dir and ran puppet again ([[phab:T258775|T258775]])
* 21:58 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2020.codfw.wmnet
* 21:57 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=wtp2019.codfw.wmnet
* 21:50 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2019.codfw.wmnet
* 21:36 mutante: [wtp2019:~] $ sudo rm -rf /srv/deployment/parsoid/deploy-cache
* 21:13 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2018.codfw.wmnet
* 20:57 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:55 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:22 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:22 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:11 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:11 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:03 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:02 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:51 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:51 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:39 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:39 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:21 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2017.codfw.wmnet
* 18:57 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2016.codfw.wmnet
* 17:45 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:45 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:45 mutante: rebooting / reinstalling OS on xhgui1001
* 17:25 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:25 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:21 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:13 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:13 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:12 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:11 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:24 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:22 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 16:22 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:20 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 15:32 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 13:52 elukey: update cr1/cr2-eqiad's analytics filters (ref: https://gerrit.wikimedia.org/r/c/operations/homer/public/+/617649/)
* 13:51 moritzm: installing cups security updates (client-side tools/libs only)
* 13:20 moritzm: installing openjpeg2 security updates
* 13:04 kormat: proudly uploaded version 0.1 of python3-wmfmariadbpy + wmfmariadbpy
* 11:55 moritzm: installing mercurial security updates
* 11:21 jynus: restart dbstore1004
* 11:19 moritzm: installing ffmpeg security updates for jessie (standard version from security.debian.org, not the VP9-enabled component)
* 11:16 moritzm: imported ffmpeg 3.2.15-0+deb9u1+wmf1 to component/vp9 for stretch-wikimedia [[phab:T259336|T259336]]
* 07:51 moritzm: updating lilypond on mw* servers
* 07:50 moritzm: uploaded lilypond 2.19.81+really-2.18.2-13~bpo9+1+wmf1 to stretch-wikimedia [[phab:T256877|T256877]]
* 07:07 elukey: stop mysql replication on db1108; update port config for mysql instances and restart them; restart replication on instances
* 06:32 elukey: roll restart of druid brokers on druid100[4-8] to pick up new changes
* 06:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:26 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 05:59 moritzm: installing qemu updates on stretch
* 04:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:54 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 04:22 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:20 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 03:55 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 03:53 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 03:14 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 03:12 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 02:57 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 02:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 00:47 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: disable lilypond execution again (duration: 01m 10s)
* 00:07 catrope@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/Echo/modules/mobile/notificationsFilterOverlay.js: [[phab:T258954|T258954]] (duration: 01m 06s)
* 00:06 catrope@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Echo/modules/mobile/notificationsFilterOverlay.js: [[phab:T258954|T258954]] (duration: 01m 10s)
* 00:00 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2015.codfw.wmnet
 
== 2020-07-30 ==
* 23:56 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:56 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:59 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2014.codfw.wmnet
* 22:34 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:34 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:27 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:12 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.36.0-wmf.1
* 21:52 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.2
* 21:52 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:51 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:51 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:51 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:45 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2013.codfw.wmnet
* 21:41 mutante: revoking and resigning puppet cert for xhgui2001.codfw.wmnet [[phab:T259206|T259206]]
* 21:40 catrope@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/GrowthExperiments/: [[phab:T258609|T258609]] (duration: 01m 06s)
* 21:40 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:40 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:39 catrope@deploy1001: Synchronized php-1.36.0-wmf.2/skins/MinervaNeue/: [[phab:T258939|T258939]] (duration: 01m 08s)
* 21:30 mutante: reinstalling xhgui2001
* 21:27 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:27 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:10 ryankemper@deploy1001: Finished deploy [wdqs/wdqs@e797cf0]: 0.3.42 (duration: 24m 41s)
* 20:45 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2012.codfw.wmnet
* 20:45 ryankemper@deploy1001: Started deploy [wdqs/wdqs@e797cf0]: 0.3.42
* 20:23 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:23 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:09 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:09 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:08 mutante: [wtp2012:~] $ sudo rm -rf /srv/deployment/parsoid/deploy-cache
* 19:24 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2011.codfw.wmnet
* 19:13 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.36.0-wmf.1
* 19:10 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.2
* 19:01 mforns@deploy1001: Finished deploy [analytics/refinery@adb0d09]: Regular analytics weekly train [analytics/refinery@adb0d09b6584a7a26143623cf6173ae8983423e3] (duration: 10m 41s)
* 18:59 mutante: imported twig (php-twig) into APT repo
* 18:50 mforns@deploy1001: Started deploy [analytics/refinery@adb0d09]: Regular analytics weekly train [analytics/refinery@adb0d09b6584a7a26143623cf6173ae8983423e3]
* 18:34 Urbanecm: Morning B&C done
* 18:32 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/Kartographer/modules/box/Map.js: {{Gerrit|aa3dbd54f8e422a511e55b5efba6b5f48253dbe7}}: Disable panning and zooming until ready ([[phab:T257872|T257872]]) (duration: 01m 06s)
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 617516: Add import sources for yuewiktionary {{!}} https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/617516; 617518: Fix definition of yuewiktionary import sources {{!}} https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/617518 # [[phab:T258913|T258913]] (duration: 01m 06s)
* 18:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:08 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|14ef2ec2956fd8f66be7efb3c5978ac0eda7ce97}}: sysop_itwiki: Set favicon to Wikimedia_logo_blue.svg ([[phab:T259243|T259243]]; 2/2) (duration: 01m 06s)
* 18:04 urbanecm@deploy1001: Synchronized static/favicon/wmf-blue.ico: {{Gerrit|14ef2ec2956fd8f66be7efb3c5978ac0eda7ce97}}: sysop_itwiki: Set favicon to Wikimedia_logo_blue.svg ([[phab:T259243|T259243]]; 1/2) (duration: 01m 06s)
* 17:04 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@d3ab874]: airflow: refinery_drop_hive_partitions: Fix kerberos token passing (duration: 00m 55s)
* 17:03 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@d3ab874]: airflow: refinery_drop_hive_partitions: Fix kerberos token passing
* 16:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:44 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:44 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:44 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:44 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:41 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:41 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:41 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:40 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:40 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:38 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 14:30 moritzm: installing squid security updates
* 14:04 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:50 volans@cumin1001: START - Cookbook sre.dns.netbox
* 13:47 vgutierrez: upgrade acme-chief to version 0.27 - [[phab:T255249|T255249]]
* 13:47 vgutierrez: upload acme-chief 0.27 to apt.wm.o (buster) - [[phab:T255249|T255249]]
* 13:46 moritzm: installing qemu security updates on Buster
* 13:02 jayme: imported chartmuseum_0.12.0-3 to buster-wikimedia
* 12:07 elukey: upgrade of the druid public cluster (serving AQS) from 0.12.3 to 0.19
* 11:53 urbanecm@deploy1001: Synchronized static/favicon/: {{Gerrit|c08f774b9b05cb9c5faf692c59dd45bf5d65b557}}: Revert "sysop_itwiki: Set favicon to Wikimedia_logo_blue.svg" ([[phab:T259243|T259243]]) (duration: 01m 06s)
* 11:48 urbanecm@deploy1001: Synchronized static/favicon/wmf-blue.ico: {{Gerrit|399e9c5d949ade5f574ef965e05288b7253c3c3e}}: sysop_itwiki: Set favicon to Wikimedia_logo_blue.svg ([[phab:T259243|T259243]]; 1/2) (duration: 01m 06s)
* 11:46 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|fc48441f26afc6f4e97c5f7e96185d04cacb0f4b}}: Add import sources to sysop_itwiki ([[phab:T259243|T259243]]) (duration: 01m 08s)
* 11:44 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|fc5de151ee2c9cf5c64a7d13b2e65e39bb349296}}: ClosedWikiProvider: Do not run when $wmgUseCentralAuth is false ([[phab:T259246|T259246]]) (duration: 01m 07s)
* 11:42 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7aa0c2361a1e0e363700d54a109b81497b5a045b}}: sysop_itwiki: Add several pages to wgWhitelistRead ([[phab:T259243|T259243]]) (duration: 01m 06s)
* 11:39 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|5ea4bc87ee3b5ed1ef4f7cf2b1068e678f6eb42c}}: sysop_itwiki: Add WP as an alias for NS_PROJECT ([[phab:T259243|T259243]]) (duration: 01m 08s)
* 10:49 liw@deploy1001: Synchronized php: group1 wikis to 1.36.0-wmf.2 (duration: 01m 07s)
* 10:48 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.2
* 10:38 marostegui: Reload haproxy on dbproxy1013 and dbproxy1015
* 08:43 godog: flip smokeping/librenms from netmon2001 to netmon1002 - [[phab:T247967|T247967]]
* 07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:32 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 07:31 filippo@cumin1001: START - Cookbook sre.hosts.decommission
* 07:22 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:22 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1121', diff saved to https://phabricator.wikimedia.org/P12129 and previous config saved to /var/cache/conftool/dbconfig/20200730-071633-marostegui.json
* 06:57 elukey: upload druid_0.19.0-1 packages to buster-wikimedia
* 05:26 marostegui: Deploy MCR schema change on labswiki (wikitech) [[phab:T238966|T238966]]
* 02:18 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2010.codfw.wmnet
* 01:53 dpifke@deploy1001: Finished deploy [performance/arc-lamp@ad87f69]: Deploying https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/615302 (duration: 00m 05s)
* 01:53 dpifke@deploy1001: Started deploy [performance/arc-lamp@ad87f69]: Deploying https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/615302
* 01:37 eileen: civicrm revision changed from {{Gerrit|cc5d17fbaf}} to {{Gerrit|150c3476c4}}, config revision is {{Gerrit|b6ece03513}}
* 01:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 01:33 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 01:30 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2009.codfw.wmnet
* 01:22 mutante: imported in apt.wikimedia.org for buster: php-slim, php-slim-views, php-perftools-xhgui-collector, php-pimple, php-psr-http-server-middleware, php-psr-http-server-handler, xhgui
* 01:07 mholloway-shell@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/JsonConfig: Backport: Implement GetContentModels hook ([[phab:T259126|T259126]]) (duration: 01m 07s)
* 00:52 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:52 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 00:23 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 00:23 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
 
== 2020-07-29 ==
* 23:51 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:51 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:41 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2008.codfw.wmnet
* 23:28 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:28 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:09 Urbanecm: Run mwscript namespaceDupes.php --wiki=mswiktionary --fix ([[phab:T255391|T255391]])
* 23:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|396a395c79c606cb7deeb7906fefc7f16e63fa4f}}: Add several extra namespaces for mswiktionary ([[phab:T255391|T255391]]) (duration: 01m 07s)
* 22:46 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2006.codfw.wmnet
* 22:45 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2007.codfw.wmnet
* 22:00 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:00 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:22 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:15 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:15 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:00 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 21:00 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:35 crusnov@deploy1001: Finished deploy [netbox/deploy@fde9dfe]: Test deploy of 2.8.8 to netbox-next pt2 (duration: 00m 05s)
* 20:35 crusnov@deploy1001: Started deploy [netbox/deploy@fde9dfe]: Test deploy of 2.8.8 to netbox-next pt2
* 20:35 crusnov@deploy1001: Finished deploy [netbox/deploy@fde9dfe]: Test deploy of 2.8.8 to netbox-next (duration: 01m 12s)
* 20:34 crusnov@deploy1001: Started deploy [netbox/deploy@fde9dfe]: Test deploy of 2.8.8 to netbox-next
* 20:19 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2004.codfw.wmnet
* 19:45 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:44 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:43 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:41 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:41 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:41 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:29 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:27 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:20 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:20 herron@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 19:19 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:18 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:18 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:17 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:17 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:04 qchris: Restarting Gerrit on gerrit2001 (gerrit-replica) to make security fix effective.
* 19:04 qchris@deploy1001: Finished deploy [gerrit/gerrit@9275b30]: Gerrit to v3.2.3-1-g185bdc3a69 on gerrit2001 (duration: 00m 09s)
* 19:03 qchris@deploy1001: Started deploy [gerrit/gerrit@9275b30]: Gerrit to v3.2.3-1-g185bdc3a69 on gerrit2001
* 19:00 qchris: Restarting Gerrit on gerrit1001 to make security fix effective.
* 19:00 qchris@deploy1001: Finished deploy [gerrit/gerrit@9275b30]: Gerrit to v3.2.3-1-g185bdc3a69 on gerrit1001 (duration: 00m 08s)
* 19:00 qchris@deploy1001: Started deploy [gerrit/gerrit@9275b30]: Gerrit to v3.2.3-1-g185bdc3a69 on gerrit1001
* 18:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:44 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:42 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:42 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:39 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:39 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:36 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:36 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:32 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:32 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:13 Urbanecm: Morning B&C window is done
* 18:13 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/DiscussionTools/: {{Gerrit|00ecec80d12a34977d55dd09bce0c5a1aab369f9}}: Revert new reply API for now ([[phab:T252558|T252558]]) (duration: 01m 06s)
* 18:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d54f041be6508b641eec08e25287d280374cc863}}: Enable Translate extension at plwikimedia ([[phab:T259087|T259087]]) (duration: 01m 08s)
* 18:07 urbanecm@deploy1001: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|a237f5b40c3662c0f08398abeeaadba61d7462f8}}: Move VisualEditor from beta to default on enwikiversity ([[phab:T258992|T258992]]) (duration: 01m 06s)
* 18:05 Urbanecm: Create tables for Translate extension in plwikimedia ([[phab:T259087|T259087]])
* 18:05 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:03 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:03 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:01 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:01 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:01 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:50 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2003.codfw.wmnet
* 17:50 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2002.codfw.wmnet
* 17:21 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:19 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:45 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:45 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:43 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:43 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:43 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:25 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:24 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:19 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:18 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:16 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:16 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:15 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:15 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 16:02 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:48 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 617167: Revert "Set muswiki to read only" {{!}} https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/617167 ([[phab:T259004|T259004]]) (duration: 01m 06s)
* 15:44 volans@cumin1001: START - Cookbook sre.dns.netbox
* 15:33 liw@deploy1001: rebuilt and synchronized wikiversions files: Revert "group[0{{!}}1] wikis to 1.36.0-wmf.1"
* 15:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 617152: Set muswiki to read only {{!}} https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/617152 ([[phab:T259004|T259004]]) (duration: 01m 08s)
* 15:10 jayme: imported docker-report_0.0.8-1 to buster-wikimedia
* 14:49 moritzm: installing ruby-json security updates
* 14:34 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:30 jbond42: install curl security update for jessie
* 14:29 moritzm: installing exiv2 security updates
* 14:27 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 13:55 volans: migrating *all* codfw mgmt DNS records to the autogenerated ones via Netbox - [[phab:T233183|T233183]]
* 13:50 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:45 volans@cumin1001: START - Cookbook sre.dns.netbox
* 13:29 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp2001.codfw.wmnet
* 13:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:05 liw@deploy1001: Synchronized php: group1 wikis to 1.36.0-wmf.2 (duration: 01m 07s)
* 13:04 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.2
* 13:03 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime
* 12:58 volans@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:56 volans@cumin1001: START - Cookbook sre.dns.netbox
* 12:49 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:48 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:44 moritzm: imported curl 7.38.0-4+deb8u16+wmf1 to apt.wikimedia.org (jessie-wikimedia) [[phab:T259102|T259102]]
* 12:30 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 21s)
* 12:28 urbanecm@deploy1001: Synchronized langlist: Creating avkwiki ([[phab:T257943|T257943]]) (duration: 01m 05s)
* 12:27 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating avkwiki ([[phab:T257943|T257943]]) (duration: 01m 03s)
* 12:26 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating avkwiki ([[phab:T257943|T257943]]) (duration: 01m 06s)
* 12:24 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating avkwiki ([[phab:T257943|T257943]])
* 12:15 urbanecm@deploy1001: Synchronized dblists: Creating avkwiki ([[phab:T257943|T257943]]) (duration: 01m 06s)
* 12:14 urbanecm@deploy1001: Synchronized wmf-config/db-codfw.php: Creating avkwiki ([[phab:T257943|T257943]]) (duration: 01m 06s)
* 12:12 urbanecm@deploy1001: Synchronized wmf-config/db-eqiad.php: Creating avkwiki ([[phab:T257943|T257943]]) (duration: 01m 05s)
* 12:09 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:07 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:07 moritzm: rebooting idp2001 for kernel update
* 11:41 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|252bb6c1bf83d96a14a0ef63e06eb544eef8a00b}}: Add Wikipedia wordmark for trwiki ([[phab:T255489|T255489]]; sync 2/2) (duration: 01m 05s)
* 11:39 urbanecm@deploy1001: Synchronized static/images/mobile/copyright/wikipedia-wordmark-tr.svg: {{Gerrit|252bb6c1bf83d96a14a0ef63e06eb544eef8a00b}}: Add Wikipedia wordmark for trwiki ([[phab:T255489|T255489]]; sync 1/2) (duration: 01m 06s)
* 11:36 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9f7e03292941d0d782437862f406efa7e1c6463e}}: Fix overindentation (duration: 01m 08s)
* 11:11 Lucas_WMDE: EU B&C window done
* 11:09 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/project-logos/%s\n' 'wuuwiki.png' 'wuuwiki-1.5x.png' 'wuuwiki-2x.png' {{!}} mwscript purgeList.php # [[phab:T259005|T259005]]
* 11:08 lucaswerkmeister-wmde@deploy1001: Synchronized static/images/project-logos/: Config: [[gerrit:616760{{!}}Change the logo for Wu Wikipedia (T259005)]] (duration: 01m 08s)
* 10:40 vgutierrez: rolling upgrade of ATS to version 8.0.8-1wm2
* 10:21 tstarling@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Score/includes/Score.php: do not offer .ly downloads (duration: 01m 07s)
* 10:19 tstarling@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Score/extension.json: do not offer .ly downloads (duration: 01m 20s)
* 10:12 vgutierrez: upgrade ATS to version 8.0.8-1wm2 on cp3064 and cp3065
* 09:44 vgutierrez: upgrade ATS to version 8.0.8-1wm2 on cp5006 and cp5012
* 09:20 jayme@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 09:20 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 09:16 vgutierrez: upgrade ATS to version 8.0.8-1wm2 on cp4026 and cp4032
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1112', diff saved to https://phabricator.wikimedia.org/P12115 and previous config saved to /var/cache/conftool/dbconfig/20200729-091528-marostegui.json
* 09:15 vgutierrez: upload trafficserver 8.0.8-1wm2 to apt.wm.o (buster)
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P12114 and previous config saved to /var/cache/conftool/dbconfig/20200729-091319-marostegui.json
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P12113 and previous config saved to /var/cache/conftool/dbconfig/20200729-091006-marostegui.json
* 08:55 marostegui: Correction: the 08:55 dbctl commit logged as 'Slowly repool db1121' was for db1112
* 08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1121', diff saved to https://phabricator.wikimedia.org/P12112 and previous config saved to /var/cache/conftool/dbconfig/20200729-085504-marostegui.json
* 08:42 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp2001.codfw.wmnet
* 08:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:24 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 08:05 marostegui: Deploy MCR schema change on db1121 (lag will show up on s4), also remove triggers on db1124:3314
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P12111 and previous config saved to /var/cache/conftool/dbconfig/20200729-080442-marostegui.json
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1141', diff saved to https://phabricator.wikimedia.org/P12110 and previous config saved to /var/cache/conftool/dbconfig/20200729-080318-marostegui.json
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P12109 and previous config saved to /var/cache/conftool/dbconfig/20200729-075558-marostegui.json
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P12108 and previous config saved to /var/cache/conftool/dbconfig/20200729-074828-marostegui.json
* 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1141', diff saved to https://phabricator.wikimedia.org/P12107 and previous config saved to /var/cache/conftool/dbconfig/20200729-074414-marostegui.json
* 06:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 06:26 XioNoX: standardize mr1-eqiad interfaces
* 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P12106 and previous config saved to /var/cache/conftool/dbconfig/20200729-062224-marostegui.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1078', diff saved to https://phabricator.wikimedia.org/P12105 and previous config saved to /var/cache/conftool/dbconfig/20200729-062009-marostegui.json
* 06:16 XioNoX: standardize mr1-codfw interfaces
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P12104 and previous config saved to /var/cache/conftool/dbconfig/20200729-061450-marostegui.json
* 06:05 XioNoX: standardize mr1-ulsfo interfaces
* 06:01 legoktm: ssh doc1001.eqiad.wmnet sudo -u doc-uploader git -C /srv/docroot pull
* 05:52 XioNoX: standardize mr1-eqsin interfaces
* 05:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 05:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078', diff saved to https://phabricator.wikimedia.org/P12103 and previous config saved to /var/cache/conftool/dbconfig/20200729-050346-marostegui.json
* 05:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1141', diff saved to https://phabricator.wikimedia.org/P12102 and previous config saved to /var/cache/conftool/dbconfig/20200729-050247-marostegui.json
* 05:02 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1142', diff saved to https://phabricator.wikimedia.org/P12101 and previous config saved to /var/cache/conftool/dbconfig/20200729-050204-marostegui.json
* 04:58 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1142', diff saved to https://phabricator.wikimedia.org/P12100 and previous config saved to /var/cache/conftool/dbconfig/20200729-045859-marostegui.json
* 02:19 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: re-enable lilypond in safe mode (duration: 01m 09s)
* 01:47 tstarling@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/Score/includes/Score.php: work around firejail bug (duration: 01m 07s)
* 01:45 tstarling@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Score/includes/Score.php: work around firejail bug (duration: 01m 08s)
* 01:19 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1048.eqiad.wmnet
* 01:15 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1047.eqiad.wmnet
* 00:53 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1046.eqiad.wmnet
* 00:48 ryankemper: sudo -E cumin -b 10 'A:wdqs-all' 'sudo run-puppet-agent'
* 00:18 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:14 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
 
== 2020-07-28 ==
* 23:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:37 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrus: reduce mlr window size on enwiki (duration: 01m 05s)
* 23:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:34 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: cirrus: reduce mlr window size on enwiki (duration: 01m 06s)
* 23:31 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:29 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:04 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove unused setting $wgGEHomepageSuggestedEditsNewAccountInitiatedPercentage (no-op) (duration: 01m 06s)
* 22:55 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=wtp1046.eqiad.wmnet
* 22:19 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1044.eqiad.wmnet
* 21:27 dancy@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 21:24 dancy@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 21:17 dancy@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 20:38 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:36 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:02 eileen: process-control config revision is {{Gerrit|b6ece03513}}
* 19:50 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:48 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 19:48 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 19:25 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 19:25 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 19:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 19:24 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 19:24 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 19:23 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1142', diff saved to https://phabricator.wikimedia.org/P12097 and previous config saved to /var/cache/conftool/dbconfig/20200728-191926-marostegui.json
* 19:12 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1147', diff saved to https://phabricator.wikimedia.org/P12096 and previous config saved to /var/cache/conftool/dbconfig/20200728-191237-marostegui.json
* 19:11 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@69bbbbb]: airflow: drop_old_data_daily: top_queries table renamed to fulltext_head_queries (duration: 00m 53s)
* 19:11 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@69bbbbb]: airflow: drop_old_data_daily: top_queries table renamed to fulltext_head_queries
* 19:09 jhuneidi@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 19:09 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1147', diff saved to https://phabricator.wikimedia.org/P12095 and previous config saved to /var/cache/conftool/dbconfig/20200728-190933-marostegui.json
* 19:06 jhuneidi@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 19:05 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1147', diff saved to https://phabricator.wikimedia.org/P12094 and previous config saved to /var/cache/conftool/dbconfig/20200728-190517-marostegui.json
* 19:03 jhuneidi@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 19:01 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1147', diff saved to https://phabricator.wikimedia.org/P12093 and previous config saved to /var/cache/conftool/dbconfig/20200728-190137-marostegui.json
* 18:35 cdanis: ✔️ cdanis@lvs1015.eqiad.wmnet ~ 🕝☕ sudo ipvsadm -D -t 10.2.2.51:9283
* 18:29 cdanis: ❌cdanis@lvs1016.eqiad.wmnet ~ 🕝☕ sudo ipvsadm -D -t 10.2.2.51:9283
* 18:29 catrope@deploy1001: Synchronized php-1.36.0-wmf.2/extensions/GrowthExperiments/extension.json: Fix reference to MentorChangeLogFormatter ([[phab:T259041|T259041]]) (duration: 01m 05s)
* 18:20 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: No-op sync for wmgUseWikimediaApiPortal and wmgUseWikimediaApiPortalOAuth (2 of 2) (duration: 00m 58s)
* 18:17 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: No-op sync for wmgUseWikimediaApiPortal and wmgUseWikimediaApiPortalOAuth (1 of 2) (duration: 01m 05s)
* 18:16 cdanis: primary pybal restart ✔️ cdanis@lvs1015.eqiad.wmnet ~ 🕑☕ sudo systemctl restart pybal.service
* 18:14 cdanis: backup pybal restart: ✔️ cdanis@lvs1016.eqiad.wmnet ~ 🕑☕ sudo systemctl restart pybal.service
* 18:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:05 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:05 catrope@deploy1001: Synchronized php-1.36.0-wmf.2/includes/libs/filebackend/SwiftFileBackend.php: Fix index error in SwiftFileBackend ([[phab:T259023|T259023]]) (duration: 01m 07s)
* 17:46 volker-e@deploy1001: Finished deploy [design/style-guide@e3fda83]: Deploy design/style-guide:  (duration: 00m 05s)
* 17:46 volker-e@deploy1001: Started deploy [design/style-guide@e3fda83]: Deploy design/style-guide:
* 17:41 volans: run apt-get clean on wtp[1046,1048].eqiad.wmnet and wtp2001.codfw.wmnet to free ~2GB as they were 100% full - [[phab:T258775|T258775]]
* 17:33 XioNoX: standardize mr1-esams interfaces
* 17:30 brennen@deploy1001: sync aborted: (no justification provided) (duration: 28m 53s)
* 17:03 brennen: prior scap sync for https://gerrit.wikimedia.org/r/c/mediawiki/core/+/616842 ([[phab:T259023|T259023]])
* 17:02 brennen@deploy1001: Started scap: (no justification provided)
* 16:51 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@0982d4e]: convert_to_esbulk: repair variable ref before assign (duration: 04m 33s)
* 16:49 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:47 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 16:47 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@0982d4e]: convert_to_esbulk: repair variable ref before assign
* 16:45 XioNoX: remove mr1-codfw source NAT (not used)
* 16:43 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1045.eqiad.wmnet
* 16:39 ppchelko@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 16:36 ppchelko@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 16:34 ppchelko@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 16:33 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1035.eqiad.wmnet
* 16:32 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1034.eqiad.wmnet
* 16:31 XioNoX: mr1-eqiad# delete security nat source rule-set mgmt-to-untrust (unused, no matching ACL)
* 16:25 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:23 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 16:21 hnowlan: imported envoyproxy 1.15.0-1 deb into component/envoy-future for buster-wikimedia
* 16:11 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1042.eqiad.wmnet
* 16:09 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1043.eqiad.wmnet
* 15:54 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:52 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:51 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 15:50 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:48 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 15:48 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 15:45 jayme@cumin1001: conftool action : set/pooled=no; selector: name=wtp1035.*
* 15:44 jayme@cumin1001: conftool action : set/pooled=no; selector: name=wtp1034.*
* 15:35 ayounsi@deploy1001: Finished deploy [homer/deploy@5e999c8]: once more (duration: 03m 06s)
* 15:32 ayounsi@deploy1001: Started deploy [homer/deploy@5e999c8]: once more
* 15:32 ayounsi@deploy1001: Finished deploy [homer/deploy@5e999c8]: CR613642 (duration: 03m 38s)
* 15:31 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1045.eqiad.wmnet
* 15:30 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1041.eqiad.wmnet
* 15:30 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1044.eqiad.wmnet
* 15:29 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1039.eqiad.wmnet
* 15:28 ayounsi@deploy1001: Started deploy [homer/deploy@5e999c8]: CR613642
* 15:19 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:17 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 15:16 mholloway-shell@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .
* 15:15 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:14 mholloway-shell@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' .
* 15:13 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 15:11 mholloway-shell@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 15:08 ayounsi@deploy1001: Finished deploy [homer/deploy@fcf4332]: CR613642 (duration: 02m 14s)
* 15:06 ayounsi@deploy1001: Started deploy [homer/deploy@fcf4332]: CR613642
* 15:01 ayounsi@deploy1001: Finished deploy [homer/deploy@fcf4332]: CR613642 (duration: 00m 11s)
* 15:01 ayounsi@deploy1001: Started deploy [homer/deploy@fcf4332]: CR613642
* 14:58 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1043.eqiad.wmnet
* 14:58 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1040.eqiad.wmnet
* 14:57 mholloway-shell@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .
* 14:55 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1042.eqiad.wmnet
* 14:54 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1038.eqiad.wmnet
* 14:52 mholloway-shell@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' .
* 14:48 mholloway-shell@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 14:23 herron: bounced centrallog rsyslog services in codfw/eqiad
* 14:19 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:15 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1147', diff saved to https://phabricator.wikimedia.org/P12087 and previous config saved to /var/cache/conftool/dbconfig/20200728-140313-marostegui.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1148', diff saved to https://phabricator.wikimedia.org/P12086 and previous config saved to /var/cache/conftool/dbconfig/20200728-140249-marostegui.json
* 14:02 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1148', diff saved to https://phabricator.wikimedia.org/P12085 and previous config saved to /var/cache/conftool/dbconfig/20200728-140220-marostegui.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1148', diff saved to https://phabricator.wikimedia.org/P12084 and previous config saved to /var/cache/conftool/dbconfig/20200728-140207-marostegui.json
* 14:00 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:59 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 13:58 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:58 moritzm: installing perl security updates
* 13:56 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 13:56 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 13:55 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1041.eqiad.wmnet
* 13:55 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1037.eqiad.wmnet
* 13:50 godog: remove stale ipvs thanos-query service on port 80
* 13:39 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1040.eqiad.wmnet
* 13:38 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1036.eqiad.wmnet
* 13:38 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1039.eqiad.wmnet
* 13:37 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1035.eqiad.wmnet
* 13:37 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1038.eqiad.wmnet
* 13:36 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1034.eqiad.wmnet
* 13:29 godog: roll-restart pybal on eqiad lvs low-traffic to change port for thanos-query
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075', diff saved to https://phabricator.wikimedia.org/P12083 and previous config saved to /var/cache/conftool/dbconfig/20200728-132520-marostegui.json
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075 with less weight', diff saved to https://phabricator.wikimedia.org/P12082 and previous config saved to /var/cache/conftool/dbconfig/20200728-132023-marostegui.json
* 13:09 godog: roll-restart pybal on lvs low-traffic to apply thanos-query changes
* 13:04 XioNoX: standardize cr3-esams interfaces
* 13:03 liw@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.2
* 12:41 XioNoX: standardize cr2-esams interfaces
* 12:38 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:36 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 12:33 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075', diff saved to https://phabricator.wikimedia.org/P12081 and previous config saved to /var/cache/conftool/dbconfig/20200728-123201-marostegui.json
* 12:28 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:26 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 12:26 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:24 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 12:17 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1037.eqiad.wmnet
* 12:14 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1036.eqiad.wmnet
* 12:08 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1035.eqiad.wmnet
* 12:07 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1032.eqiad.wmnet
* 12:07 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1033.eqiad.wmnet
* 12:05 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1031.eqiad.wmnet
* 12:04 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1034.eqiad.wmnet
* 12:04 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: disabling lilypond rendering in Score again due to error running gs (duration: 01m 05s)
* 11:56 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: re-enabling Score in safe mode (duration: 01m 04s)
* 11:50 Urbanecm: EU B&C window done
* 11:49 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1a5672628b82709350ca74bb784197e7ff5fdc19}}: Add Turkish powered by MW and Wikimedia project icons ([[phab:T257732|T257732]]) (duration: 00m 59s)
* 11:46 urbanecm@deploy1001: Synchronized static/images/footer/: {{Gerrit|1a5672628b82709350ca74bb784197e7ff5fdc19}}: Add Turkish powered by MW and Wikimedia project icons ([[phab:T257732|T257732]]) (duration: 01m 01s)
* 11:43 urbanecm@deploy1001: Synchronized static/images: {{Gerrit|df9b9acf0876dad9b11d5641fe6fa174c7066f8b}}: Move footer logos to /static/images/footer ([[phab:T257732|T257732]]) (duration: 01m 02s)
* 11:38 marostegui: Deploy schema change on s3 codfw, this will generate lag on codfw [[phab:T256682|T256682]]
* 11:38 ema: A:cp-text varnish ban pt.wikiversity.org [[phab:T256750|T256750]]
* 11:37 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|df9b9acf0876dad9b11d5641fe6fa174c7066f8b}}: Move footer logos to /static/images/footer ([[phab:T257732|T257732]]) (duration: 00m 58s)
* 11:36 ema: A:cp-text varnish ban fr.wiktionary.org [[phab:T256750|T256750]]
* 11:35 urbanecm@deploy1001: Synchronized static/images/footer: {{Gerrit|df9b9acf0876dad9b11d5641fe6fa174c7066f8b}}: Move footer logos to /static/images/footer ([[phab:T257732|T257732]]) (duration: 01m 05s)
* 11:34 ema: A:cp-text varnish ban eu.wikipedia.org [[phab:T256750|T256750]]
* 11:32 ema: A:cp-text varnish ban he.wikipedia.org [[phab:T256750|T256750]]
* 11:30 marostegui: Deploy MCR change on db1143, db1148, db1146:3314
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1148 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12079 and previous config saved to /var/cache/conftool/dbconfig/20200728-113009-marostegui.json
* 11:30 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|04c7ef94bb7901668f2a8df3289b6a59d42f0a7e}}: Undeploy graphoid for phase 2 wikis ([[phab:T258463|T258463]]) (duration: 01m 00s)
* 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1143', diff saved to https://phabricator.wikimedia.org/P12078 and previous config saved to /var/cache/conftool/dbconfig/20200728-112850-marostegui.json
* 11:25 ema: A:cp-text varnish ban fa.wikipedia.org [[phab:T256750|T256750]]
* 11:21 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [cirrus] use more neutral config var names (duration: 01m 06s)
* 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P12077 and previous config saved to /var/cache/conftool/dbconfig/20200728-112046-marostegui.json
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P12076 and previous config saved to /var/cache/conftool/dbconfig/20200728-111522-marostegui.json
* 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1143', diff saved to https://phabricator.wikimedia.org/P12075 and previous config saved to /var/cache/conftool/dbconfig/20200728-111226-marostegui.json
* 11:11 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:10 jdrewniak@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:614890{{!}}desktop improvements by default for testing group (round 2) (T254227)]] (duration: 01m 06s)
* 11:09 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 11:09 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:07 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 10:56 hashar@deploy1001: Finished deploy [integration/docroot@ba85bdf]: Catch up with HEAD and support DOCUMENT_ROOT being a symbolic link for [[phab:T149924|T149924]] (duration: 00m 06s)
* 10:56 hashar@deploy1001: Started deploy [integration/docroot@ba85bdf]: Catch up with HEAD and support DOCUMENT_ROOT being a symbolic link for [[phab:T149924|T149924]]
* 10:55 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:53 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 10:50 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1033.eqiad.wmnet
* 10:48 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1030.eqiad.wmnet
* 10:48 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1029.eqiad.wmnet
* 10:47 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1032.eqiad.wmnet
* 10:47 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1028.eqiad.wmnet
* 10:33 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1031.eqiad.wmnet
* 10:32 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1027.eqiad.wmnet
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1082', diff saved to https://phabricator.wikimedia.org/P12074 and previous config saved to /var/cache/conftool/dbconfig/20200728-102342-marostegui.json
* 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P12072 and previous config saved to /var/cache/conftool/dbconfig/20200728-100412-marostegui.json
* 09:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:57 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:55 XioNoX: standardize cr2-esams interfaces
* 09:52 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:50 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 09:49 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:47 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 09:45 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:43 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 09:40 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:38 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 09:35 moritzm: imported libmysqlclient18 to component/cloudera [[phab:T258768|T258768]]
* 09:31 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1030.eqiad.wmnet
* 09:28 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1029.eqiad.wmnet
* 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P12070 and previous config saved to /var/cache/conftool/dbconfig/20200728-092606-marostegui.json
* 09:24 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1028.eqiad.wmnet
* 09:19 XioNoX: standardize cr3-eqsin interfaces
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082', diff saved to https://phabricator.wikimedia.org/P12069 and previous config saved to /var/cache/conftool/dbconfig/20200728-091849-marostegui.json
* 09:18 jayme@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp1027.eqiad.wmnet
* 09:10 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1026.eqiad.wmnet
* 09:07 ema: cp3050: restart varnishmtail.service, stuck on "Condition(c->offset <= c->vtx->len) not true."
* 08:39 XioNoX: standardize cr2-eqsin interfaces
* 08:38 godog: temporary downgrade prometheus-snmp-exporter on netmon2001
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1143 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12067 and previous config saved to /var/cache/conftool/dbconfig/20200728-083336-marostegui.json
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P12066 and previous config saved to /var/cache/conftool/dbconfig/20200728-083209-marostegui.json
* 08:20 liw@deploy1001: Finished scap: testwikis wikis to 1.36.0-wmf.2 (duration: 53m 11s)
* 08:09 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:07 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 08:06 godog: failover librenms/smokeping to netmon2001 - [[phab:T247967|T247967]]
* 08:04 marostegui: Reduce labsdb1009 weight
* 07:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 jayme: depooled wtp1026.eqiad.wmnet for reimage
* 07:48 moritzm: switched superset to CAS
* 07:47 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 07:46 ayounsi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:43 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 07:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:31 jayme@cumin1001: conftool action : set/pooled=yes; selector: name=wtp1025.eqiad.wmnet
* 07:27 liw@deploy1001: Started scap: testwikis wikis to 1.36.0-wmf.2
* 07:03 liw: 1.36.0-wmf.2 was branched at {{Gerrit|04e863fdf3646ee6ed5c05b784f85c9f323e1f19}} for [[phab:T257970|T257970]]
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146:3314 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12065 and previous config saved to /var/cache/conftool/dbconfig/20200728-051928-marostegui.json
* 05:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1144:3314 and restore db1146:3314 original weight', diff saved to https://phabricator.wikimedia.org/P12064 and previous config saved to /var/cache/conftool/dbconfig/20200728-051813-marostegui.json
* 02:17 eileen: process-control config revision is {{Gerrit|6811ca294a}} - just delayed silverpop_daily a bit as clashing with dedupe
* 00:18 andrew@cumin1001: conftool action : set/pooled=inactive; selector: name=cloudcephmon1003.eqiad.wmnet
* 00:17 andrew@cumin1001: conftool action : set/pooled=no; selector: name=cloudcephmon1003.eqiad.wmnet
 
== 2020-07-27 ==
* 23:49 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@ac8e5d0]: airflow: head queries report, managed variables, refinery-drop-hive-partitions support (duration: 00m 54s)
* 23:48 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@ac8e5d0]: airflow: head queries report, managed variables, refinery-drop-hive-partitions support
* 23:28 mutante: otrs1001 - ran puppet (it was alerting in icinga that puppet failed, but it was neither disabled nor failing and changed nothing when it ran)
* 21:31 sbassett@deploy1001: Synchronized wmf-config/CommonSettings.php: Deployed CentralNotice CSP config change for [[phab:T258459|T258459]] (duration: 00m 57s)
* 21:10 sbassett: Deployed mitigations for [[phab:T238075|T238075]]
* 20:41 urbanecm@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/InterwikiSorting/: {{Gerrit|c5f6c97856a5dbe673064afd2804bebb9b787580}}: Use LanguageLinksHook to sort interwiki links ([[phab:T257625|T257625]]) (duration: 00m 59s)
* 19:50 jhuneidi@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 19:44 jhuneidi@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 19:36 jhuneidi@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 19:23 jhuneidi@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 19:19 jhuneidi@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 19:11 jhuneidi@deploy1001: helmfile [staging] Ran 'sync' command on namespace 'sessionstore' for release 'staging' .
* 19:06 jhuneidi@deploy1001: helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'production' .
* 19:00 jhuneidi@deploy1001: helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'production' .
* 18:57 urbanecm@deploy1001: sync-file aborted: {{Gerrit|3833b135caf4171daa0814eba81393b6c44db619}}: Move footer logos to /static/images/footer ([[phab:T257732|T257732]]) (duration: 00m 04s)
* 18:50 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|c6a9674366d9c8d273ce0e74dfb6a04c91d64307}}: Move footer logos to wmg* variables ([[phab:T257732|T257732]]) (duration: 00m 56s)
* 18:50 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 00m 57s)
* 18:49 urbanecm@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 01s)
* 18:49 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c6a9674366d9c8d273ce0e74dfb6a04c91d64307}}: Move footer logos to wmg* variables ([[phab:T257732|T257732]]) (duration: 00m 57s)
* 18:29 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable desktop web UI click tracking instrumentation on frwiki, hewiki, fawiki ([[phab:T258058|T258058]]) (duration: 00m 56s)
* 18:13 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove WPBSkinBlacklist ([[phab:T254675|T254675]]) (duration: 00m 57s)
* 17:42 liw@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.1
* 17:30 liw: promoting train to group2
* 17:14 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 17:14 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 17:14 dpifke@deploy1001: Finished deploy [performance/arc-lamp@f14888b]: Deploying arclamp-compress-logs ([[phab:T235456|T235456]]) (duration: 00m 05s)
* 17:14 dpifke@deploy1001: Started deploy [performance/arc-lamp@f14888b]: Deploying arclamp-compress-logs ([[phab:T235456|T235456]])
* 16:59 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephmon1002.eqiad.wmnet
* 16:58 andrew@cumin1001: conftool action : set/pooled=no; selector: name=cloudcephmon1002.eqiad.wmnet
* 16:57 andrew@cumin1001: conftool action : set/pooled=inactive; selector: name=cloudcephmon1002.eqiad.wmnet
* 16:50 andrew@cumin1001: conftool action : set/pooled=inactive; selector: name=cloudcephosd1003.eqiad.wmnet
* 16:50 andrew@cumin1001: conftool action : set/pooled=inactive; selector: name=cloudcephosd1002.eqiad.wmnet
* 16:50 andrew@cumin1001: conftool action : set/pooled=inactive; selector: name=cloudcephosd1001.eqiad.wmnet
* 16:50 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephmon1003.eqiad.wmnet
* 16:50 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephmon1002.eqiad.wmnet
* 16:49 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephmon1001.eqiad.wmnet
* 16:48 andrew@cumin1001: conftool action : set/pooled=no; selector: name=cloudcephosd1003.wikimedia.org
* 16:48 andrew@cumin1001: conftool action : set/pooled=no; selector: name=cloudcephosd1002.wikimedia.org
* 16:48 andrew@cumin1001: conftool action : set/pooled=no; selector: name=cloudcephosd1001.wikimedia.org
* 16:48 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephosd1001.eqiad.wmnet
* 16:48 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephosd1002.eqiad.wmnet
* 16:47 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cloudcephosd1003.eqiad.wmnet
* 16:44 andrew@cumin1001: conftool action : set/pooled=yes; selector: name=cumin1001.eqiad.wmnet
* 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2087:3316, db2087:3317 after on-site maintenance [[phab:T258587|T258587]]', diff saved to https://phabricator.wikimedia.org/P12063 and previous config saved to /var/cache/conftool/dbconfig/20200727-163311-marostegui.json
* 16:05 marostegui: Will show up on labsdb hosts for s5
* 16:04 marostegui: Stop MySQL on db1082 for onsite maintenance - [[phab:T258910|T258910]]
* 15:03 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:03 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:57 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Reduce db1146:3314 weight while db1144:3314 is depooled', diff saved to https://phabricator.wikimedia.org/P12060 and previous config saved to /var/cache/conftool/dbconfig/20200727-145010-marostegui.json
* 14:48 marostegui: Deploy MCR change on db1144:3314
* 14:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3314 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12059 and previous config saved to /var/cache/conftool/dbconfig/20200727-144807-marostegui.json
* 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1149', diff saved to https://phabricator.wikimedia.org/P12058 and previous config saved to /var/cache/conftool/dbconfig/20200727-144034-marostegui.json
* 14:21 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:19 XioNoX: standardize cr1-codfw interfaces
* 14:19 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 14:07 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:04 jayme@cumin1001: START - Cookbook sre.hosts.downtime
* 13:57 moritzm: upgrading idp2001 to CAS 6.1.7.1
* 13:19 XioNoX: standardize some cr2-esams interfaces
* 13:11 marostegui@cumin1001: dbctl commit (dc=all): 'More weight to db1089 in main traffic', diff saved to https://phabricator.wikimedia.org/P12057 and previous config saved to /var/cache/conftool/dbconfig/20200727-131123-marostegui.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3311 with normal weight and pool db1089 into vslow', diff saved to https://phabricator.wikimedia.org/P12056 and previous config saved to /var/cache/conftool/dbconfig/20200727-130954-marostegui.json
* 13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311', diff saved to https://phabricator.wikimedia.org/P12055 and previous config saved to /var/cache/conftool/dbconfig/20200727-130713-marostegui.json
* 13:03 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:01 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3311 with less weight', diff saved to https://phabricator.wikimedia.org/P12054 and previous config saved to /var/cache/conftool/dbconfig/20200727-125824-marostegui.json
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311', diff saved to https://phabricator.wikimedia.org/P12053 and previous config saved to /var/cache/conftool/dbconfig/20200727-125351-marostegui.json
* 12:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3311 with less weight', diff saved to https://phabricator.wikimedia.org/P12052 and previous config saved to /var/cache/conftool/dbconfig/20200727-125207-marostegui.json
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311', diff saved to https://phabricator.wikimedia.org/P12051 and previous config saved to /var/cache/conftool/dbconfig/20200727-125045-marostegui.json
* 12:41 marostegui: Compress innodb on db1106, this will generate lag on enwiki on labsdb hosts (wiki replicas) [[phab:T254462|T254462]]
* 12:38 moritzm: disable puppet on idp1001/2001
* 12:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 and pool db1105:3311 as vslow [[phab:T254462|T254462]]', diff saved to https://phabricator.wikimedia.org/P12050 and previous config saved to /var/cache/conftool/dbconfig/20200727-123833-marostegui.json
* 12:37 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=eqiad,service=mobileapps,name=scb1001.eqiad.wmnet
* 12:37 akosiaris@cumin1001: conftool action : set/weight=0; selector: dc=eqiad,service=mobileapps,name=scb1001.eqiad.wmnet
* 12:37 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,service=mobileapps,name=scb1001.eqiad.wmnet
* 12:37 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,service=mobileapps,name=scb1001.eqiad.wmnet
* 12:36 akosiaris@cumin1001: conftool action : set/weight=0; selector: dc=eqiad,service=mobileapps,name=scb1001.eqiad.wmnet
* 12:31 XioNoX: standardize cr2-codfw interfaces
* 12:28 volans@deploy1001: Finished deploy [debmonitor/deploy@25dbd20]: Release v0.2.7 (duration: 00m 27s)
* 12:28 volans@deploy1001: Started deploy [debmonitor/deploy@25dbd20]: Release v0.2.7
* 12:25 jbond42: upload new cas package to buster-wikimedia
* 12:25 jbond42: upload new cas package
* 12:23 ema: A:cp rolling varnish-frontend restart to actually discard old VCL still pointing at varnishcheck/check [[phab:T255015|T255015]] [[phab:T236754|T236754]]
* 12:21 moritzm: installing ruby-json security updates
* 12:16 moritzm: installing batik security updates
* 11:59 marostegui: Deploy MCR schema change on db1149
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1149 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12049 and previous config saved to /var/cache/conftool/dbconfig/20200727-115818-marostegui.json
* 11:57 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1138', diff saved to https://phabricator.wikimedia.org/P12048 and previous config saved to /var/cache/conftool/dbconfig/20200727-115739-marostegui.json
* 11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1138', diff saved to https://phabricator.wikimedia.org/P12047 and previous config saved to /var/cache/conftool/dbconfig/20200727-115258-marostegui.json
* 11:28 moritzm: installing an-tool1009 [[phab:T258768|T258768]]
* 10:54 ema: upload atskafka 0.10 to buster-wikimedia, upgrade cp3050 [[phab:T254317|T254317]]
* 10:46 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:616463{{!}} Bumping portals to master (616463)]] (duration: 01m 05s)
* 10:45 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:616463{{!}} Bumping portals to master (616463)]] (duration: 01m 10s)
* 10:33 XioNoX: make cr*-ulsfo interfaces netbox compliant
* 08:39 XioNoX: push "Add 185.71.138.0/24 to wikimedia4" to all routers
* 07:00 marostegui: Deploy schema change on s5 codfw [[phab:T256682|T256682]]
* 06:44 elukey: truncate big log file on an-launcher1002 that is filling up the /srv partition
* 06:36 elukey: apt-get clean on netbox1001 to free some space
* 05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1138 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12043 and previous config saved to /var/cache/conftool/dbconfig/20200727-051156-marostegui.json
* 05:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2087:3316, db2087:3317 for on-site maintenance [[phab:T258587|T258587]]', diff saved to https://phabricator.wikimedia.org/P12042 and previous config saved to /var/cache/conftool/dbconfig/20200727-050058-marostegui.json
* 04:58 marostegui: Stop MySQL on db2087 for on-site maintenance [[phab:T258587|T258587]]
 
== 2020-07-25 ==
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1096:3315 into s5 api after db1082 crashed [[phab:T258336|T258336]]', diff saved to https://phabricator.wikimedia.org/P12041 and previous config saved to /var/cache/conftool/dbconfig/20200725-124104-marostegui.json
* 09:16 oblivian@cumin1001: dbctl commit (dc=all): 'Depool db1082 [[phab:T258336|T258336]]', diff saved to https://phabricator.wikimedia.org/P12040 and previous config saved to /var/cache/conftool/dbconfig/20200725-091616-oblivian.json
* 01:52 mutante: ganeti - also removing (unmounted) disk 2 (100G) from webperf1002. [[phab:T257931|T257931]]
* 00:46 mutante: ganeti - removing disk 3 (20G) from webperf1002. the disks are 0-indexed, so the ones actually mounted are 0 (50G) and 1 (300G) ([[phab:T257931|T257931]])
* 00:42 dpifke: Manually compressing some more data on webperf1002, using arclamp-compress-logs from https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/615904.
 
== 2020-07-24 ==
* 23:00 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 20:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:06 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 19:57 dpifke: Manually gzipping some older ArcLamp data on webperf1002, to free up space and verify new compression support.
* 19:55 dpifke@deploy1001: Finished deploy [performance/arc-lamp@772b4a3]: Deploy CLs 611465 and 613740 to add compression support to ArcLamp (duration: 00m 05s)
* 19:55 dpifke@deploy1001: Started deploy [performance/arc-lamp@772b4a3]: Deploy CLs 611465 and 613740 to add compression support to ArcLamp
* 16:55 Amir1: deployment done
* 16:49 ladsgroup@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Wikibase/repo/includes/RepoHooks.php: [[gerrit:616032{{!}}Prevent onTitleGetRestrictionTypes changing ns0 protections]], Part II (duration: 01m 07s)
* 16:47 ladsgroup@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Wikibase/repo/includes/WikibaseRepo.php: [[gerrit:616032{{!}}Prevent onTitleGetRestrictionTypes changing ns0 protections]], Part I (duration: 01m 06s)
* 15:06 reedy@deploy1001: Finished scap: Score backports (duration: 36m 50s)
* 14:30 reedy@deploy1001: Started scap: Score backports
* 13:31 XioNoX: advertise 185.71.138.0/24 from AMS
* 13:17 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 13:00 ladsgroup@deploy1001: Synchronized php-1.36.0-wmf.1/includes/import/ImportableOldRevisionImporter.php: [[gerrit:616029{{!}}Import: use master DB for loading slots.]] ([[phab:T258666|T258666]]) (duration: 01m 07s)
* 12:34 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 12:04 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 11:48 hnowlan: bootstrapped restbase-dev1004-b
* 11:13 hnowlan: started bootstrap of restbase-dev1004-a
* 10:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:49 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 10:35 hnowlan: started reimage of restbase-dev1004
* 09:59 jmm@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 09:48 jmm@cumin1001: START - Cookbook sre.ganeti.makevm
* 08:40 kormat: restarting mariadb on all sanitarium hosts [[phab:T258711|T258711]]
* 08:35 akosiaris: start nagios-nrpe-server on kubernetes2002
* 07:44 elukey: depool wtp1025 - disk full
* 06:30 tstarling@deploy1001: Started scap: for Score
* 02:36 tstarling@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Score/includes/Score.php: removing superseded local patch for hard-coding lilypond version (duration: 01m 09s)
* 01:19 ejegg: updated payments-wiki from {{Gerrit|31a3de1130}} to {{Gerrit|c365c136d2}}
* 01:04 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 01:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 01:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 01:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 01:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 01:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 01:02 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 01:02 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 01:02 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 01:02 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 01:02 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 01:02 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 00:46 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:46 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:46 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 00:46 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 00:45 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:45 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:44 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 00:44 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 00:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:44 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 00:43 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:42 andrew@cumin1001: START - Cookbook sre.hosts.decommission
* 00:32 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 00:15 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 00:14 andrew@cumin1001: START - Cookbook sre.hosts.decommission
 
== 2020-07-23 ==
* 23:32 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:30 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:30 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:30 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 23:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 23:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 23:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 23:30 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:53 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:52 mutante: stashbot quadruple log test
* 22:51 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:51 jhuneidi@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'echostore' for release 'staging' .
* 22:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:51 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:51 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:50 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:50 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:50 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 22:21 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:18 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 21:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:45 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 21:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:27 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 21:21 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@c99c626]: airflow: centralize installation specific airflow Variables (duration: 00m 34s)
* 21:20 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@c99c626]: airflow: centralize installation specific airflow Variables
* 21:02 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:00 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:59 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 20:58 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:58 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:58 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:58 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 20:58 andrew@cumin1001: START - Cookbook sre.hosts.downtime
* 19:13 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 19:11 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 19:09 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 18:51 ryankemper: restarted blazegraph on codfw wdqs2001
* 18:44 ryankemper: Restarted blazegraph on following codfw wdqs nodes: 2007, 2003, and 2002
* 18:39 Amir1: BACC is done
* 18:29 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:613235{{!}}Load WikibaseClient from extension.json file instead of php one (T257437 T256228 T88258)]] (duration: 01m 05s)
* 18:21 mutante: testreduce1001 - rm -rf /srv/testreduce and run puppet to re-clone testreduce to it from the scandium branch ([[phab:T257906|T257906]])
* 18:13 ryankemper: restarted blazegraph on 2001
* 17:59 ryankemper: sudo -E cumin -b 10 'A:wdqs-all and not A:wdqs-test and not P<nowiki>{</nowiki>wdqs1003.eqiad.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>wdqs2001.codfw.wmnet<nowiki>}</nowiki>' 'sudo systemctl restart wdqs-blazegraph.service'
* 17:53 cdanis: sudo cumin -b10 'wdqs*' "run-puppet-agent --unless-version 1a4ae81"
* 17:52 cdanis@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=wdqs.*,name=codfw
* 17:35 cdanis@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=wdqs.*,name=codfw
* 17:22 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-reload
* 16:57 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 16:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-reload
* 15:36 urbanecm@deploy1001: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 01m 05s)
* 13:49 akosiaris@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=.*
* 12:29 marostegui: Decrease labsdb1009 weight a bit, as it is lagging again.
* 12:23 XioNoX: remove bogus lo0 IPs from cr3-knams
* 12:21 Urbanecm: Staging at mwdebug1001 ended, run scap pull to clean changes
* 12:17 Urbanecm: Staging at mwdebug1001 again
* 12:02 Urbanecm: Staging at mwdebug1001 ended, run scap pull to clean changes
* 12:00 Urbanecm: Staging at mwdebug1001
* 11:49 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|745ff20f53e4914cf6e1717c963419e74b68e693}}: Log ClosedWikiProviders start with info level ([[phab:T258695|T258695]]) (duration: 01m 05s)
* 11:48 marostegui: Deploy MCR schema change on db1145:3314
* 11:36 dcausse: European mid-day backport window done
* 11:31 dcausse@deploy1001: Synchronized php-1.36.0-wmf.1/extensions/Wikibase: [[phab:T258507|T258507]]: Fix bug that causes wrong prefixes in RDF output (duration: 01m 11s)
* 11:18 akosiaris: depool scb in mobileapps/eqiad. [[phab:T218733|T218733]]
* 11:17 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=eqiad,service=mobileapps,name=scb.*
* 11:13 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T258474|T258474]]: [sdoc] fix entity source base URIs (duration: 01m 07s)
* 10:27 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=eqiad,service=mobileapps,name=scb.*
* 10:27 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=eqiad,service=mobileapps,name=scb*
* 10:25 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=eqiad,service=mobileapps,name=scb1002.*
* 10:24 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=eqiad,service=mobileapps,name=scb1001.*
* 10:18 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:14 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:11 akosiaris: pool kubernetes in mobileapps/eqiad. [[phab:T218733|T218733]]
* 10:11 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,service=mobileapps,name=kubernetes.*
* 10:06 volans@deploy1001: Finished deploy [debmonitor/deploy@16d0c45]: Release v0.2.6 (duration: 00m 36s)
* 10:06 volans@deploy1001: Started deploy [debmonitor/deploy@16d0c45]: Release v0.2.6
* 10:05 volans@deploy1001: Finished deploy [debmonitor/deploy@44aa1ee]: Release v0.2.6 (duration: 00m 14s)
* 10:05 volans@deploy1001: Started deploy [debmonitor/deploy@44aa1ee]: Release v0.2.6
* 10:04 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 09:51 akosiaris: prepare for pooling kubernetes mobileapps capacity in eqiad. [[phab:T218733|T218733]]
* 09:51 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,service=mobileapps,name=kubernetes.*
* 09:46 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 09:40 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 09:38 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 09:27 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:27 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 09:25 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 09:24 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 09:20 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 09:19 akosiaris: lower replica count back to 80 for mobileapps. [[phab:T218733|T218733]]
* 09:19 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:19 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 09:02 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 08:59 marostegui: transfer --type=xtrabackup from db1117:3322 to db1107 [[phab:T257540|T257540]]
* 08:45 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' .
* 08:42 godog: test librenms poller from netmon2001
* 08:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:40 XioNoX: remove pim-rp IPs from last routers - [[phab:T257573|T257573]]
* 08:40 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' .
* 08:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:29 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1107 from s1 [[phab:T257540|T257540]]', diff saved to https://phabricator.wikimedia.org/P12025 and previous config saved to /var/cache/conftool/dbconfig/20200723-082647-marostegui.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1107 to move it to m2 [[phab:T257540|T257540]]', diff saved to https://phabricator.wikimedia.org/P12024 and previous config saved to /var/cache/conftool/dbconfig/20200723-081650-marostegui.json
* 05:29 marostegui: Restore labsdb1009's original weight
* 00:24 legoktm@deploy1001: Synchronized php-1.35.0-wmf.41/includes/: [[phab:T258664|T258664]]: Revert "Add a new type of database to the installer from extension" (2/2) (duration: 01m 08s)
* 00:22 legoktm@deploy1001: Synchronized php-1.35.0-wmf.41/includes/libs/rdbms/database/Database.php: [[phab:T258664|T258664]]: Revert "Add a new type of database to the installer from extension" (duration: 01m 05s)
* 00:20 legoktm@deploy1001: Scap failed!: 9/9 canaries failed their endpoint checks(https://en.wikipedia.org)
* 00:16 legoktm@deploy1001: Synchronized php-1.36.0-wmf.1/includes/: [[phab:T258664|T258664]]: Revert "Add a new type of database to the installer from extension" (duration: 01m 09s)
* 00:11 legoktm@deploy1001: scap failed: average error rate on 3/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
 
== 2020-07-22 ==
* 22:07 cdanis: remove downtime on api.svc.codfw.wmnet [[phab:T258614|T258614]]
* 19:26 jhuneidi@deploy1001: Synchronized php: group1 wikis to 1.36.0-wmf.1 (duration: 01m 03s)
* 19:25 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.1
* 19:15 urbanecm@deploy1001: Finished scap: {{Gerrit|9529cf8d2570bbf6dd1e919c966f5954e39dbd67}}: {{Gerrit|b66ec9143bd96cbf3a20b70f6aa3f2d6d7963bb5}}: OOUI backport; {{Gerrit|93755a6a92923ae390e3a04b19421c8562568d2a}}: i18n changes for OAuth, removal of spam messages (duration: 42m 26s)
* 19:14 ejegg: updated payments-wiki from {{Gerrit|bf91f8adff}} to {{Gerrit|31a3de1130}}
* 19:11 mutante: mw2335 - mw2339 - scap pull
* 18:39 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw233[5-9].codfw.wmnet
* 18:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw233[6-9].codfw.wmnet
* 18:36 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw233[6-9].codfw.wmnet
* 18:33 urbanecm@deploy1001: Started scap: {{Gerrit|9529cf8d2570bbf6dd1e919c966f5954e39dbd67}}: {{Gerrit|b66ec9143bd96cbf3a20b70f6aa3f2d6d7963bb5}}: OOUI backport; {{Gerrit|93755a6a92923ae390e3a04b19421c8562568d2a}}: i18n changes for OAuth, removal of spam messages
* 18:33 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2335.codfw.wmnet
* 18:28 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw233[5-9].codfw.wmnet
* 18:16 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2339.codfw.wmnet
* 17:58 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2338.codfw.wmnet
* 17:58 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2337.codfw.wmnet
* 17:58 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2336.codfw.wmnet
* 17:26 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2335.codfw.wmnet
* 15:31 moritzm: updated stretch installer image to Stretch 9.13 release [[phab:T258407|T258407]]
* 15:27 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 15:27 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 14:52 XioNoX: add accept-data and remove bogus v6 IP from ulsfo sandbox vlan
* 14:43 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=codfw,service=mobileapps,name=scb.*
* 14:43 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 14:43 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 14:35 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 14:35 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 14:12 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:12 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:06 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:04 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 13:54 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 13:54 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 13:50 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:49 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:36 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:36 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:34 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 13:33 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 13:20 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 13:19 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:18 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 13:18 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:16 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:16 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 12:36 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=codfw,service=mobileapps,name=scb.*
* 12:32 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 12:28 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=codfw,service=mobileapps,name=scb.*
* 12:20 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 12:18 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 12:17 akosiaris@cumin1001: conftool action : set/weight=0; selector: dc=codfw,service=mobileapps,name=scb.*
* 12:05 ema: A:cp-text varnish ban ptwikiversity [[phab:T256750|T256750]]
* 12:01 ema: A:cp-text varnish ban frwiktionary [[phab:T256750|T256750]]
* 11:56 ema: A:cp-text varnish ban euwiki [[phab:T256750|T256750]]
* 11:54 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=mobileapps,name=scb.*
* 11:54 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 11:54 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 11:52 Urbanecm: EU B&C window done
* 11:52 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=codfw,service=mobileapps,name=scb.*
* 11:49 ema: A:cp-text force puppet run to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/615446 [[phab:T256750|T256750]]
* 11:48 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 15s)
* 11:42 jdrewniak@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:614889{{!}}Enable desktop improvements by default for testing group (round 1) (T254227)]] (duration: 01m 05s)
* 11:30 jdrewniak@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:614888{{!}}Enable instrumentation for wikis in the desktop improvements testing group (T254228)]] (duration: 01m 04s)
* 11:30 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 11:30 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 11:28 jdrewniak@deploy1001: Synchronized wmf-config/config: Config: [[gerrit:614888{{!}}Enable instrumentation for wikis in the desktop improvements testing group (T254228)]] (duration: 01m 05s)
* 11:20 jdrewniak@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: Config: [[gerrit:614888{{!}}Enable instrumentation for wikis in the desktop improvements testing group (T254228)]] (duration: 01m 05s)
* 11:18 jdrewniak@deploy1001: Synchronized dblists/desktop-improvements.dblist: Config: [[gerrit:614888{{!}}Enable instrumentation for wikis in the desktop improvements testing group (T254228)]] (duration: 01m 18s)
* 11:13 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 11:13 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 10:39 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 10:24 jbond42: upload prometheus-swagger-exporter_0.3-1+deb10u1 to apt1001 buster repo
* 10:24 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 10:22 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 10:19 akosiaris@deploy2001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 10:19 akosiaris@deploy2001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 10:12 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=mobileapps,name=scb.*
* 10:08 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=codfw,service=mobileapps,name=scb.*
* 10:04 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 10:01 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 09:58 marostegui: Deploy MCR schema change on s4 codfw master (lag will appear on codfw) - [[phab:T238966|T238966]]
* 09:55 akosiaris: bump memory in codfw mobileapps another 20% [[phab:T218733|T218733]]
* 09:55 akosiaris@deploy2001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:55 akosiaris@deploy2001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 09:52 godog: centrallog1001 lvextend /srv by 130G
* 09:51 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 09:46 akosiaris: codfw mobileapps kubernetes traffic back to 96% [[phab:T218733|T218733]] again. scb pooled again.
* 09:46 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=mobileapps,name=scb.*
* 09:43 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 09:43 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 09:43 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:40 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 09:40 akosiaris: increase codfw mobileapps kubernetes traffic to 100% [[phab:T218733|T218733]]
* 09:40 akosiaris@cumin1001: conftool action : set/pooled=no; selector: dc=codfw,service=mobileapps,name=scb.*
* 09:34 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 09:27 akosiaris@deploy2001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:27 akosiaris@deploy2001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 09:25 akosiaris: bump memory limits for mobileapps by 25% [[phab:T218733|T218733]]
* 09:25 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 09:10 jayme: updated docker-report to 0.0.7-1 on deneb
* 09:09 jayme: import docker-report 0.0.7-1 to buster-wikimedia
* 09:06 gehel: restarting blazegraph on all wdqs nodes - new vocabulary
* 08:48 dcausse: restarting blazegraph on wdqs1010 (testing new vocab)
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1126', diff saved to https://phabricator.wikimedia.org/P12017 and previous config saved to /var/cache/conftool/dbconfig/20200722-084613-marostegui.json
* 08:42 kormat@cumin1001: dbctl commit (dc=all): 'Increase es1020 to 100% pooled in es4, reduce es1021 to weight 0 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P12016 and previous config saved to /var/cache/conftool/dbconfig/20200722-084159-kormat.json
* 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P12015 and previous config saved to /var/cache/conftool/dbconfig/20200722-083926-marostegui.json
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12014 and previous config saved to /var/cache/conftool/dbconfig/20200722-083535-marostegui.json
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P12013 and previous config saved to /var/cache/conftool/dbconfig/20200722-083140-marostegui.json
* 08:30 kart_: Updated cxserver to 2020-07-20-200559-production ([[phab:T257674|T257674]])
* 08:28 kartik@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 08:25 kartik@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 08:25 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12012 and previous config saved to /var/cache/conftool/dbconfig/20200722-082309-marostegui.json
* 08:22 kartik@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P12010 and previous config saved to /var/cache/conftool/dbconfig/20200722-082023-marostegui.json
* 08:19 volans@cumin1001: START - Cookbook sre.dns.netbox
* 08:16 akosiaris: increase codfw mobileapps kubernetes traffic to 96% [[phab:T218733|T218733]]. Take #2. Let's see if I can reproduce the weird increases in p99 latencies and figure out their cause
* 08:15 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=codfw,service=mobileapps,name=scb.*
* 08:14 kormat@cumin1001: dbctl commit (dc=all): 'Increase es1020 to 75% pooled in es4, reduce es1021 to weight 25 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P12009 and previous config saved to /var/cache/conftool/dbconfig/20200722-081457-kormat.json
* 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12008 and previous config saved to /var/cache/conftool/dbconfig/20200722-081330-marostegui.json
* 08:12 moritzm: Turnilo switched to CAS
* 08:05 jayme: updated docker-report to 0.0.6-1 on deneb
* 07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12007 and previous config saved to /var/cache/conftool/dbconfig/20200722-075749-marostegui.json
* 07:53 kormat@cumin1001: dbctl commit (dc=all): 'Increase es1020 to 50% pooled in es4 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P12006 and previous config saved to /var/cache/conftool/dbconfig/20200722-075312-kormat.json
* 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1084 to s1, depooled [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P12005 and previous config saved to /var/cache/conftool/dbconfig/20200722-075040-marostegui.json
* 07:49 jayme: import docker-report 0.0.6-1 to buster-wikimedia
* 07:40 jynus: stop db1145 for hw maintenance [[phab:T258249|T258249]]
* 06:47 elukey: update analytics-in4/6 filters on cr1/cr2 eqiad (ref https://gerrit.wikimedia.org/r/c/operations/homer/public/+/614702)
* 06:26 marostegui: Stop MySQL on db1107
* 06:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1107 to clone db1084', diff saved to https://phabricator.wikimedia.org/P12003 and previous config saved to /var/cache/conftool/dbconfig/20200722-060432-marostegui.json
* 05:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126', diff saved to https://phabricator.wikimedia.org/P12002 and previous config saved to /var/cache/conftool/dbconfig/20200722-051607-marostegui.json
 
== 2020-07-21 ==
* 23:37 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump cirrus MLR models to latest (duration: 01m 06s)
* 23:13 Urbanecm: Evening backport window done
* 23:12 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|7a50168d54b5e86834606fb8d7880eb3a923ffd5}}: Updating UploadWizard template: PD-old-70-1923->PD-old-70-expired ([[phab:T258523|T258523]]) (duration: 01m 06s)
* 23:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7acc9d966a07d589bb6aed5f801c9e1defc75fe1}}: Enable $wgWatchlistExpiry on testwiki ([[phab:T257506|T257506]]) (duration: 01m 08s)
* 19:10 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.1
* 19:02 catrope@deploy1001: Synchronized php-1.36.0-wmf.1/includes/Storage/PageUpdater.php: Fix handling of null edits ([[phab:T257766|T257766]]) (duration: 01m 06s)
* 19:01 catrope@deploy1001: Synchronized php-1.35.0-wmf.41/includes/Storage/PageUpdater.php: Fix handling of null edits ([[phab:T257766|T257766]]) (duration: 01m 11s)
* 18:33 jhuneidi@deploy1001: Finished scap: testwikis wikis to 1.36.0-wmf.1 (duration: 41m 22s)
* 18:27 ejegg: restored new URL for TY page in payments-wiki settings
* 18:22 mforns@deploy1001: Finished deploy [analytics/refinery@0c25de1] (thin): Redeploying to unbreak unique devices per domain monthly THIN [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd] (duration: 00m 07s)
* 18:22 mforns@deploy1001: Started deploy [analytics/refinery@0c25de1] (thin): Redeploying to unbreak unique devices per domain monthly THIN [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd]
* 18:21 mforns@deploy1001: Finished deploy [analytics/refinery@0c25de1]: Redeploying to unbreak unique devices per domain monthly - third try [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd] (duration: 00m 12s)
* 18:21 mforns@deploy1001: Started deploy [analytics/refinery@0c25de1]: Redeploying to unbreak unique devices per domain monthly - third try [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd]
* 18:17 mforns@deploy1001: Finished deploy [analytics/refinery@0c25de1]: Redeploying to unbreak unique devices per domain monthly - second try [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd] (duration: 00m 17s)
* 18:16 mforns@deploy1001: Started deploy [analytics/refinery@0c25de1]: Redeploying to unbreak unique devices per domain monthly - second try [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd]
* 18:13 mforns@deploy1001: Finished deploy [analytics/refinery@0c25de1]: Redeploying to unbreak unique devices per domain monthly [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd] (duration: 05m 32s)
* 18:08 mforns@deploy1001: Started deploy [analytics/refinery@0c25de1]: Redeploying to unbreak unique devices per domain monthly [analytics/refinery@0c25de19a3a309276654b4463cca4f574336d8fd]
* 17:52 jhuneidi@deploy1001: Started scap: testwikis wikis to 1.36.0-wmf.1
* 17:50 volans@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 17:45 volans@cumin1001: START - Cookbook sre.dns.netbox
* 17:10 jhuneidi@deploy1001: Pruned MediaWiki: 1.35.0-wmf.39 (duration: 16m 25s)
* 16:32 ppchelko@deploy1001: Finished deploy [restbase/deploy@4f3cb41]: Add new wikis to RESTBase, take 2 (duration: 04m 54s)
* 16:27 ppchelko@deploy1001: Started deploy [restbase/deploy@4f3cb41]: Add new wikis to RESTBase, take 2
* 16:27 ppchelko@deploy1001: Finished deploy [restbase/deploy@4f3cb41]: Add new wikis to RESTBase (duration: 10m 37s)
* 16:21 longma: 1.36.0-wmf.1 was branched at {{Gerrit|3a1faac3764ecae8dde813bd67a5a8e8f4975a85}} for [[phab:T257969|T257969]]
* 16:16 ppchelko@deploy1001: Started deploy [restbase/deploy@4f3cb41]: Add new wikis to RESTBase
* 15:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:10 moritzm: draining restbase1027 for eventual reboot for kernel security update
* 15:09 godog: poweroff ms-be1024 for bbu replacement - [[phab:T257949|T257949]]
* 15:08 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:08 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 15:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:01 vgutierrez: show a synthetic warning for traffic using ECDHE-RSA-AES128-SHA - [[phab:T258405|T258405]]
* 15:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:00 moritzm: draining restbase1026 for eventual reboot for kernel security update
* 14:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:57 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:56 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:52 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:51 moritzm: draining restbase1025 for eventual reboot for kernel security update
* 14:48 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:44 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:35 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=codfw,service=mobileapps,name=scb.*
* 14:35 akosiaris: decrease codfw mobileapps kubernetes traffic to 72% [[phab:T218733|T218733]]. Weird latency patterns exhibited when 92% was reached. See https://grafana.wikimedia.org/d/5CmeRcnMz/mobileapps?panelId=34&fullscreen&orgId=1&from=1595338489749&to=1595342071227&var-dc=codfw%20prometheus%2Fk8s&var-service=mobileapps&var-container_name=All
* 14:35 moritzm: draining restbase1024 for eventual reboot for kernel security update
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1119', diff saved to https://phabricator.wikimedia.org/P11994 and previous config saved to /var/cache/conftool/dbconfig/20200721-143204-marostegui.json
* 14:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1119', diff saved to https://phabricator.wikimedia.org/P11993 and previous config saved to /var/cache/conftool/dbconfig/20200721-142634-marostegui.json
* 14:24 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:24 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:23 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:19 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1119', diff saved to https://phabricator.wikimedia.org/P11992 and previous config saved to /var/cache/conftool/dbconfig/20200721-141813-marostegui.json
* 14:16 moritzm: draining restbase1023 for eventual reboot for kernel security update
* 14:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:06 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:06 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:03 moritzm: draining restbase1022 for eventual reboot for kernel security update
* 14:01 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:57 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:55 moritzm: draining restbase1021 for eventual reboot for kernel security update
* 13:51 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1119', diff saved to https://phabricator.wikimedia.org/P11991 and previous config saved to /var/cache/conftool/dbconfig/20200721-135028-marostegui.json
* 13:48 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:46 moritzm: draining restbase1020 for eventual reboot for kernel security update
* 13:42 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=codfw,service=mobileapps,name=scb.*
* 13:41 akosiaris: increase codfw mobileapps kubernetes traffic to 96% [[phab:T218733|T218733]]
* 13:41 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:41 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 13:15 Amir1: end of ladsgroup@mwmaint1002:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T258472|T258472]] [[phab:T258473|T258473]])
* 13:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 13:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:06 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:03 moritzm: draining restbase1019 for eventual reboot for kernel security update
* 13:01 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:01 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 12:55 Amir1: start of ladsgroup@mwmaint1002:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T258472|T258472]] [[phab:T258473|T258473]])
* 12:54 marostegui: Stop haproxy on dbproxy1012 - [[phab:T255408|T255408]]
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1087', diff saved to https://phabricator.wikimedia.org/P11988 and previous config saved to /var/cache/conftool/dbconfig/20200721-121302-marostegui.json
* 12:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:53 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:49 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:45 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:41 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:25 Urbanecm: EU B&C window done
* 11:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7b96c7ea35557888c6cec2dd19768c246bff804b}}: Enable botpasswords at checkuserwiki and stewardwiki ([[phab:T258358|T258358]], [[phab:T258355|T258355]]) (duration: 00m 57s)
* 11:11 Urbanecm: Create bot_passwords table at checkuserwiki ([[phab:T258358|T258358]])
* 11:10 Urbanecm: Create bot_passwords table at stewardwiki ([[phab:T258355|T258355]])
* 11:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|5d5bb37c342310be5ca0b0e11a8490703867f4fd}}: Enable Vector opt in preference everywhere ([[phab:T254228|T254228]]) (duration: 00m 57s)
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1085 [[phab:T258360|T258360]]', diff saved to https://phabricator.wikimedia.org/P11987 and previous config saved to /var/cache/conftool/dbconfig/20200721-110854-marostegui.json
* 11:00 effie: enable puppet on P:mediawiki::mcrouter_wancache - [[phab:T247956|T247956]]
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085 [[phab:T258360|T258360]]', diff saved to https://phabricator.wikimedia.org/P11986 and previous config saved to /var/cache/conftool/dbconfig/20200721-105852-marostegui.json
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085 [[phab:T258360|T258360]]', diff saved to https://phabricator.wikimedia.org/P11985 and previous config saved to /var/cache/conftool/dbconfig/20200721-104546-marostegui.json
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P11984 and previous config saved to /var/cache/conftool/dbconfig/20200721-103430-marostegui.json
* 10:20 effie: disable puppet on P:mediawiki::mcrouter_wancache - [[phab:T247956|T247956]]
* 10:13 effie: enable puppet on wtp*
* 10:02 marostegui: Analyze revision table on db1119 [[phab:T258480|T258480]]
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 [[phab:T258480|T258480]]', diff saved to https://phabricator.wikimedia.org/P11983 and previous config saved to /var/cache/conftool/dbconfig/20200721-100159-marostegui.json
* 09:59 akosiaris: move all codfw mobileapps nodes (kubernetes and scb) to weight 10. Traffic level remains at 72.727272% flowing to kubernetes, the rest to scb [[phab:T218733|T218733]]
* 09:59 effie: disable puppet on wtp* to merge 613307
* 09:58 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=codfw,service=mobileapps
* 09:58 akosiaris: increase codfw mobileapps kubernetes traffic to 72.727272% [[phab:T218733|T218733]]
* 09:57 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=codfw,service=mobileapps,name=scb.*
* 09:44 elukey: add term 'idp' to analytics-in4/6 filters on cr1-eqiad and cr2-eqiad (ref: https://gerrit.wikimedia.org/r/c/operations/homer/public/+/615160)
* 09:21 kormat@cumin1001: dbctl commit (dc=all): 'Re-pool es1020 at 25% in es4 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11982 and previous config saved to /var/cache/conftool/dbconfig/20200721-092126-kormat.json
* 08:37 akosiaris: increase codfw mobileapps kubernetes traffic to 47% [[phab:T218733|T218733]]
* 08:34 akosiaris@cumin1001: conftool action : set/weight=3; selector: dc=codfw,service=mobileapps,name=scb.*
* 08:28 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:26 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1119', diff saved to https://phabricator.wikimedia.org/P11980 and previous config saved to /var/cache/conftool/dbconfig/20200721-080842-marostegui.json
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1119', diff saved to https://phabricator.wikimedia.org/P11979 and previous config saved to /var/cache/conftool/dbconfig/20200721-075233-marostegui.json
* 07:49 marostegui: Deploy schema change on db1087, lag will appear on s8 (wikidata) on labsdb hosts [[phab:T256685|T256685]]
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1087 [[phab:T256685|T256685]]', diff saved to https://phabricator.wikimedia.org/P11978 and previous config saved to /var/cache/conftool/dbconfig/20200721-074843-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1119', diff saved to https://phabricator.wikimedia.org/P11977 and previous config saved to /var/cache/conftool/dbconfig/20200721-073757-marostegui.json
* 07:29 kormat@deploy1001: Synchronized wmf-config/db-eqiad.php: Re-enable writes to es4 [[phab:T257847|T257847]] (duration: 00m 57s)
* 07:22 kormat@cumin1001: dbctl commit (dc=all): 'Depool es1020 from es4 [[phab:T257847|T257847]]', diff saved to https://phabricator.wikimedia.org/P11976 and previous config saved to /var/cache/conftool/dbconfig/20200721-072251-kormat.json
* 07:21 kormat@cumin1001: dbctl commit (dc=all): 'Promote es1021 to es4 master [[phab:T257847|T257847]]', diff saved to https://phabricator.wikimedia.org/P11975 and previous config saved to /var/cache/conftool/dbconfig/20200721-072127-kormat.json
* 07:13 kormat: killing James_F('s script) on mwmaint1002
* 07:06 _joe_: systemctl reset-failed on deneb, the usual known issue with releng image reporting
* 07:03 kormat@deploy1001: Synchronized wmf-config/db-eqiad.php: Disable writes to es4 [[phab:T257847|T257847]] (duration: 01m 00s)
* 06:59 kormat: Starting es4 failover from es1020 to es1021 [[phab:T257847|T257847]]
* 06:54 kormat@cumin1001: dbctl commit (dc=all): 'Set es1021 to weight 50 [[phab:T257847|T257847]]', diff saved to https://phabricator.wikimedia.org/P11974 and previous config saved to /var/cache/conftool/dbconfig/20200721-065457-kormat.json
* 06:54 marostegui: Pool db1119 into enwiki with MCR schema change done - [[phab:T238966|T238966]]
* 06:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1119', diff saved to https://phabricator.wikimedia.org/P11973 and previous config saved to /var/cache/conftool/dbconfig/20200721-065430-marostegui.json
* 06:27 _joe_: systemctl reset-failed on lists1001, a network interface had been failing for 1 month
* 06:26 _joe_: enabling notifications for lists1001
* 06:23 _joe_: systemctl reset-failed on both centrallogs
* 02:43 eileen: civicrm revision changed from {{Gerrit|7f1e7d8e38}} to {{Gerrit|cc5d17fbaf}}, config revision is {{Gerrit|23460676f6}}
* 00:02 ryankemper: Began Elasticsearch reindex job on index `dewiki_content` across [`eqiad`, `codfw`, `cloudelastic`], on `rkemper@mwmaint1002` under tmux session `reindex`. Should complete in <24 hours
 
== 2020-07-20 ==
* 23:49 eileen: tools revision changed from {{Gerrit|b915d8efbd}} to {{Gerrit|22550f38c5}}
* 23:34 ejegg: updated fundraising CiviCRM from {{Gerrit|8b09c87ce2}} to {{Gerrit|7f1e7d8e38}}
* 23:12 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/ProofreadPage/ProofreadPage.namespaces.php: {{Gerrit|03ed74f0b9b8f55d01f9112c31f2f6ea17990f9c}}: Add ProofreadPage namespace translation for lij ([[phab:T257672|T257672]]) (duration: 00m 57s)
* 23:06 Urbanecm: run mwscript namespaceDupes.php --wiki=lijwikisource -- fix ([[phab:T257672|T257672]])
* 23:05 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2147774caaa0819f8b5d71cc16bc021d94677702}}: Add English aliases for WS-specific namespaces to lijwikisource ([[phab:T257672|T257672]]) (duration: 00m 57s)
* 22:59 ryankemper@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 613669: cirrussearch: Allow 2 dewiki->content shards/node {{!}} https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/613669 (duration: 00m 57s)
* 21:53 eileen: tools revision changed from {{Gerrit|40d52a0008}} to {{Gerrit|b915d8efbd}}
* 21:15 sbassett: Revised mitigation deployed for [[phab:T257687|T257687]]
* 20:07 eileen: tools revision changed from {{Gerrit|711d671600}} to {{Gerrit|40d52a0008}}
* 19:10 mforns@deploy1001: Finished deploy [analytics/refinery@af86a05] (thin): Regular analytics weekly train THIN [analytics/refinery@af86a05be470ed8283f6585afb5cc231b26944a2] (duration: 00m 07s)
* 19:10 mforns@deploy1001: Started deploy [analytics/refinery@af86a05] (thin): Regular analytics weekly train THIN [analytics/refinery@af86a05be470ed8283f6585afb5cc231b26944a2]
* 19:09 mforns@deploy1001: Finished deploy [analytics/refinery@af86a05]: Regular analytics weekly train [analytics/refinery@af86a05be470ed8283f6585afb5cc231b26944a2] (duration: 05m 46s)
* 19:03 mforns@deploy1001: Started deploy [analytics/refinery@af86a05]: Regular analytics weekly train [analytics/refinery@af86a05be470ed8283f6585afb5cc231b26944a2]
* 18:37 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|df2584f181f08da0e1191f97e619e912e587b48d}}: Switch $wgUrlShortenerDomainsWhitelist --> $wgUrlShortenerAllowedDomains ([[phab:T255491|T255491]]) (duration: 00m 57s)
* 18:26 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dfed4727c6f9e003f9e1949b2995a0cf0ad4f1cc}}: Adding rollbacker group for arzwiki ([[phab:T258100|T258100]]) (duration: 00m 57s)
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ee7ac95e16f55e850b318f7354842795e08e0270}}: Change of rollbacker group settings at jawiki ([[phab:T258339|T258339]]) (duration: 00m 57s)
* 17:36 ejegg: updated payments-wiki settings to point TY page at new URL
* 16:32 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@10afb4b]: airflow: Turn off catchup on cirrus_namespace_map (duration: 00m 25s)
* 16:31 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@10afb4b]: airflow: Turn off catchup on cirrus_namespace_map
* 16:27 akosiaris: increase codfw mobileapps kubernetes traffic to 25% [[phab:T218733|T218733]]. Take #2
* 16:27 akosiaris@cumin1001: conftool action : set/weight=8; selector: dc=codfw,service=mobileapps,name=scb.*
* 15:59 elukey: restart airflow-webserver/scheduler to pick up TLS to mysql settings
* 15:21 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:21 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 15:17 hnowlan: draining and restarting sessionstore2002
* 15:17 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:17 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 15:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:13 jynus: dropping and recreating nagios@localhost users on all m1 servers
* 15:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:09 hnowlan: draining and restarting sessionstore2001
* 15:09 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:09 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 15:09 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:09 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 15:08 moritzm: draining restbase2023 for eventual reboot for kernel security update
* 15:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:00 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:56 moritzm: draining restbase2022 for eventual reboot for kernel security update
* 14:56 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:56 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:52 hnowlan: draining and restarting sessionstore1003
* 14:52 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:52 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 14:51 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 14:51 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 14:49 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 14:49 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 14:49 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:47 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 14:47 moritzm: draining restbase2021 for eventual reboot for kernel security update
* 14:44 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:43 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 14:37 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:36 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@ff49fdf]: Update mobileapps to {{Gerrit|0bf7bafa}} (duration: 03m 50s)
* 14:34 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:34 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime
* 14:34 hnowlan: starting drain and restart of sessionstore hosts for new kernel
* 14:33 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:32 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@ff49fdf]: Update mobileapps to {{Gerrit|0bf7bafa}}
* 14:26 moritzm: draining restbase2020 for eventual reboot for kernel security update
* 14:23 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:23 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:20 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:17 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:14 moritzm: draining restbase2019 for eventual reboot for kernel security update
* 14:08 ema: lvs101[34] (primaries) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 14:07 ema: lvs1016 (secondary) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 14:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:02 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:59 ema: lvs300[56] (primaries) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:57 ema: lvs3007 (secondary) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:50 ema: lvs500[12] (primaries) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:48 moritzm: draining restbase2018 for eventual reboot for kernel security update
* 13:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:47 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 13:47 ema: lvs5003 (secondary) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:44 ema: lvs200[78] (primaries) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:42 ema: lvs2010 (secondary) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:31 ema: lvs400[56] (primaries) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:31 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:27 moritzm: draining restbase2017 for eventual reboot for kernel security update
* 13:24 ema: lvs4007 (secondary) - restart pybal to apply varnish healthcheck changes https://gerrit.wikimedia.org/r/c/operations/puppet/+/610047 [[phab:T255015|T255015]]
* 13:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:16 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:09 moritzm: draining restbase2016 for eventual reboot for kernel security update
* 13:08 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:08 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 13:07 moritzm: reset broken ifup systemd states on puppetdb* hosts
* 13:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:59 Urbanecm: creating arywiki ([[phab:T257674|T257674]]), lijwikisource ([[phab:T257672|T257672]]), sysop_itwiki ([[phab:T256545|T256545]]) done
* 12:59 moritzm: draining restbase2015 for eventual reboot for kernel security update
* 12:56 Urbanecm: Create Daimona Eaytoy at sysop_itwiki ([[phab:T256545|T256545]])
* 12:55 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 01m 59s)
* 12:50 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating sysop_itwiki ([[phab:T256545|T256545]]) (duration: 00m 57s)
* 12:49 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating sysop_itwiki ([[phab:T256545|T256545]]) (duration: 00m 57s)
* 12:48 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating sysop_itwiki ([[phab:T256545|T256545]])
* 12:46 urbanecm@deploy1001: Synchronized dblists: Creating sysop_itwiki ([[phab:T256545|T256545]]) (duration: 00m 57s)
* 12:46 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:43 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:40 moritzm: draining restbase2014 for eventual reboot for kernel security update
* 12:38 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:38 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 12:35 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:34 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating lijwikisource ([[phab:T257672|T257672]]) (duration: 00m 57s)
* 12:32 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating lijwikisource ([[phab:T257672|T257672]])
* 12:31 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:30 urbanecm@deploy1001: Synchronized dblists: Creating lijwikisource ([[phab:T257672|T257672]]) (duration: 00m 56s)
* 12:28 urbanecm@deploy1001: Synchronized dblists/rtl.dblist: Add arywiki to rtl.dblist ([[phab:T257674|T257674]]) (duration: 00m 57s)
* 12:27 moritzm: draining restbase2013 for eventual reboot for kernel security update
* 12:27 urbanecm@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 00s)
* 12:21 urbanecm@deploy1001: Synchronized langlist: Creating arywiki ([[phab:T257674|T257674]]) (duration: 00m 56s)
* 12:20 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating arywiki ([[phab:T257674|T257674]]) (duration: 00m 56s)
* 12:19 urbanecm@deploy1001: Synchronized static/images/project-logos/: Creating arywiki ([[phab:T257674|T257674]]) (duration: 00m 57s)
* 12:17 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: Creating arywiki ([[phab:T257674|T257674]])
* 12:16 urbanecm@deploy1001: Synchronized dblists: Creating arywiki ([[phab:T257674|T257674]]) (duration: 00m 57s)
* 12:02 moritzm: installing qemu security updates on buster
* 11:50 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|946bf3d239f278b4e099f5dec676f5e2be61d8ca}}: Update brwikimedia logo and add upscaled versions (config) ([[phab:T257925|T257925]]) (duration: 00m 57s)
* 11:49 urbanecm@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 00s)
* 11:49 Urbanecm: Purge 'https://en.wikipedia.org/static/images/project-logos/bnwikimedia.png'
* 11:46 urbanecm@deploy1001: Synchronized static/images/project-logos/: {{Gerrit|f7560b6061dd3a60ccf56c916ebf70a3f104bea7}}: Update brwikimedia logo and add upscaled versions ([[phab:T257925|T257925]]) (duration: 00m 56s)
* 11:44 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|5b97a06fa2e9a06c251a9c1fd2ddd9beec01a683}}: Set $wgUrlShortenerAllowedDomains for all wikis ([[phab:T258134|T258134]]) (duration: 00m 57s)
* 11:42 urbanecm@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 00s)
* 11:36 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c12f1dee6b9888849c64312c2a4fd65ecbd4091e}}: Remove wgPopupsPageBlacklist config setting ([[phab:T254676|T254676]]) (duration: 00m 57s)
* 11:35 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript createAndPromote.php testwikidatawiki --custom-groups=interface-admin --force 'Lucas Werkmeister (WMDE)'
* 11:34 urbanecm@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 01s)
* 11:25 Urbanecm: mwscript namespaceDupes.php --wiki=kowikiquote  --fix ([[phab:T255031|T255031]])
* 11:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3719668511231589b4fc6a723ccdfa772068ad5f}}: Add NamespaceAliases for kowikiquote ([[phab:T255031|T255031]]) (duration: 00m 57s)
* 11:22 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bc5671a90c65b66989e470fc41225986b2ec9fb5}}: Add media.farsnews.ir to the wgCopyUploadsDomains allowlist of Wikimedia Commons ([[phab:T253800|T253800]]) (duration: 00m 57s)
* 11:18 Urbanecm: Run mwscript updateCollation.php --wiki=bswiktionary --previous-collation=uppercase in a tmux session at mwmaint1002 ([[phab:T258346|T258346]])
* 11:17 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0c784784d75c2bbfb570495a6a097d4c44cbe6b3}}: Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wiktionary ([[phab:T258346|T258346]]) (duration: 00m 58s)
* 11:13 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6830723b0ad5031e67062ba838f09cd07c2b97a1}}: Convert ukwikisource ns:250 and ns:251 to have subpages ([[phab:T255930|T255930]]) (duration: 00m 57s)
* 11:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1c7a6215d06aff6cb0a75701292d8147f006d9e4}}: Create closer group at itwikinews ([[phab:T257927|T257927]]) (duration: 00m 57s)
* 10:55 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:50 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:48 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:48 moritzm: rebooting releases* hosts for kernel security update
* 10:35 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:614698{{!}} Bumping portals to master (614698)]] (duration: 00m 56s)
* 10:34 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:614698{{!}} Bumping portals to master (614698)]] (duration: 00m 59s)
* 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1114', diff saved to https://phabricator.wikimedia.org/P11962 and previous config saved to /var/cache/conftool/dbconfig/20200720-103058-marostegui.json
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P11961 and previous config saved to /var/cache/conftool/dbconfig/20200720-094609-marostegui.json
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P11960 and previous config saved to /var/cache/conftool/dbconfig/20200720-093154-marostegui.json
* 09:25 godog: update compiler facts
* 09:17 jayme: updating envoyproxy to 1.14.4-1 on all eqiad hosts
* 09:11 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1114', diff saved to https://phabricator.wikimedia.org/P11959 and previous config saved to /var/cache/conftool/dbconfig/20200720-091119-marostegui.json
* 09:04 jayme: updating envoyproxy to 1.14.4-1 on all codfw hosts
* 07:54 moritzm: installing libopenmpt security updates
* 07:51 jayme: updating envoyproxy to 1.14.4-1 on all non-mw and non-restbase hosts
* 07:29 marostegui: Move m1-master from dbproxy1012 to dbproxy1014 - [[phab:T255408|T255408]]
* 07:19 marostegui: Drop unused reviewdb database - [[phab:T255715|T255715]]
* 06:55 elukey: restart matomo1002's mariadb to pick up new TLS settings
* 06:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114', diff saved to https://phabricator.wikimedia.org/P11958 and previous config saved to /var/cache/conftool/dbconfig/20200720-065438-marostegui.json
* 06:15 tstarling@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/Score/includes/Score.php: reverting Reedy's temporary patch for hardcoding the lilypond version (duration: 00m 57s)
* 06:07 tstarling@deploy1001: Finished scap: fixing missing message from previous sync-dir (duration: 29m 57s)
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1082 after a crash [[phab:T258336|T258336]]', diff saved to https://phabricator.wikimedia.org/P11957 and previous config saved to /var/cache/conftool/dbconfig/20200720-055614-marostegui.json
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082 after a crash [[phab:T258336|T258336]]', diff saved to https://phabricator.wikimedia.org/P11956 and previous config saved to /var/cache/conftool/dbconfig/20200720-054747-marostegui.json
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082 after a crash [[phab:T258336|T258336]]', diff saved to https://phabricator.wikimedia.org/P11955 and previous config saved to /var/cache/conftool/dbconfig/20200720-053816-marostegui.json
* 05:37 tstarling@deploy1001: Started scap: fixing missing message from previous sync-dir
* 05:30 tstarling@deploy1001: scap sync-l10n completed (1.35.0-wmf.41) (duration: 02m 44s)
* 05:25 marostegui: Deploy MCR schema change on enwiki on db1119 - [[phab:T238966|T238966]]
* 05:24 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: disable lilypond with better error message (duration: 00m 57s)
* 05:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1082 after a crash [[phab:T258336|T258336]]', diff saved to https://phabricator.wikimedia.org/P11953 and previous config saved to /var/cache/conftool/dbconfig/20200720-051846-marostegui.json
* 05:18 tstarling@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/Score: better error message for disabling of Score (duration: 01m 10s)
 
== 2020-07-19 ==
* 19:16 marostegui: Upgrade and reboot db1085 [[phab:T258360|T258360]]
* 18:57 marostegui: Start mysql on db1082 [[phab:T258336|T258336]]
* 18:51 marostegui: Upgrade and reboot db1082 [[phab:T258336|T258336]]
* 18:45 cdanis@cumin1001: dbctl commit (dc=all): 'db1085 also crashed', diff saved to https://phabricator.wikimedia.org/P11952 and previous config saved to /var/cache/conftool/dbconfig/20200719-184511-cdanis.json
* 18:06 Urbanecm: Run mwscript emptyUserGroup.php --wiki=testwiki contestadmin ([[phab:T256555|T256555]])
 
== 2020-07-18 ==
* 21:41 shdubsh: restart logstash on logstash200[456]
* 21:14 shdubsh: bounce logstash on logstash1007
* 21:10 shdubsh: bounce logstash on logstash1008
* 21:06 shdubsh: bounce logstash on logstash1009
* 20:52 marostegui: Due to the db1082 crash, there will be replication lag on s5 on labsdb hosts - [[phab:T258336|T258336]]
* 20:37 cdanis@cumin1001: dbctl commit (dc=all): 'depool db1082, it crashed', diff saved to https://phabricator.wikimedia.org/P11951 and previous config saved to /var/cache/conftool/dbconfig/20200718-203704-cdanis.json
* 00:13 dpifke: Performing one-time expiration of ArcLamp files older than 40 days (normal retention is 45 days) to relieve the disk space issue until either the Ganeti issue is solved or compressed logfile support is merged.
 
== 2020-07-17 ==
* 21:16 dpifke: Removing MongoDB packages and data from webperf1002.
* 17:39 dpifke@deploy1001: Finished deploy [performance/arc-lamp@a5d2fd3]: (no justification provided) (duration: 00m 05s)
* 17:38 dpifke@deploy1001: Started deploy [performance/arc-lamp@a5d2fd3]: (no justification provided)
* 13:53 akosiaris: powercycle kubernetes2002
* 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1104', diff saved to https://phabricator.wikimedia.org/P11944 and previous config saved to /var/cache/conftool/dbconfig/20200717-122400-marostegui.json
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P11941 and previous config saved to /var/cache/conftool/dbconfig/20200717-120126-marostegui.json
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P11940 and previous config saved to /var/cache/conftool/dbconfig/20200717-115155-marostegui.json
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P11939 and previous config saved to /var/cache/conftool/dbconfig/20200717-113800-marostegui.json
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104', diff saved to https://phabricator.wikimedia.org/P11938 and previous config saved to /var/cache/conftool/dbconfig/20200717-113050-marostegui.json
* 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1104', diff saved to https://phabricator.wikimedia.org/P11937 and previous config saved to /var/cache/conftool/dbconfig/20200717-112413-marostegui.json
* 09:15 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1280.eqiad.wmnet
* 09:12 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1280.eqiad.wmnet
* 08:48 moritzm: imported prometheus-atlas-exporter 1.0+git20191204.ffafab7-2 to buster-wikimedia [[phab:T247967|T247967]]
* 08:29 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 08:05 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 07:54 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104', diff saved to https://phabricator.wikimedia.org/P11936 and previous config saved to /var/cache/conftool/dbconfig/20200717-075124-marostegui.json
* 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1111', diff saved to https://phabricator.wikimedia.org/P11935 and previous config saved to /var/cache/conftool/dbconfig/20200717-074335-marostegui.json
* 07:34 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 07:34 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 07:33 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 07:33 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 07:32 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 07:30 elukey@cumin1001: START - Cookbook sre.ganeti.makevm
* 06:30 XioNoX: rename msw1-codfw interface range
* 06:28 XioNoX: rename msw1-eqiad interface range
* 04:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1111', diff saved to https://phabricator.wikimedia.org/P11934 and previous config saved to /var/cache/conftool/dbconfig/20200717-044748-marostegui.json
* 04:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1092', diff saved to https://phabricator.wikimedia.org/P11933 and previous config saved to /var/cache/conftool/dbconfig/20200717-044658-marostegui.json
 
== 2020-07-16 ==
* 22:15 mutante: testreduce1001: manually git cloned the 'scandium' branch of integration/visualdiff into /srv/visualdiff ([[phab:T257906|T257906]])
* 21:54 crusnov@deploy1001: Finished deploy [netbox/deploy@39c5cae]: Deploying Netbox 2.8.7 part 3 (duration: 01m 49s)
* 21:52 crusnov@deploy1001: Started deploy [netbox/deploy@39c5cae]: Deploying Netbox 2.8.7 part 3
* 21:42 crusnov@deploy1001: Finished deploy [netbox/deploy@39c5cae]: Deploying Netbox 2.8.7 part 2 (duration: 01m 33s)
* 21:41 crusnov@deploy1001: Started deploy [netbox/deploy@39c5cae]: Deploying Netbox 2.8.7 part 2
* 21:40 crusnov@deploy1001: Finished deploy [netbox/deploy@39c5cae]: Deploying Netbox 2.8.7 (duration: 01m 01s)
* 21:39 crusnov@deploy1001: Started deploy [netbox/deploy@39c5cae]: Deploying Netbox 2.8.7
* 21:08 cstone: payments-wiki revision changed from {{Gerrit|91852dbc9b}} to {{Gerrit|bf91f8adff}}
* 20:32 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable client error logging on Catalan Wikipedia ([[phab:T258073|T258073]]) (duration: 00m 57s)
* 19:32 sbassett: Deployed mitigations for [[phab:T257687|T257687]]
* 19:14 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T248418|T248418]] TimedMediaHandler: Make videojs the only player on all group0 (duration: 00m 57s)
* 18:54 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:53 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:50 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:49 addshore: finished with deployment windows
* 18:46 addshore@deploy1001: Synchronized wmf-config/extension-list: [[gerrit:611393]] extension-list: Load WikibaseClient via JSON (duration: 00m 56s)
* 18:36 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:613226]] Wikibase: Always set wgWBRepoSettings idGeneratorSeparateDbConnection PT 2/2 (duration: 00m 56s)
* 18:35 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:613226]] Wikibase: Always set wgWBRepoSettings idGeneratorSeparateDbConnection PT 1/2 (duration: 00m 56s)
* 18:25 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:613165]] [[phab:T138104|T138104]] Wikibase: stop setting wmgWikibaseTmpSerializeEmptyListsAsObjects (duration: 00m 57s)
* 18:23 addshore@deploy1001: Synchronized wmf-config/config/incubatorwiki.yaml: [[gerrit:613199]] [[phab:T256957|T256957]] Move VisualEditor from beta to default on incubatorwiki PT2/2 (duration: 00m 57s)
* 18:22 addshore@deploy1001: Synchronized dblists/visualeditor-nondefault.dblist: [[gerrit:613199]] [[phab:T256957|T256957]] Move VisualEditor from beta to default on incubatorwiki PT1/2 (duration: 00m 56s)
* 18:20 addshore@deploy1001: Synchronized wmf-config/config/nlwikimedia.yaml: [[gerrit:613198]] [[phab:T256142|T256142]] Move VisualEditor from beta to default on nlwikimedia PT2/2 (duration: 00m 57s)
* 18:18 addshore@deploy1001: Synchronized dblists/visualeditor-nondefault.dblist: [[gerrit:613198]] [[phab:T256142|T256142]] Move VisualEditor from beta to default on nlwikimedia PT1/2 (duration: 00m 56s)
* 18:14 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:613164]] [[phab:T138104|T138104]] Wikibase: stop setting wgWBRepoSettings tmpSerializeEmptyListsAsObjects (duration: 00m 57s)
* 18:12 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:613192]] [[phab:T246420|T246420]] Enable limited-width layout for Modern Vector (duration: 00m 56s)
* 18:08 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:612870]] [[phab:T246977|T246977]] Disable affinity quicksurveys for the following wikis (duration: 00m 57s)
* 18:03 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 18:03 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:54 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:53 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:50 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:50 herron@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 17:49 herron@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:17 XioNoX: msw1-eqiad delete unused VC-ports
* 17:05 XioNoX: msw1-codfw - replace member-range with list of individual interfaces
* 16:45 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/Wikibase/: Backport: [[gerrit:613173{{!}}Re add OtherProjectsSidebarGenerator::buildProjectLinkSidebarFromItemId (T258184)]] (duration: 01m 02s)
* 16:11 effie: reboot rdb1009 - [[phab:T254990|T254990]]
* 16:06 effie: Reboot rdb1010 - [[phab:T254990|T254990]]
* 15:51 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/Wikibase/: Backport: [[gerrit:613170{{!}}Revert "Revert "Removes OtherProjectsSidebar hook"" (T258184)]] (duration: 01m 02s)
* 15:40 lucaswerkmeister-wmde@deploy1001: scap failed: average error rate on 7/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
* 15:15 akosiaris: lower codfw mobileapps kubernetes traffic to 10% [[phab:T218733|T218733]]. Will open a task for it
* 15:15 akosiaris@cumin1001: conftool action : set/weight=24; selector: dc=codfw,service=mobileapps,name=scb.*
* 15:07 XioNoX: repool eqsin - [[phab:T257154|T257154]]
* 15:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:02 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:00 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 14:54 XioNoX: load config on cr3-eqsin - [[phab:T257154|T257154]]
* 14:54 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/Wikibase/: Backport: [[gerrit:613167{{!}}Avoid trying to register wikibase.Site twice (T258065)]] (duration: 01m 03s)
* 14:43 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 14:31 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 14:15 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:12 moritzm: rebooting webperf hosts in eqiad for kernel update
* 14:09 XioNoX: upgrade junos on cr3-eqsin - [[phab:T257154|T257154]]
* 14:03 jayme: published image docker-registry.discovery.wmnet/envoy:1.14.4-1
* 13:47 XioNoX: remove nonstop-bridging from asw1-eqsin
* 13:36 XioNoX: power-off cr3-eqsin - [[phab:T257154|T257154]]
* 13:36 akosiaris: increase codfw mobileapps kubernetes traffic to 25% [[phab:T218733|T218733]]
* 13:35 akosiaris@cumin1001: conftool action : set/weight=8; selector: dc=codfw,service=mobileapps,name=scb.*
* 13:30 XioNoX: deactivate BGP groups IX/Transit/PyBal on cr3-eqsin - [[phab:T257154|T257154]]
* 13:27 moritzm: installing an-tool1008
* 13:23 XioNoX: depool eqsin for cr3 replacement - [[phab:T257154|T257154]]
* 13:13 volans@deploy1001: Finished deploy [homer/deploy@fcf4332]: Force deploy of the homer plugin (duration: 01m 27s)
* 13:12 volans@deploy1001: Started deploy [homer/deploy@fcf4332]: Force deploy of the homer plugin
* 13:04 kormat: restarting tendril to pick up new mariadb config [[phab:T257816|T257816]]
* 13:02 jforrester@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.41
* 13:02 akosiaris: increase codfw mobileapps kubernetes traffic to 10% [[phab:T218733|T218733]]
* 13:01 akosiaris@cumin1001: conftool action : set/weight=24; selector: dc=codfw,service=mobileapps,name=scb.*
* 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1092', diff saved to https://phabricator.wikimedia.org/P11926 and previous config saved to /var/cache/conftool/dbconfig/20200716-125643-marostegui.json
* 12:56 ayounsi@deploy1001: Finished deploy [homer/deploy@fcf4332]: CR607011 (duration: 04m 32s)
* 12:52 ayounsi@deploy1001: Started deploy [homer/deploy@fcf4332]: CR607011
* 12:42 ayounsi@deploy1001: Finished deploy [homer/deploy@fcf4332]: CR607011 (duration: 03m 42s)
* 12:38 ayounsi@deploy1001: Started deploy [homer/deploy@fcf4332]: CR607011
* 12:38 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:36 akosiaris@cumin1001: conftool action : set/weight=50; selector: dc=codfw,service=mobileapps,name=scb.*
* 12:35 akosiaris: increase codfw mobileapps kubernetes traffic to 5% [[phab:T218733|T218733]]
* 12:35 akosiaris: increase codfw mobileapps kubernetes traffic to 5%
* 12:34 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:22 jmm@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 12:12 jmm@cumin1001: START - Cookbook sre.ganeti.makevm
* 12:12 jmm@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 12:12 jmm@cumin1001: START - Cookbook sre.ganeti.makevm
* 12:12 jmm@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 12:12 jmm@cumin1001: START - Cookbook sre.ganeti.makevm
* 12:08 jayme: updated envoyproxy to 1.14.4-1 on mw-canary and restbase-canary
* 11:44 XioNoX: remove BGP to AS396253 in eqdfw (peer left the IX)
* 11:26 jforrester@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/UrlShortener/includes/UrlShortenerUtils.php: [[phab:T258134|T258134]] Fix config variables regex concatenation (duration: 01m 05s)
* 11:23 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: [[phab:T254315|T254315]] [[gerrit:612670]] Wikibase: remove wmgWikibaseLocalEntitySourceName (duration: 01m 05s)
* 11:18 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254315|T254315]] [[phab:T257266|T257266]] [[gerrit:609988]] Wikidata client wikis: Define entity sources configuration (take 3) (duration: 01m 08s)
* 10:17 jbond42: upgrade to hiera5
* 10:08 jbond42: disable puppet for hiera5 deployment
* 09:37 jayme: updated envoyproxy to 1.14.4-1 on mw1325.eqiad.wmnet and restbase1026.eqiad.wmnet
* 09:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:30 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:21 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:17 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:15 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:15 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 09:15 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:15 moritzm: rebooting flowspec1001
* 08:52 jayme: updated envoyproxy to 1.14.4-1 on mwdebug1001.eqiad.wmnet
* 08:41 moritzm: installing sqlite3 security updates
* 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2081', diff saved to https://phabricator.wikimedia.org/P11924 and previous config saved to /var/cache/conftool/dbconfig/20200716-083954-marostegui.json
* 08:35 XioNoX: Remove PIM/IGMP related CR stanza (acls) - [[phab:T257573|T257573]]
* 08:33 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:33 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:33 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:33 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:32 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:26 moritzm: installing dbus security updates
* 08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:24 XioNoX: remove igmp-snooping from access switches - [[phab:T257573|T257573]]
* 08:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:15 moritzm: installing python-urllib3 security updates
* 08:15 XioNoX: remove PIM config from eqord/eqdfw/knams routers - [[phab:T257573|T257573]]
* 08:14 XioNoX: remove PIM config from eqiad routers - [[phab:T257573|T257573]]
* 08:11 XioNoX: remove PIM config from esams routers - [[phab:T257573|T257573]]
* 08:09 XioNoX: remove PIM config from eqsin routers - [[phab:T257573|T257573]]
* 08:08 jbond42: update mail delivery for phabricator to use phabricator.discovery.wmnet cname
* 08:07 XioNoX: remove PIM config from codfw routers - [[phab:T257573|T257573]]
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2081', diff saved to https://phabricator.wikimedia.org/P11923 and previous config saved to /var/cache/conftool/dbconfig/20200716-080613-marostegui.json
* 08:03 XioNoX: remove PIM config from ulsfo routers - [[phab:T257573|T257573]]
* 07:41 jayme: imported envoyproxy_1.14.4-1 to stretch-wikimedia
* 07:31 jayme: imported envoyproxy_1.14.4-1 to buster-wikimedia
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1131', diff saved to https://phabricator.wikimedia.org/P11922 and previous config saved to /var/cache/conftool/dbconfig/20200716-072838-marostegui.json
* 07:25 marostegui: Drop database reviewdb-test [[phab:T255715|T255715]]
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1131', diff saved to https://phabricator.wikimedia.org/P11921 and previous config saved to /var/cache/conftool/dbconfig/20200716-070331-marostegui.json
* 06:40 XioNoX: remove peering with AS8403 in eqsin (peer left the IX)
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1131', diff saved to https://phabricator.wikimedia.org/P11920 and previous config saved to /var/cache/conftool/dbconfig/20200716-051342-marostegui.json
* 05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1131', diff saved to https://phabricator.wikimedia.org/P11919 and previous config saved to /var/cache/conftool/dbconfig/20200716-051109-marostegui.json
 
== 2020-07-15 ==
* 23:54 eileen: tools revision changed from {{Gerrit|7b6018a16e}} to {{Gerrit|711d671600}}
* 23:50 eileen: process-control config revision is {{Gerrit|1fc4a9686d}}
* 23:21 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 23:04 bd808: tools.admin Removed valhallasw from maintainers ([[phab:T255697|T255697]])
* 23:02 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 22:58 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 22:52 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 22:52 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 22:52 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 22:30 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 22:30 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 22:29 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 22:29 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 22:27 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 22:21 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 22:21 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 22:10 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 18:16 brennen: restarting jenkins for upgrade
* 18:00 mutante: DNS - new language 'avk' has been added - This language is called Kotava and is "a proposed international auxiliary language (IAL) that focuses especially on the principle of cultural neutrality". Learn more at https://en.wikipedia.org/wiki/Kotava
* 17:32 mutante: puppetmaster - revoking cert for planet.discovery.wmnet, add planet.wikimedia.org, remove planet.svc records, remove specific and outdated hostnames ([[phab:T257840|T257840]])
* 16:11 moritzm: uploaded jenkins 2.235.2 to thirdparty/ci for stretch/buster [[phab:T257614|T257614]]
* 15:29 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:24 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:24 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:20 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:20 moritzm: rebooting webperf* hosts for kernel update
* 14:58 addshore@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/Wikibase/repo: [[gerrit:612723]] Stop checking if WikibaseLib is loaded [[phab:T258062|T258062]] (already on mwmaint1002) (duration: 01m 08s)
* 14:51 addshore: pulled https://gerrit.wikimedia.org/r/612723 onto mwmaint1002 ahead of syncing everywhere (and CI finishing)
* 14:37 ema: A:cp: upgrade purged to 0.17 [[phab:T257573|T257573]]
* 14:30 ema: upload purged 0.17 to buster-wikimedia [[phab:T257573|T257573]]
* 14:28 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add exceptional wikitech VE/Parsoid config [[phab:T241961|T241961]] (duration: 01m 04s)
* 14:26 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Add exceptional wikitech VE/Parsoid config [[phab:T241961|T241961]] (duration: 01m 05s)
* 14:25 gehel: repooling wdqs1006 - caught up on lag
* 14:12 akosiaris: increase codfw mobileapps kubernetes traffic to 2% [[phab:T218733|T218733]]
* 14:10 akosiaris@cumin1001: conftool action : set/weight=132; selector: dc=codfw,service=mobileapps,name=scb.*
* 13:58 jforrester@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/UrlShortener/includes/UrlShortenerUtils.php: [[phab:T258056|T258056]] Add temporary fix to ensure array is passed to array_map() (duration: 01m 08s)
* 13:54 akosiaris: pool kubernetes nodes for mobileapps in codfw
* 13:53 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=mobileapps,name=kubernetes.*
* 13:53 akosiaris@cumin1001: conftool action : set/weight=264; selector: dc=codfw,service=mobileapps,name=scb.*
* 13:51 akosiaris@cumin1001: conftool action : set/weight=1; selector: dc=codfw,service=mobileapps,name=kubernetes.*
* 13:04 jforrester@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.41 (duration: 01m 05s)
* 13:03 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.41
* 11:59 addshore: deploy window closed / done :)
* 11:57 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:609987]] Commons: Define entity sources configuration (take 2) [[phab:T254315|T254315]] (duration: 01m 03s)
* 11:36 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:612668]] Wikibase test: Client local entity sources are always testwikidata [[phab:T254315|T254315]] (duration: 01m 05s)
* 11:27 addshore@deploy1001: Synchronized wmf-config: [[phab:T254315|T254315]] [[gerrit:612669]] Wikidata test: Split client db lists. PT2/2 (duration: 01m 06s)
* 11:26 addshore@deploy1001: Synchronized dblists/wikidataclient.dblist: [[phab:T254315|T254315]] [[gerrit:612669]] Wikidata test: Split client db lists. PT1/2 (duration: 01m 05s)
* 11:16 XioNoX: remove as-path prepending in esams
* 11:11 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: LABS [[gerrit:612667]] Wikibase labs: All client "local" entity sources are wikidata [[phab:T254315|T254315]] (duration: 01m 04s)
* 11:08 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:612666]] Wikibase: Split localEntitySourceName config for repo and client [[phab:T254315|T254315]] (duration: 01m 16s)
* 11:05 XioNoX: re-enable ping offload in esams
* 11:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:56 XioNoX: disable ping offload in esams
* 10:55 XioNoX: re-enable ping offload in codfw
* 10:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:50 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:45 XioNoX: disable ping offload in codfw
* 10:44 XioNoX: re-enable ping offload in eqiad
* 10:43 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:41 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:31 XioNoX: disable ping offload in eqiad
* 10:31 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 10:30 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' .
* 10:30 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 10:30 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1120 after reimage', diff saved to https://phabricator.wikimedia.org/P11916 and previous config saved to /var/cache/conftool/dbconfig/20200715-102605-marostegui.json
* 10:20 jayme: updating python3-docker-report to 0.0.5-1 on deneb
* 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1120 after reimage', diff saved to https://phabricator.wikimedia.org/P11915 and previous config saved to /var/cache/conftool/dbconfig/20200715-100855-marostegui.json
* 10:07 jayme: imported docker-report_0.0.5-1 to buster-wikimedia
* 09:48 marostegui: Deploy schema change on s8 codfw master, lag will appear on codfw [[phab:T256685|T256685]]
* 09:42 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1120 after reimage', diff saved to https://phabricator.wikimedia.org/P11914 and previous config saved to /var/cache/conftool/dbconfig/20200715-094226-marostegui.json
* 09:22 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:21 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 09:19 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 09:19 akosiaris: deploy mobileapps in kubernetes to talk HTTPS to the mw API
* 09:10 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 09:10 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 09:07 akosiaris: Correction: deploy eventgate-analytics-external in staging, eqiad, codfw for switching to using discovery records and HTTPS for talking to the API
* 09:06 akosiaris: deploy eventgate-analytics in staging, eqiad, codfw for switching to using discovery records and HTTPS for talking to the API
* 09:06 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 09:06 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P11913 and previous config saved to /var/cache/conftool/dbconfig/20200715-090545-marostegui.json
* 09:04 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 09:04 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1120 after reimage', diff saved to https://phabricator.wikimedia.org/P11912 and previous config saved to /var/cache/conftool/dbconfig/20200715-085032-marostegui.json
* 08:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:19 moritzm: piwik.wikimedia.org switched to CAS authentication
* 08:19 elukey: move piwik.wikimedia.org to CAS (idp.wikimedia.org)
* 07:29 XioNoX: delete deprecated AS3209 AMS-IX router
* 06:59 dcausse: depooling wdqs1006 (high lag)
* 06:09 marostegui: Stop replication on db1120 to avoid having 10.4 -> 10.1 replication for a long time [[phab:T254871|T254871]]
* 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1120 for reimage [[phab:T254871|T254871]]', diff saved to https://phabricator.wikimedia.org/P11911 and previous config saved to /var/cache/conftool/dbconfig/20200715-060649-marostegui.json
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1103 to x1 master [[phab:T254871|T254871]]', diff saved to https://phabricator.wikimedia.org/P11910 and previous config saved to /var/cache/conftool/dbconfig/20200715-060145-marostegui.json
* 06:00 marostegui: Starting x1 failover from db1120 to db1103 - [[phab:T254871|T254871]]
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 ', diff saved to https://phabricator.wikimedia.org/P11909 and previous config saved to /var/cache/conftool/dbconfig/20200715-052939-marostegui.json
* 04:46 marostegui: Start x1 pre failover steps [[phab:T254871|T254871]]
* 04:44 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1103 weight to 0 before the switchover [[phab:T254871|T254871]]', diff saved to https://phabricator.wikimedia.org/P11908 and previous config saved to /var/cache/conftool/dbconfig/20200715-044432-marostegui.json
* 04:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1135', diff saved to https://phabricator.wikimedia.org/P11907 and previous config saved to /var/cache/conftool/dbconfig/20200715-044332-marostegui.json
* 01:45 eileen: tools revision changed from {{Gerrit|a9e7dc1559}} to {{Gerrit|7b6018a16e}}
* 00:26 ryankemper@deploy1001: Finished deploy [wdqs/wdqs@8f6f660]: 0.3.41 (duration: 15m 10s)
* 00:11 ryankemper@deploy1001: Started deploy [wdqs/wdqs@8f6f660]: 0.3.41
 
== 2020-07-14 ==
* 19:52 jforrester@deploy1001: Synchronized php-1.35.0-wmf.41/vendor/wikimedia/parsoid/: [[phab:T252448|T252448]] [[phab:T255190|T255190]] Bump Parsoid to v0.12.0-a23 (duration: 01m 06s)
* 18:13 ryankemper: all long-running elasticsearch reindex jobs are complete
* 18:09 jforrester@deploy1001: Synchronized dblists/: [[phab:T32405|T32405]] [[phab:T254287|T254287]] Remove the mobilemainpagelegacy dblist (duration: 01m 04s)
* 18:07 jforrester@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: [[phab:T32405|T32405]] [[phab:T254287|T254287]] Stop loading the mobilemainpagelegacy dblist (duration: 01m 05s)
* 18:05 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T32405|T32405]] [[phab:T254287|T254287]] Stop varying wgMFSpecialCaseMainPage (duration: 01m 05s)
* 15:56 elukey: upgrade spark2 on stat100x to 2.4.4-bin-hadoop2.6-3
* 15:40 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 15:25 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:23 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:17 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:13 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:12 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:10 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:02 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:57 jforrester@deploy1001: Synchronized php-1.35.0-wmf.41/skins/Vector/includes/SkinVector.php: [[phab:T257914|T257914]] Restore div wrapper around print footer (duration: 01m 03s)
* 14:53 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 14:50 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:48 jforrester@deploy1001: Synchronized php-1.35.0-wmf.41/extensions/WikibaseMediaInfo/src/WikibaseMediaInfoHooks.php: Fix case of directory name (duration: 01m 05s)
* 14:48 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:48 moritzm: rebooting apt1001 for kernel update
* 14:42 jynus: stopping db1117:3322 (m2) replication temp. for otrs db cloning [[phab:T257928|T257928]]
* 14:40 hnowlan@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' .
* 14:33 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:31 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:26 oblivian@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 14:23 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:21 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:20 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:18 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:18 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 14:18 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:14 oblivian@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 14:13 andrewbogott: upgrading wikitech-static to mw 1.34.2
* 14:11 oblivian@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 14:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:42 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.change-distro (exit_code=0)
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1112', diff saved to https://phabricator.wikimedia.org/P11900 and previous config saved to /var/cache/conftool/dbconfig/20200714-132823-marostegui.json
* 13:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1130', diff saved to https://phabricator.wikimedia.org/P11899 and previous config saved to /var/cache/conftool/dbconfig/20200714-132742-marostegui.json
* 13:27 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns1001.wikimedia.org
* 13:24 jbond42: reboot dns1001
* 13:23 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:23 jbond@cumin1001: START - Cookbook sre.hosts.downtime
* 13:22 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns1001.wikimedia.org
* 13:22 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns1002.wikimedia.org
* 13:18 jbond42: reboot dns1002
* 13:18 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:18 jbond@cumin1001: START - Cookbook sre.hosts.downtime
* 13:18 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns1002.wikimedia.org
* 13:16 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns2002.wikimedia.org
* 13:13 jbond42: reboot dns2002
* 13:13 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:13 jbond@cumin1001: START - Cookbook sre.hosts.downtime
* 13:13 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns2002.wikimedia.org
* 13:13 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns2001.wikimedia.org
* 13:10 jbond42: reboot dns2001
* 13:10 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:10 jbond@cumin1001: START - Cookbook sre.hosts.downtime
* 13:10 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns2001.wikimedia.org
* 13:09 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 13:06 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 13:01 jbond42: rebooting dns3002
* 13:01 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:01 jbond@cumin1001: START - Cookbook sre.hosts.downtime
* 12:58 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 12:57 oblivian@deploy1001: Synchronized wmf-config/InitialiseSettings.php: revert forcehttps after fixing [[phab:T257887|T257887]] (duration: 01m 02s)
* 12:31 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.change-distro (exit_code=0)
* 12:24 jbond42: route ns0.wikimedia.org to codfw for reboot
* 12:20 moritzm: installing xen security updates (client-side tools/libs)
* 12:19 jbond42: re-enable puppet fleet
* 12:07 jbond42: disable puppet fleet wide to reboot puppetdb's
* 12:07 jbond42: disable puppet to reboot puppetdb's
* 12:01 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.41
* 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 for query plan checks [[phab:T238966|T238966]] ', diff saved to https://phabricator.wikimedia.org/P11898 and previous config saved to /var/cache/conftool/dbconfig/20200714-113612-marostegui.json
* 11:35 _joe_: restart pybal on lvs2009 [[phab:T257887|T257887]]
* 11:31 _joe_: restart pybal on lvs2010 [[phab:T257887|T257887]]
* 11:25 _joe_: restart pybal on lvs1015 [[phab:T257887|T257887]]
* 11:22 _joe_: restart pybal on lvs1016
* 11:15 jayme@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 11:03 jayme@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 10:59 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 10:56 volans@cumin1001: conftool action : set/pooled=inactive; selector: name=wtp2005.codfw.wmnet
* 10:52 volans: powerdown wtp2005, hardware issue - [[phab:T257903|T257903]]
* 10:47 volans@cumin1001: conftool action : set/pooled=no; selector: name=wtp2005.codfw.wmnet
* 10:45 jiji@cumin1001: conftool action : set/pooled=no; selector: name=wtp2005.codfw.wmnet,service=parsoid-php
* 10:45 jiji@cumin1001: conftool action : set/pooled=no; selector: name=wtp2005.codfw.wmnet,service=parsoid
* 10:45 effie: depool wtp2005
* 10:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 10:39 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 10:39 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 10:32 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 10:18 oblivian@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 10:14 James_F: Running AbuseFilter's updateVarDumps for group1 [[phab:T246539|T246539]]
* 10:13 oblivian@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 10:10 oblivian@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 10:10 oblivian@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
* 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P11897 and previous config saved to /var/cache/conftool/dbconfig/20200714-094449-marostegui.json
* 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075', diff saved to https://phabricator.wikimedia.org/P11896 and previous config saved to /var/cache/conftool/dbconfig/20200714-094354-marostegui.json
* 09:39 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor: Add REL1_35 as a candidate release (duration: 01m 06s)
* 09:05 jforrester@deploy1001: Finished scap: Re-re-start full scap to push out wmf.41 and switch testwikis to it [[phab:T256669|T256669]] (duration: 51m 41s)
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 for PDU upgrade [[phab:T257871|T257871]]', diff saved to https://phabricator.wikimedia.org/P11895 and previous config saved to /var/cache/conftool/dbconfig/20200714-084033-marostegui.json
* 08:30 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:30 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:30 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:30 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:30 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:30 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 08:13 jforrester@deploy1001: Started scap: Re-re-start full scap to push out wmf.41 and switch testwikis to it [[phab:T256669|T256669]]
* 08:05 akosiaris: restart pybal on lvs2009
* 08:03 _joe_: restart pybal on lvs1016
* 08:02 akosiaris: restart pybal on lvs2007
* 08:01 akosiaris@cumin1001: conftool action : set/pooled=inactive; selector: name=restbase2009.codfw.wmnet
* 08:00 _joe_: restart pybal on lvs1015
* 08:00 akosiaris: restart pybal on lvs2010 after merging https://gerrit.wikimedia.org/r/612487
* 07:52 jforrester@deploy1001: sync aborted: Re-start full scap to push out wmf.41 and switch testwikis to it [[phab:T256669|T256669]] (duration: 02m 14s)
* 07:50 jforrester@deploy1001: Started scap: Re-start full scap to push out wmf.41 and switch testwikis to it [[phab:T256669|T256669]]
* 07:48 oblivian@deploy1001: Synchronized wmf-config/InitialiseSettings.php: revert forcehttps in an attempt to fix [[phab:T257887|T257887]] (duration: 01m 06s)
* 07:32 oblivian@deploy1001: sync-file aborted: revert forcehttps in an attempt to fix [[phab:T257887|T257887]] (duration: 00m 20s)
* 07:31 oblivian@deploy1001: Scap failed!: 7/9 canaries failed their endpoint checks(http://en.wikipedia.org)
* 07:27 moritzm: installing libtasn1-6 security updates
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075', diff saved to https://phabricator.wikimedia.org/P11894 and previous config saved to /var/cache/conftool/dbconfig/20200714-071233-marostegui.json
* 07:04 marostegui: Drop gerrit, gerritro, gerrittest users from m2 databases - [[phab:T255715|T255715]]
* 06:58 marostegui: Stop mysql on db1131 for HW maintenance
* 06:56 oblivian@deploy2001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 06:54 jforrester@deploy1001: scap failed: RuntimeError Scap failed!: 9/9 canaries failed their endpoint checks(http://en.wikipedia.org) (duration: 24m 59s)
* 06:54 jforrester@deploy1001: Scap failed!: 9/9 canaries failed their endpoint checks(http://en.wikipedia.org)
* 06:53 oblivian@deploy2001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 06:53 marostegui: Deploy MCR schema change on s5 primary master [[phab:T238966|T238966]]
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1078', diff saved to https://phabricator.wikimedia.org/P11893 and previous config saved to /var/cache/conftool/dbconfig/20200714-065229-marostegui.json
* 06:29 jforrester@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.41
* 05:15 marostegui@cumin1001: dbctl commit (dc=all): 'Decrease a bit db1088 load', diff saved to https://phabricator.wikimedia.org/P11891 and previous config saved to /var/cache/conftool/dbconfig/20200714-051551-marostegui.json
* 05:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131 for HW maintenance', diff saved to https://phabricator.wikimedia.org/P11890 and previous config saved to /var/cache/conftool/dbconfig/20200714-050931-marostegui.json
* 05:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 from api', diff saved to https://phabricator.wikimedia.org/P11889 and previous config saved to /var/cache/conftool/dbconfig/20200714-050912-marostegui.json
* 05:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1093 to s6 master and remove read-only from s6 [[phab:T257253|T257253]]', diff saved to https://phabricator.wikimedia.org/P11888 and previous config saved to /var/cache/conftool/dbconfig/20200714-050157-marostegui.json
* 05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Set s6 as read-only for maintenance [[phab:T257253|T257253]]', diff saved to https://phabricator.wikimedia.org/P11887 and previous config saved to /var/cache/conftool/dbconfig/20200714-050039-marostegui.json
* 05:00 marostegui: Starting s6 failover from db1131 to db1093 - [[phab:T257253|T257253]]
* 04:59 James_F: 1.35.0-wmf.41 branched at {{Gerrit|7d04152db4f8ea9a459511bed8117101d9bb4602}}
* 04:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078', diff saved to https://phabricator.wikimedia.org/P11886 and previous config saved to /var/cache/conftool/dbconfig/20200714-043907-marostegui.json
* 04:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 in preparation for failover', diff saved to https://phabricator.wikimedia.org/P11885 and previous config saved to /var/cache/conftool/dbconfig/20200714-041548-marostegui.json
* 04:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1130', diff saved to https://phabricator.wikimedia.org/P11884 and previous config saved to /var/cache/conftool/dbconfig/20200714-041440-marostegui.json
* 01:23 ryankemper: Started long-running Elasticsearch reindex of `eqiad`, `codfw`, and `cloudelastic`. tmux session `reindex` under `ryankemper` on `mwmaint1002`
* 01:20 cdanis: ❌cdanis@lvs1015.eqiad.wmnet ~ 🕤🍺 sudo systemctl restart pybal.service
* 01:15 cdanis: ✔️ cdanis@lvs1016.eqiad.wmnet ~ 🕘🍺 sudo systemctl restart pybal.service
* 01:14 cdanis: ✔️ cdanis@lvs2009.codfw.wmnet ~ 🕘🍺 sudo systemctl restart pybal.service
* 01:01 cdanis: ✔️ cdanis@lvs2010.codfw.wmnet ~ 🕘🍺 sudo systemctl restart pybal.service
 
== 2020-07-13 ==
* 23:06 mutante: releases* delete /usr/local/sbin/sync-* scripts created by rsync::quickdatacopy and let puppet recreate the ones still needed
* 22:27 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I80ca62643f5c}} (duration: 00m 58s)
* 20:12 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@1edde21]: airflow: ship_to_es: Implement multi-index understanding (duration: 00m 29s)
* 20:12 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@1edde21]: airflow: ship_to_es: Implement multi-index understanding
* 20:03 mutante: rsynced reprepro data from releases1001 to releases1002, releases2002
* 19:50 eileen: disable target smart job process-control config revision is {{Gerrit|b00e7680ca}}
* 19:48 milimetric@deploy1001: Finished deploy [analytics/refinery@de0a1f1] (thin): Regular analytics weekly train THIN [analytics/refinery@de0a1f1] (duration: 00m 07s)
* 19:47 milimetric@deploy1001: Started deploy [analytics/refinery@de0a1f1] (thin): Regular analytics weekly train THIN [analytics/refinery@de0a1f1]
* 19:47 milimetric@deploy1001: Finished deploy [analytics/refinery@de0a1f1]: Regular analytics weekly train [analytics/refinery@de0a1f1] (duration: 06m 41s)
* 19:41 milimetric@deploy1001: Started deploy [analytics/refinery@de0a1f1]: Regular analytics weekly train [analytics/refinery@de0a1f1]
* 19:39 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 19:33 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I1a12124f1811e9a}} (duration: 00m 57s)
* 18:53 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T248343|T248343]] Don't use the 'zeroconf' configuration for VisualEditor (duration: 00m 55s)
* 18:43 dcausse: BACON done
* 18:40 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T257745|T257745]]: Add rollbacker to elwiki (duration: 00m 56s)
* 18:26 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T250810|T250810]]: Set proper language code for some wikis (duration: 00m 56s)
* 18:18 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T256928|T256928]]: Scale largest shards to be closer to 30GB (duration: 00m 56s)
* 16:17 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:17 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 15:56 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:610265{{!}}Load WikibaseClient using extension registration in beta (T257435)]] (duration: 00m 55s)
* 15:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P11882 and previous config saved to /var/cache/conftool/dbconfig/20200713-155240-marostegui.json
* 15:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P11881 and previous config saved to /var/cache/conftool/dbconfig/20200713-154847-marostegui.json
* 15:39 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' .
* 15:35 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' .
* 15:30 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 14:50 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting DiscussionToolsEnableVisual, default value (duration: 00m 57s)
* 14:17 moritzm: removing lilypond from production [[phab:T257066|T257066]]
* 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3315', diff saved to https://phabricator.wikimedia.org/P11880 and previous config saved to /var/cache/conftool/dbconfig/20200713-133604-marostegui.json
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1082', diff saved to https://phabricator.wikimedia.org/P11879 and previous config saved to /var/cache/conftool/dbconfig/20200713-133535-marostegui.json
* 13:05 kormat@cumin1001: dbctl commit (dc=all): 'Fully repool es1022, and set es1020 to zero weight [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11878 and previous config saved to /var/cache/conftool/dbconfig/20200713-130532-kormat.json
* 12:08 kormat@cumin1001: dbctl commit (dc=all): 'Start repooling es1022 after reimaging [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11873 and previous config saved to /var/cache/conftool/dbconfig/20200713-120818-kormat.json
* 11:49 Urbanecm: Password reset for User:Alert5 ([[phab:T257806|T257806]])
* 11:44 akosiaris: repool ganeti1007 [[phab:T244530|T244530]]. Start emptying ganeti1008
* 11:08 Urbanecm: EU B&C done
* 11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|896c042296b4e1f5d88f786981537655e5d9fea9}}: Enable SandboxLink extension in trwiki ([[phab:T256782|T256782]]) (duration: 00m 56s)
* 10:44 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:612175{{!}} Bumping portals to master (612175)]] (duration: 00m 56s)
* 10:43 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:612175{{!}} Bumping portals to master (612175)]] (duration: 00m 56s)
* 09:42 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:39 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 08:58 ema: cp: rolling ats-backend-restart to apply SyslogIdentifier changes -> https://gerrit.wikimedia.org/r/c/operations/puppet/+/611311
* 08:57 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T248343|T248343]] Explicitly set visualeditor-enable to 0 when non-default (duration: 00m 57s)
* 08:44 kormat@cumin1001: dbctl commit (dc=all): 'Depool es1022 for reimaging [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11871 and previous config saved to /var/cache/conftool/dbconfig/20200713-084449-kormat.json
* 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1093', diff saved to https://phabricator.wikimedia.org/P11870 and previous config saved to /var/cache/conftool/dbconfig/20200713-083902-marostegui.json
* 08:34 kormat@cumin1001: dbctl commit (dc=all): 'Add weight to es1020, reduce weight on es1022 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11869 and previous config saved to /var/cache/conftool/dbconfig/20200713-083414-kormat.json
* 08:20 kormat: reimaging es1022 [[phab:T257284|T257284]]
* 06:54 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 06:53 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:52 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 06:52 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:51 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 06:50 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:16 marostegui: Reverse gerrit password on m2 master - [[phab:T255715|T255715]]
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P11868 and previous config saved to /var/cache/conftool/dbconfig/20200713-060410-marostegui.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P11867 and previous config saved to /var/cache/conftool/dbconfig/20200713-055422-marostegui.json
* 05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 for upgrade', diff saved to https://phabricator.wikimedia.org/P11866 and previous config saved to /var/cache/conftool/dbconfig/20200713-054840-marostegui.json
* 05:34 marostegui: Deploy schema change on s3 codfw master, lag will appear on codfw [[phab:T253276|T253276]]
* 05:30 marostegui: Stop replication on db1082 for schema change and triggers removal [[phab:T238966|T238966]]
* 05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082', diff saved to https://phabricator.wikimedia.org/P11865 and previous config saved to /var/cache/conftool/dbconfig/20200713-052928-marostegui.json
* 05:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 for innodb compression', diff saved to https://phabricator.wikimedia.org/P11864 and previous config saved to /var/cache/conftool/dbconfig/20200713-051428-marostegui.json
 
== 2020-07-11 ==
* 19:16 qchris: Restarting Gerrit on gerrit1001 to switch to new gerrit.war and zuul plugin
* 19:16 qchris@deploy1001: Finished deploy [gerrit/gerrit@a71a0df]: Gerrit to v3.2.2-138-g230805407f and zuul plugin to master-12-ge51d7e8 on gerrit1001 (duration: 00m 07s)
* 19:15 qchris@deploy1001: Started deploy [gerrit/gerrit@a71a0df]: Gerrit to v3.2.2-138-g230805407f and zuul plugin to master-12-ge51d7e8 on gerrit1001
* 19:08 qchris: Restarting Gerrit on gerrit2001 to switch to new gerrit.war and zuul plugin
* 18:55 qchris@deploy1001: Finished deploy [gerrit/gerrit@a71a0df]: Gerrit to v3.2.2-138-g230805407f and zuul plugin to master-12-ge51d7e8 on gerrit2001 (duration: 00m 10s)
* 18:55 qchris@deploy1001: Started deploy [gerrit/gerrit@a71a0df]: Gerrit to v3.2.2-138-g230805407f and zuul plugin to master-12-ge51d7e8 on gerrit2001
 
== 2020-07-10 ==
* 21:52 ryankemper: Started long-running reindex of Elasticsearch indices in `eqiad`, `codfw`, and `dewiki` on `mwmaint1002` under tmux session `reindex` for user `ryankemper`
* 20:26 jgleeson: updated fundraising-tools from {{Gerrit|08ba1f6177}} to {{Gerrit|f8e424fe32}}
* 19:02 mutante: removing firewall hole for gerrit -> mysql servers on dbproxy servers for misc db's
* 18:44 mutante: kubernetes1004 - started nagios-nrpe-server
* 17:57 ebernhardson: change loginwiki password for Cindy-the-browser-test-bot, no email account was associated to allow for normal reset.
* 17:05 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I63fcea7737}} (duration: 00m 57s)
* 16:16 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.change-distro (exit_code=99)
* 15:57 milimetric@deploy1001: Finished deploy [analytics/refinery@4d40145] (thin): Update EventLogging refine whitelist (THIN) (duration: 00m 08s)
* 15:56 milimetric@deploy1001: Started deploy [analytics/refinery@4d40145] (thin): Update EventLogging refine whitelist (THIN)
* 15:44 milimetric@deploy1001: Finished deploy [analytics/refinery@4d40145]: Update EventLogging refine whitelist (duration: 15m 17s)
* 15:30 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 15:29 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 15:29 milimetric@deploy1001: Started deploy [analytics/refinery@4d40145]: Update EventLogging refine whitelist
* 15:19 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 15:03 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.change-distro (exit_code=0)
* 14:39 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 14:37 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 14:30 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 13:41 godog: bounce ms-be1037, not quite responsive
* 12:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1110', diff saved to https://phabricator.wikimedia.org/P11860 and previous config saved to /var/cache/conftool/dbconfig/20200710-123604-marostegui.json
* 12:20 reedy@deploy1001: Synchronized php-1.35.0-wmf.40/extensions/Score/: Make Score errors use a specific css class (duration: 00m 58s)
* 10:21 kormat@cumin1001: dbctl commit (dc=all): 'Finish repooling es1021, and remove weight from es1010 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11859 and previous config saved to /var/cache/conftool/dbconfig/20200710-102147-kormat.json
* 09:49 kormat@cumin1001: dbctl commit (dc=all): 'Start repooling es1021 after reimage @ 50% [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11858 and previous config saved to /var/cache/conftool/dbconfig/20200710-094954-kormat.json
* 09:04 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:02 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P11857 and previous config saved to /var/cache/conftool/dbconfig/20200710-085157-marostegui.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106', diff saved to https://phabricator.wikimedia.org/P11856 and previous config saved to /var/cache/conftool/dbconfig/20200710-085112-marostegui.json
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1107', diff saved to https://phabricator.wikimedia.org/P11855 and previous config saved to /var/cache/conftool/dbconfig/20200710-085040-marostegui.json
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106', diff saved to https://phabricator.wikimedia.org/P11853 and previous config saved to /var/cache/conftool/dbconfig/20200710-082346-marostegui.json
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1107', diff saved to https://phabricator.wikimedia.org/P11852 and previous config saved to /var/cache/conftool/dbconfig/20200710-082329-marostegui.json
* 08:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:22 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:22 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1107', diff saved to https://phabricator.wikimedia.org/P11851 and previous config saved to /var/cache/conftool/dbconfig/20200710-080912-marostegui.json
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1119', diff saved to https://phabricator.wikimedia.org/P11850 and previous config saved to /var/cache/conftool/dbconfig/20200710-080854-marostegui.json
* 08:09 kormat@cumin1001: dbctl commit (dc=all): 'Depool es1021 for reimaging [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11849 and previous config saved to /var/cache/conftool/dbconfig/20200710-080843-kormat.json
* 08:01 kormat@cumin1001: dbctl commit (dc=all): 'Reset es2020/es2021 to correct weights after master switch [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11848 and previous config saved to /var/cache/conftool/dbconfig/20200710-080133-kormat.json
* 08:00 moritzm: installing cron security updates on jessie (stretch/buster already fixed)
* 07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119', diff saved to https://phabricator.wikimedia.org/P11847 and previous config saved to /var/cache/conftool/dbconfig/20200710-075608-marostegui.json
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1107', diff saved to https://phabricator.wikimedia.org/P11846 and previous config saved to /var/cache/conftool/dbconfig/20200710-075500-marostegui.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1079', diff saved to https://phabricator.wikimedia.org/P11845 and previous config saved to /var/cache/conftool/dbconfig/20200710-075431-marostegui.json
* 07:44 kormat: reimaging es1021 to buster [[phab:T257284|T257284]]
* 07:43 kormat@cumin1001: dbctl commit (dc=all): 'Add weight to es1020, reduce weight on es1021 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11844 and previous config saved to /var/cache/conftool/dbconfig/20200710-074326-kormat.json
* 07:41 jbond@deploy1001: Finished deploy [librenms/librenms@0a88d64]: redeploy to [try and] fix php errors (duration: 00m 05s)
* 07:41 jbond@deploy1001: Started deploy [librenms/librenms@0a88d64]: redeploy to [try and] fix php errors
* 07:32 moritzm: installing e2fsprogs security updates on jessie (stretch/buster already fixed)
* 07:15 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' .
* 07:14 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' .
* 07:13 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3311', diff saved to https://phabricator.wikimedia.org/P11843 and previous config saved to /var/cache/conftool/dbconfig/20200710-065751-marostegui.json
* 06:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311', diff saved to https://phabricator.wikimedia.org/P11841 and previous config saved to /var/cache/conftool/dbconfig/20200710-063818-marostegui.json
* 06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1134', diff saved to https://phabricator.wikimedia.org/P11840 and previous config saved to /var/cache/conftool/dbconfig/20200710-063746-marostegui.json
* 06:35 marostegui: Compress InnoDB on db1124:3311 (Sanitarium - lag will appear on s1 on labsdb) - [[phab:T254462|T254462]]
* 04:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134', diff saved to https://phabricator.wikimedia.org/P11839 and previous config saved to /var/cache/conftool/dbconfig/20200710-044428-marostegui.json
* 01:44 mutante: LDAP - adding coka to wmde and nda ([[phab:T257038|T257038]])
* 00:47 Reedy: truncated labswiki.interwiki table (outdated and unnecessary)
 
== 2020-07-09 ==
* 23:10 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|I2c2dea832}} (duration: 00m 56s)
* 21:52 tgr: all sessions have been invalidated due to [[phab:T256395|T256395]]
* 20:58 eileen: https://phabricator.wikimedia.org/T253152
* 19:16 herron: upgraded eqiad elk7 cluster from 7.4.2 to 7.8.0 [[phab:T234854|T234854]]
* 19:05 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.40  refs [[phab:T256668|T256668]]
* 18:51 elukey: update spark2 to 2.4.4-bin-hadoop2.6-3 for buster-wikimedia
* 18:44 mutante: stat1004, stat1006, stat1007 - upgrading git-review package from 1.25 to 1.27 so that it keeps working with new Gerrit 3.2 ([[phab:T257609|T257609]])
* 18:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9f2557f848e99facaa62ca6b3a948cc3e32c32a3}}: Updating config for Readers Web affinity quicksurvey ([[phab:T246977|T246977]]) (duration: 01m 06s)
* 17:42 chaomodus: codfw frack management dns automation deployment complete [[phab:T233183|T233183]]
* 17:37 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 17:36 James_F: Synchronized wmf-config/CommonSettings.php: ExtensionDistribution: Drop REL1_33, EOL'ed [[phab:T256087|T256087]]
* 17:35 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 17:35 moritzm: rebooting moscovium for kernel update
* 17:33 chaomodus: deploying frack codfw management dns automation
* 17:32 crusnov@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 17:28 crusnov@cumin2001: START - Cookbook sre.dns.netbox
* 17:27 moritzm: rebooting planet1002 (planet.wikimedia.org) for kernel update
* 17:27 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 17:10 krinkle@deploy1001: Synchronized wmf-config/: {{Gerrit|Ia2f5eddbf2aad2}} (duration: 01m 04s)
* 17:09 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|Ia2f5eddbf2aad2}} (duration: 01m 05s)
* 15:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:29 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:29 papaul: replacing msw-b1,b2,b3 and b4
* 14:03 moritzm: installing libtirpc security updates
* 13:45 moritzm: installing gnutls28 security updates
* 13:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1089', diff saved to https://phabricator.wikimedia.org/P11831 and previous config saved to /var/cache/conftool/dbconfig/20200709-133134-marostegui.json
* 13:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:29 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:29 moritzm: rebooting puppetboard1001 (puppetboard.wikimedia.org) for kernel update
* 13:15 moritzm: installing ffmpeg security updates
* 13:11 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089', diff saved to https://phabricator.wikimedia.org/P11830 and previous config saved to /var/cache/conftool/dbconfig/20200709-131039-marostegui.json
* 13:08 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:07 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:05 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:58 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:57 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' .
* 12:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:56 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' .
* 12:56 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 12:54 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:54 moritzm: rebooting install* servers for kernel security update
* 12:43 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:40 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:38 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:38 moritzm: rebooting urldownloader1001/2001 for kernel update (failed over, these are now the inactive ones)
* 12:23 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 12:22 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:22 moritzm: rebooting dbmonitor1001 / tendril.wikimedia.org for kernel update
* 12:11 XioNoX: enable asw2-b-eqiad:ae3 (to cloudsw1-c8) - [[phab:T251632|T251632]]
* 11:56 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:54 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:50 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:50 moritzm: rebooting debmonitor1001 for kernel update
* 11:42 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.40/extensions/Translate/tag/SpecialPageTranslation.php: {{Gerrit|6541d3ff51f52fe8a1bdbfa86022f8d97d6c7680}}: DeprecatablePropertyArray: Use MW_VERSION instead of array_key_exists ([[phab:T257531|T257531]]) (duration: 01m 05s)
* 11:28 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3a7c1c33e58637437f819edf039008a00dc5be27}}: Rename namespace on kn.wikipedia.org ([[phab:T255337|T255337]]) (duration: 01m 04s)
* 11:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0a3c1f94a702b527842ed4f34d8bf41b26235e64}}: Add *.oireachtas.ie to the wgCopyUploadsDomains whitelist for commonswiki ([[phab:T256543|T256543]]) (duration: 01m 04s)
* 11:19 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:17 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:10 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:10 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 11:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e6f442c6900524482806aeb1b5162e65bf7c97ac}}: Enable Quicksurveys for Desktop Improvements Project ([[phab:T246977|T246977]]) (duration: 01m 06s)
* 11:01 vgutierrez: restart ats-tls on cp1085
* 10:55 _joe_: restarting php7.2-fpm on mw1282, workers failing with sigill
* 10:54 _joe_: depool mw1282
* 10:54 mvolz@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 10:34 mvolz@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 10:23 _joe_: rolling restart the remaining restbases in eqiad, and all of codfw
* 10:22 mvolz@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 10:09 _joe_: restarting restbase on rb1020-22
* 09:53 _joe_: restarting restbase on restbase1024,1023
* 09:36 _joe_: restarting restbase on rb1026,1027 to switch to proton on k8s
* 09:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 09:28 _joe_: restarting restbase on restbase1025 to pick up the switch to k8s of proton
* 09:27 godog: bounce thanos-compact on thanos-fe2001
* 09:07 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.change-distro (exit_code=0)
* 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P11828 and previous config saved to /var/cache/conftool/dbconfig/20200709-085228-marostegui.json
* 08:44 marostegui: Stop haproxy on dbproxy1017 before upgrading to buster - [[phab:T255408|T255408]]
* 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1136', diff saved to https://phabricator.wikimedia.org/P11827 and previous config saved to /var/cache/conftool/dbconfig/20200709-082355-marostegui.json
* 08:23 moritzm: imported osm2pgsql 0.96.0+ds-1~bpo9+1 to "main" component [[phab:T256877|T256877]]
* 08:22 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 08:20 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 08:13 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 08:11 XioNoX: disable igmp snooping on msw1-codfw
* 07:59 marostegui: Stop db1117:3322 to clone db1084, this will trigger haproxy alerts - [[phab:T257540|T257540]]
* 07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1136', diff saved to https://phabricator.wikimedia.org/P11825 and previous config saved to /var/cache/conftool/dbconfig/20200709-075749-marostegui.json
* 07:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 06:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3317', diff saved to https://phabricator.wikimedia.org/P11824 and previous config saved to /var/cache/conftool/dbconfig/20200709-053905-marostegui.json
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1084 from dbctl', diff saved to https://phabricator.wikimedia.org/P11823 and previous config saved to /var/cache/conftool/dbconfig/20200709-053206-marostegui.json
* 05:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084', diff saved to https://phabricator.wikimedia.org/P11822 and previous config saved to /var/cache/conftool/dbconfig/20200709-051826-marostegui.json
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3317', diff saved to https://phabricator.wikimedia.org/P11821 and previous config saved to /var/cache/conftool/dbconfig/20200709-051355-marostegui.json
* 05:11 marostegui: Remove revision triggers from db2093:3315 [[phab:T238966|T238966]]
* 05:10 marostegui: Deploy schema change on s5 codfw, lag will be generated - [[phab:T238966|T238966]]
* 01:43 tzatziki: reset email for GseSro
* 00:58 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕘🍺 sudo cumin A:cp 'enable-puppet "cdanis deploying {{Gerrit|I6c1b646e}} [[phab:T256395|T256395]]"'
* 00:49 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕘🍺 sudo cumin A:cp 'disable-puppet "cdanis deploying {{Gerrit|I6c1b646e}} [[phab:T256395|T256395]]"'
 
== 2020-07-08 ==
* 21:56 mutante: deleting files from releases2001 that are not existing on releases1001 to make them mirrors. rsync with --delete and the command from quickdatacopy class ([[phab:T247652|T247652]])
* 21:55 mutante: rsyncing releases files from releases1001 to releases2002 and releases1002. deleting files from releases2002 not existing on releases1002 to make them mirrors ([[phab:T247652|T247652]])
* 20:59 cstone: civicrm revision changed from {{Gerrit|d73ee2e73f}} to {{Gerrit|8b09c87ce2}}
* 20:27 Amir1: end of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T256012|T256012]])
* 20:08 Amir1_: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T256012|T256012]])
* 19:18 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.40  refs [[phab:T256668|T256668]] (duration: 01m 04s)
* 19:17 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.40  refs [[phab:T256668|T256668]]
* 18:58 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|091442cf035a6d76f1211291afbb3193c513595d}}: Add *.nga.gov to the wgCopyUploadsDomains allowlist of Wikimedia Commons ([[phab:T256518|T256518]]) (duration: 01m 04s)
* 18:55 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2e5943ddb30e08607a9ffb6ed05a042e8367e2e1}}: Add scan-bugs.org to $wgCopyUploadsDomains ([[phab:T256569|T256569]]) (duration: 01m 04s)
* 18:46 urbanecm@deploy1001: Synchronized static/images/project-logos/: {{Gerrit|f42cdf2}}: Change bnwiki logo ([[phab:T255328|T255328]]) (duration: 01m 04s)
* 18:27 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Cleanup: remove temporary wmgDisableHTCP variable gerrit:607596 [[phab:T250781|T250781]] IS.php (duration: 01m 01s)
* 18:20 ppchelko@deploy1001: Synchronized wmf-config/CommonSettings.php: Disable HTCP purging everywhere gerrit:607593 [[phab:T250781|T250781]] CS.php (duration: 01m 03s)
* 18:18 ppchelko@deploy1001: Synchronized wmf-config/wikitech.php: Disable HTCP purging everywhere gerrit:607593 [[phab:T250781|T250781]] wikitech.php (duration: 01m 04s)
* 18:17 ppchelko@deploy1001: Synchronized wmf-config/reverse-proxy.php: Disable HTCP purging everywhere gerrit:607593 [[phab:T250781|T250781]] reverse-proxy.php (duration: 01m 04s)
* 18:11 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add wgEventServiceDefault to refactor EventBus event stream config gerrit:610160 [[phab:T229863|T229863]], IS.php (duration: 01m 03s)
* 18:04 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add wgEventServiceDefault to refactor EventBus event stream config gerrit:610160 [[phab:T229863|T229863]] (duration: 01m 04s)
* 17:34 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.change-distro (exit_code=99)
* 17:16 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 17:16 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 17:08 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 16:57 _joe_: restarting restbase across the fleet to transition to using envoy
* 16:40 _joe_: restarting restbase on restbase2010 to route calls to mediawiki, parsoid via envoy
* 16:40 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:37 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:32 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:27 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:22 jgleeson: updated fundraising-tools from {{Gerrit|a244e0e85f}} --> {{Gerrit|f5b8528214}}
* 15:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:12 moritzm: rebooting people1002 (people.wikimedia.org) for kernel security update
* 15:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:48 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:46 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:46 moritzm: installing isc-dhcp security updates
* 14:31 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.change-distro (exit_code=99)
* 14:31 moritzm: installing gdk-pixbuf security updates
* 14:26 _joe_: repooling mw1346
* 14:24 _joe_: php7adm /opcache-free on mw1346
* 14:15 jbond42: switch icinga authentication to CAS SSO
* 14:12 _joe_: depooling mw1346
* 14:12 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 14:11 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 14:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:04 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 14:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:04 moritzm: rebooting idp-test1001 for kernel update
* 13:59 elukey@cumin1001: END (ERROR) - Cookbook sre.hadoop.stop-cluster (exit_code=97)
* 13:59 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 13:39 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.change-distro (exit_code=0)
* 13:31 jynus: replacing ssh key for ci_docroot at deploy1001
* 13:31 moritzm: imported git 2.20.1-2+deb10u3~wmf1 for stretch-wikimedia component/git [[phab:T257308|T257308]]
* 13:10 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 13:07 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 13:00 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 12:41 marostegui: Deploy schema change on s7 codfw, lag is expected
* 12:17 xionox-tmp: rollout less frequent option-refresh-rate - [[phab:T240658|T240658]]
* 12:01 xionox-tmp: renumber eqiad NTT link - [[phab:T254877|T254877]]
* 11:42 awight: EU BACON complete
* 11:41 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:610234{{!}}Undeploy graphoid for phase 1 wikis (T257402)]] (duration: 01m 03s)
* 11:31 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:610268{{!}}Add nature.com to commonswiki wgCopyUploadDomains (T254342)]] (duration: 01m 03s)
* 11:29 moritzm: installing freetype security updates
* 11:26 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:609991{{!}}[hiwikibooks] Translate sitename for hi.wikibooks (T256587)]] (duration: 01m 03s)
* 11:19 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:609990{{!}}[arwiki] Grant 'patrolmarks' to all (T257106)]] (duration: 01m 04s)
* 11:18 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' .
* 11:18 moritzm: installing libgcrypt20 security updates
* 11:16 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' .
* 11:07 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:610056{{!}}Provision WMDE TeWü survey for prototype 1 (T257306)]], file 2/2 (duration: 01m 03s)
* 11:06 awight@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BACON: [[gerrit:610056{{!}}Provision WMDE TeWü survey for prototype 1 (T257306)]], file 1/2 (duration: 01m 16s)
* 11:05 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'proton' for release 'production' .
* 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1121', diff saved to https://phabricator.wikimedia.org/P11818 and previous config saved to /var/cache/conftool/dbconfig/20200708-110546-marostegui.json
* 10:51 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 10:51 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 10:50 akosiaris: apply calico egress policies
* 10:50 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 10:45 moritzm: installing json-c security updates
* 10:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P11817 and previous config saved to /var/cache/conftool/dbconfig/20200708-102553-marostegui.json
* 10:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084', diff saved to https://phabricator.wikimedia.org/P11816 and previous config saved to /var/cache/conftool/dbconfig/20200708-102500-marostegui.json
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084', diff saved to https://phabricator.wikimedia.org/P11815 and previous config saved to /var/cache/conftool/dbconfig/20200708-101313-marostegui.json
* 09:58 kormat@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:57 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 09:56 kormat@cumin2001: START - Cookbook sre.hosts.downtime
* 09:50 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1149', diff saved to https://phabricator.wikimedia.org/P11814 and previous config saved to /var/cache/conftool/dbconfig/20200708-094539-marostegui.json
* 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1149', diff saved to https://phabricator.wikimedia.org/P11813 and previous config saved to /var/cache/conftool/dbconfig/20200708-092650-marostegui.json
* 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1148', diff saved to https://phabricator.wikimedia.org/P11812 and previous config saved to /var/cache/conftool/dbconfig/20200708-092627-marostegui.json
* 09:24 xionox-tmp: renumber eqord NTT link - [[phab:T254877|T254877]]
* 09:18 xionox-tmp: remove eqord-eqiad tunnel - [[phab:T254877|T254877]]
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1148', diff saved to https://phabricator.wikimedia.org/P11811 and previous config saved to /var/cache/conftool/dbconfig/20200708-091557-marostegui.json
* 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1147', diff saved to https://phabricator.wikimedia.org/P11810 and previous config saved to /var/cache/conftool/dbconfig/20200708-085745-marostegui.json
* 08:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 08:54 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 08:54 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
* 08:54 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1147', diff saved to https://phabricator.wikimedia.org/P11809 and previous config saved to /var/cache/conftool/dbconfig/20200708-085024-marostegui.json
* 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1074', diff saved to https://phabricator.wikimedia.org/P11808 and previous config saved to /var/cache/conftool/dbconfig/20200708-084227-marostegui.json
* 08:40 moritzm: upgrading docker on remaining buster hosts
* 08:38 hashar: Upgraded docker.io on contint1001 and contint2001
* 08:28 marostegui: Remove dbproxy1003 grants from misc hosts [[phab:T231280|T231280]]
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1146:3314', diff saved to https://phabricator.wikimedia.org/P11807 and previous config saved to /var/cache/conftool/dbconfig/20200708-082624-marostegui.json
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146:3314', diff saved to https://phabricator.wikimedia.org/P11806 and previous config saved to /var/cache/conftool/dbconfig/20200708-082040-marostegui.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P11805 and previous config saved to /var/cache/conftool/dbconfig/20200708-081647-marostegui.json
* 08:15 kormat@cumin1001: dbctl commit (dc=all): 'Depool es2020 for reimaging [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11804 and previous config saved to /var/cache/conftool/dbconfig/20200708-081519-kormat.json
* 08:00 marostegui: Failover m1 from db1097 to db1080 - [[phab:T256717|T256717]]
* 07:57 kormat: reimaging es2020 to buster [[phab:T257284|T257284]]
* 07:51 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3314', diff saved to https://phabricator.wikimedia.org/P11803 and previous config saved to /var/cache/conftool/dbconfig/20200708-074939-marostegui.json
* 07:49 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:48 jynus: stop bacula-director on backup1001 in preparation for m1 switchover [[phab:T256717|T256717]]
* 07:47 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:47 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 07:47 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:47 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 07:47 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 07:47 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 07:45 moritzm: installing PHP 7.3 security updates
* 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1143', diff saved to https://phabricator.wikimedia.org/P11802 and previous config saved to /var/cache/conftool/dbconfig/20200708-073548-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1143', diff saved to https://phabricator.wikimedia.org/P11801 and previous config saved to /var/cache/conftool/dbconfig/20200708-073037-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1142', diff saved to https://phabricator.wikimedia.org/P11800 and previous config saved to /var/cache/conftool/dbconfig/20200708-073011-marostegui.json
* 07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1142', diff saved to https://phabricator.wikimedia.org/P11799 and previous config saved to /var/cache/conftool/dbconfig/20200708-072431-marostegui.json
* 07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1141', diff saved to https://phabricator.wikimedia.org/P11798 and previous config saved to /var/cache/conftool/dbconfig/20200708-070921-marostegui.json
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1141', diff saved to https://phabricator.wikimedia.org/P11797 and previous config saved to /var/cache/conftool/dbconfig/20200708-070432-marostegui.json
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1138', diff saved to https://phabricator.wikimedia.org/P11796 and previous config saved to /var/cache/conftool/dbconfig/20200708-070403-marostegui.json
* 06:47 marostegui: start topology changes on m1 [[phab:T256717|T256717]]
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1138', diff saved to https://phabricator.wikimedia.org/P11795 and previous config saved to /var/cache/conftool/dbconfig/20200708-064354-marostegui.json
* 06:36 marostegui: Deploy schema change on s2 primary master db1122 [[phab:T238966|T238966]]
* 06:18 _joe_: rolling restart of restbase to pick up the proton url change
* 03:36 andrew@deploy1001: Finished deploy [horizon/deploy@505819d]: further fixes for proxy editing --bug 610130 (duration: 03m 44s)
* 03:32 andrew@deploy1001: Started deploy [horizon/deploy@505819d]: further fixes for proxy editing --bug 610130
 
== 2020-07-07 ==
* 22:41 mutante: new Wikimedia Annual Report 2019 now available on annual.wikimedia.org
* 21:29 andrew@deploy1001: Finished deploy [horizon/deploy@fce8183]: further fixes for proxy editing --bug 610130 (duration: 03m 35s)
* 21:25 andrew@deploy1001: Started deploy [horizon/deploy@fce8183]: further fixes for proxy editing --bug 610130
* 21:10 andrew@deploy1001: Finished deploy [horizon/deploy@abcd051]: further fixes for proxy editing --bug 610130 (duration: 03m 26s)
* 21:07 andrew@deploy1001: Started deploy [horizon/deploy@abcd051]: further fixes for proxy editing --bug 610130
* 20:41 ppchelko@deploy1001: Finished deploy [restbase/deploy@05b8bd5]: Remove restbase2009, take 2 (duration: 09m 15s)
* 20:32 ppchelko@deploy1001: Started deploy [restbase/deploy@05b8bd5]: Remove restbase2009, take 2
* 20:32 ppchelko@deploy1001: Finished deploy [restbase/deploy@05b8bd5]: Remove restbase2009 (duration: 14m 28s)
* 20:24 mutante: kubernetes1003 - starting nagios-nrpe-server
* 20:23 mutante: kubernetes1001 - starting nagios-nrpe-server
* 20:17 ppchelko@deploy1001: Started deploy [restbase/deploy@05b8bd5]: Remove restbase2009
* 19:29 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 19:27 mutante: destroying VM gerrit1002 - decom cookbook
* 19:26 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 19:18 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.40  refs [[phab:T256668|T256668]]
* 19:04 mutante: contint2001 - move /var/lib/zuul/.ssh/known_hosts to root and run puppet to recreate it
* 18:38 andrew@deploy1001: Finished deploy [horizon/deploy@eaa056e]: fix for proxy editing --bug 610130 (duration: 03m 18s)
* 18:35 andrew@deploy1001: Started deploy [horizon/deploy@eaa056e]: fix for proxy editing --bug 610130
* 18:27 andrew@deploy1001: Finished deploy [horizon/deploy@a39e86c]: update proxy UI to support editing existing proxies (duration: 03m 26s)
* 18:23 andrew@deploy1001: Started deploy [horizon/deploy@a39e86c]: update proxy UI to support editing existing proxies
* 18:10 krinkle@deploy1001: Synchronized w/: remove untracked test cookie file (duration: 01m 04s)
* 18:08 krinkle@deploy1001: Synchronized php-1.35.0-wmf.40/includes/Revision/RevisionStore.php: {{Gerrit|I8f986daeab4}} (duration: 01m 05s)
* 17:59 herron: imported (logstash{{!}}kibana{{!}}elasticsearch)-oss-7.8.0 into buster-wikimedia thirdparty/elastic78
* 17:54 hnowlan: finished removing restbase2009 from cassandra pool
* 17:06 hnowlan: removed restbase2009-b from cassandra pool, removing restbase2009-c
* 16:40 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.40/extensions/Wikibase: Backport: [[gerrit:610086{{!}}Revert "Don’t load $wgWBClientSettings in WikibaseClient.php" (T257296)]] (duration: 01m 10s)
* 15:49 hnowlan: running nodetool removenode for restbase2009-a
* 15:38 hnowlan@deploy1001: Started restart [restbase/deploy@05b8bd5]: Restarting restbase after removal of restbase2009
* 15:27 elukey: root-tmux on cumin1001 - cumin 'c:profile::mediawiki::mcrouter_wancache' '/usr/local/sbin/restart-mcrouter' -b 2 -s 5 - roll restart of mw-mcrouter to pick up new settings - [[phab:T255511|T255511]]
* 15:13 hnowlan@deploy1001: Started restart [restbase/deploy@05b8bd5]: Restarting restbase after removal of restbase2009
* 15:12 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 15:12 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:09 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 15:09 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 15:06 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 15:04 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 15:04 otto@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 15:02 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:02 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 15:01 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 14:58 hashar@deploy1001: Finished deploy [integration/docroot@708d3eb]: Second deployment to ensure everything works fine. Thank you jynus (duration: 00m 04s)
* 14:58 hashar@deploy1001: Started deploy [integration/docroot@708d3eb]: Second deployment to ensure everything works fine. Thank you jynus
* 14:53 _joe_: restarted restbase on restbase2022 after removing restbase2009 from the cassandra seeds
* 14:48 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 14:47 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:38 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 14:38 otto@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 14:31 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:31 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:30 papaul: replacing msw-a5,a6,a7 and a8
* 14:30 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 14:24 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:24 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 14:20 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 14:20 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 14:16 hashar@deploy1001: Finished deploy [integration/docroot@708d3eb]: (no justification provided) (duration: 00m 09s)
* 14:16 hashar@deploy1001: Started deploy [integration/docroot@708d3eb]: (no justification provided)
* 13:38 _joe_: rolling restart of restbase to pick up using envoy
* 13:31 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 13:31 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:29 XioNoX: cr2-eqiad> request vmhost snapshot routing-engine both - [[phab:T257153|T257153]]
* 13:24 XioNoX: cr1-eqiad> request vmhost snapshot routing-engine both - [[phab:T257153|T257153]]
* 13:15 kormat@cumin1001: dbctl commit (dc=all): 'Promote es2021 to es4 master [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11789 and previous config saved to /var/cache/conftool/dbconfig/20200707-131524-kormat.json
* 12:59 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:57 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 12:44 kormat: starting (codfw) es4 failover from es2020 to es2021 [[phab:T257284|T257284]]
* 12:30 kormat@cumin1001: dbctl commit (dc=all): 'Set es2021 to weight 50 [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11787 and previous config saved to /var/cache/conftool/dbconfig/20200707-123003-kormat.json
* 12:12 jforrester@deploy1001: Finished scap: Full scap and testwikis to 1.35.0-wmf.40 for [[phab:T256668|T256668]] (duration: 33m 09s)
* 12:01 marostegui: Deploy schema change on labswiki (wikitech) master - [[phab:T253276|T253276]]
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1082', diff saved to https://phabricator.wikimedia.org/P11786 and previous config saved to /var/cache/conftool/dbconfig/20200707-115838-marostegui.json
* 11:39 jforrester@deploy1001: Started scap: Full scap and testwikis to 1.35.0-wmf.40 for [[phab:T256668|T256668]]
* 11:38 jforrester@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "jforrester"; reason is "testwikis wikis to 1.35.0-wmf.40" (duration: 00m 00s)
* 11:33 moritzm: installing PHP 7.0 security updates
* 11:29 marostegui: Deploy schema change on db1082, this will create lag on s5 labs
* 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082', diff saved to https://phabricator.wikimedia.org/P11784 and previous config saved to /var/cache/conftool/dbconfig/20200707-112926-marostegui.json
* 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1130', diff saved to https://phabricator.wikimedia.org/P11783 and previous config saved to /var/cache/conftool/dbconfig/20200707-112830-marostegui.json
* 11:26 godog: test bumping logstash7 batch size to 256
* 11:17 moritzm: prune PHP 7.0 packages from mwdebug1001/2001/2002
* 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P11782 and previous config saved to /var/cache/conftool/dbconfig/20200707-110506-marostegui.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1110', diff saved to https://phabricator.wikimedia.org/P11781 and previous config saved to /var/cache/conftool/dbconfig/20200707-110412-marostegui.json
* 10:57 moritzm: prune PHP 7.0 packages from mw2190-mw2214
* 10:46 jforrester@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.40
* 10:44 jforrester@deploy1001: Pruned MediaWiki: 1.35.0-wmf.38 (duration: 17m 23s)
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P11780 and previous config saved to /var/cache/conftool/dbconfig/20200707-103255-marostegui.json
* 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3315', diff saved to https://phabricator.wikimedia.org/P11779 and previous config saved to /var/cache/conftool/dbconfig/20200707-102757-marostegui.json
* 10:26 moritzm: prune PHP 7.0 packages from mw2135-mw2147
* 10:12 addshore@deploy1001: Synchronized wmf-config/config/testcommonswiki.yaml: [[gerrit:609985]] Make testcommonswiki a testwikidata client [[phab:T257266|T257266]] PT2/2 (duration: 00m 55s)
* 10:11 addshore@deploy1001: sync-file aborted: [[gerrit:609985]] Make testcommonswiki a testwikidata client [[phab:T257266|T257266]] PT1/2 (duration: 00m 00s)
* 10:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3315', diff saved to https://phabricator.wikimedia.org/P11778 and previous config saved to /var/cache/conftool/dbconfig/20200707-101043-marostegui.json
* 10:10 addshore@deploy1001: Synchronized dblists/wikidataclient-test.dblist: [[gerrit:609985]] Make testcommonswiki a testwikidata client [[phab:T257266|T257266]] PT1/2 (duration: 00m 56s)
* 10:08 addshore@deploy1001: sync-file aborted: [[gerrit:609985]] Make testcommonswiki a testwikidata client [[phab:T257266|T257266]] PT1/2 (duration: 00m 36s)
* 10:06 elukey: decommission archiva1001
* 10:05 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P11777 and previous config saved to /var/cache/conftool/dbconfig/20200707-100328-marostegui.json
* 10:03 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 10:03 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 10:03 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3315', diff saved to https://phabricator.wikimedia.org/P11776 and previous config saved to /var/cache/conftool/dbconfig/20200707-095443-marostegui.json
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P11775 and previous config saved to /var/cache/conftool/dbconfig/20200707-095428-marostegui.json
* 09:42 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:609971]] [[phab:T257266|T257266]] Enable sitelinks to testcommons from test wikidata sites (duration: 00m 56s)
* 09:40 kormat@cumin1001: dbctl commit (dc=all): 'Repool es2021 after reimaging [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11774 and previous config saved to /var/cache/conftool/dbconfig/20200707-094017-kormat.json
* 09:37 addshore@deploy1001: Synchronized wmf-config: [[gerrit:609986]] [[phab:T257266|T257266]] [[phab:T241975|T241975]] Wikibase: Remove config option wmgUseEntitySourceBasedFederation (take2) (duration: 00m 57s)
* 09:36 _joe_: errata: restbase2010, not 2009
* 09:36 _joe_: applying the new configuration using the service proxy to restbase2009 too
* 09:34 godog: bounce logstash on logstash1023
* 09:33 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:609645]] [[phab:T257266|T257266]] [[phab:T241975|T241975]] Wikibase: stop using wmgUseEntitySourceBasedFederation (take2) (duration: 00m 59s)
* 09:33 _joe_: depooling restbase1025 while we fix the troubled relationship between envoy and proton
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315', diff saved to https://phabricator.wikimedia.org/P11773 and previous config saved to /var/cache/conftool/dbconfig/20200707-093345-marostegui.json
* 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es1024 as it is the current master [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11772 and previous config saved to /var/cache/conftool/dbconfig/20200707-092635-marostegui.json
* 09:24 James_F: 1.35.0-wmf.40 was branched at {{Gerrit|88ecd6df00a46e432c06c1cf40d5098128abc4d8}} for [[phab:T256668|T256668]]
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1023 after reimage [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11771 and previous config saved to /var/cache/conftool/dbconfig/20200707-092357-marostegui.json
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1023 after reimage [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11770 and previous config saved to /var/cache/conftool/dbconfig/20200707-091015-marostegui.json
* 08:33 kormat@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1023 after reimage [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11769 and previous config saved to /var/cache/conftool/dbconfig/20200707-083144-marostegui.json
* 08:30 kormat@cumin2001: START - Cookbook sre.hosts.downtime
* 08:26 XioNoX: cr2-codfw> request vmhost snapshot routing-engine both - [[phab:T257153|T257153]]
* 08:22 XioNoX: cr2-eqsin> request vmhost snapshot - [[phab:T257153|T257153]]
* 08:19 XioNoX: cr2-eqord> request vmhost snapshot - [[phab:T257153|T257153]]
* 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1023 after reimage [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11768 and previous config saved to /var/cache/conftool/dbconfig/20200707-081909-marostegui.json
* 08:18 elukey@cumin1001: END (ERROR) - Cookbook sre.hadoop.change-distro (exit_code=97)
* 08:17 XioNoX: cr2-eqdfw> request vmhost snapshot - [[phab:T257153|T257153]]
* 08:15 XioNoX: cr3-knams> request vmhost snapshot - [[phab:T257153|T257153]]
* 08:15 hashar: upgrading and restarting CI Jenkins on contint2001 # [[phab:T256978|T256978]]
* 08:12 XioNoX: cr4-ulsfo> request vmhost snapshot - [[phab:T257153|T257153]]
* 08:09 kormat@cumin1001: dbctl commit (dc=all): 'Depool es2021 for reimaging [[phab:T257284|T257284]]', diff saved to https://phabricator.wikimedia.org/P11767 and previous config saved to /var/cache/conftool/dbconfig/20200707-080914-kormat.json
* 07:50 marostegui: Stop MySQL on db1074 to deploy schema change and remove triggers - [[phab:T238966|T238966]]
* 07:45 _joe_: restarting restbase again on rb1025
* 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 for schema change', diff saved to https://phabricator.wikimedia.org/P11766 and previous config saved to /var/cache/conftool/dbconfig/20200707-074435-marostegui.json
* 07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1079 and db1136  [[phab:T257216|T257216]]', diff saved to https://phabricator.wikimedia.org/P11765 and previous config saved to /var/cache/conftool/dbconfig/20200707-073918-marostegui.json
* 07:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:31 _joe_: restarting restbase on restbase1025, reaching proton via envoy for now
* 07:31 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:609644{{!}}Revert "Commons: Define entity sources configuration" (T256906, T256907, T256909, T254315, T257266)]] (forgot to git rebase so the last sync was a no-op) (duration: 00m 56s)
* 07:27 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 07:27 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:609644{{!}}Revert "Commons: Define entity sources configuration" (T256906, T256907, T256909, T254315, T257266)]] (duration: 00m 53s)
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079 and give more main weight to db1136  [[phab:T257216|T257216]]', diff saved to https://phabricator.wikimedia.org/P11764 and previous config saved to /var/cache/conftool/dbconfig/20200707-072703-marostegui.json
* 07:24 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/: Config: [[gerrit:609643{{!}}Revert "Wikidata client wikis: Define entity sources configuration (take 2)" (T254315, T257266)]] (duration: 00m 56s)
* 07:24 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 07:23 lucaswerkmeister-wmde@deploy1001: Synchronized dblists/wikidataclient.dblist: Config: [[gerrit:609643{{!}}Revert "Wikidata client wikis: Define entity sources configuration (take 2)" (T254315, T257266)]] (duration: 00m 56s)
* 07:19 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:609642{{!}}Revert "Wikibase: stop using wmgUseEntitySourceBasedFederation" (T241975, T257266)]] (duration: 00m 55s)
* 07:16 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 07:15 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:609641{{!}}Revert "Wikibase: Remove config option wmgUseEntitySourceBasedFederation" (T241975, T257266)]] (duration: 00m 57s)
* 07:10 _joe_: restart restbase on restbase1025 to pick up the switch to https for cxserver
* 06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079 and give more main weight to db1136  [[phab:T257216|T257216]]', diff saved to https://phabricator.wikimedia.org/P11762 and previous config saved to /var/cache/conftool/dbconfig/20200707-063737-marostegui.json
* 06:29 marostegui: Reimage es1023 to Buster [[phab:T255755|T255755]]
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Give db1136 some weight back into main traffic [[phab:T257216|T257216]]', diff saved to https://phabricator.wikimedia.org/P11761 and previous config saved to /var/cache/conftool/dbconfig/20200707-062008-marostegui.json
* 06:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079 [[phab:T257216|T257216]]', diff saved to https://phabricator.wikimedia.org/P11760 and previous config saved to /var/cache/conftool/dbconfig/20200707-061849-marostegui.json
* 05:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Enable es5 writes [[phab:T255755|T255755]] (duration: 00m 56s)
* 05:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1023 entirely [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11759 and previous config saved to /var/cache/conftool/dbconfig/20200707-051620-marostegui.json
* 05:12 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1024 to es5 master [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11758 and previous config saved to /var/cache/conftool/dbconfig/20200707-051236-marostegui.json
* 05:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Disable es5 writes [[phab:T255755|T255755]] (duration: 00m 56s)
* 05:01 marostegui: "Starting es failover from es1023 to es1024 - https://phabricator.wikimedia.org/T255755"
* 01:05 ejegg: turned on debug logging for Adyen SmashPig
* 00:22 cstone: civicrm revision changed from {{Gerrit|a48caf0f37}} to {{Gerrit|d73ee2e73f}}
 
== 2020-07-06 ==
* 23:32 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable sidebar instrumentation on test wikipedia (duration: 00m 56s)
* 23:32 eileen: process-control config revision is {{Gerrit|3fe6753e56}}
* 23:22 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Change some zh canonical namespaces. Don't index NS_USER on hywiki (duration: 00m 58s)
* 22:59 eileen: tools revision changed from {{Gerrit|e974147f27}} to {{Gerrit|73557b8038}}
* 22:14 ryankemper@deploy1001: Finished deploy [wdqs/wdqs@65502b2]: 0.3.40 (duration: 18m 58s)
* 21:55 ryankemper@deploy1001: Started deploy [wdqs/wdqs@65502b2]: 0.3.40
* 21:52 hashar: Upgraded Jenkins on releases1002 and releases2002 # [[phab:T256978|T256978]]
* 21:41 mutante: upgrading jenkins on releases1001 and releases2001 ([[phab:T256980|T256980]])
* 21:37 mutante: importing jenkins 2.235.1 into APT repo for both stretch and buster [[phab:T256980|T256980]]
* 20:08 eileen: tools revision is {{Gerrit|e974147f27}}
* 19:41 qchris: Enabling puppet on gerrit1002 again to catch up with puppetmaster.
* 18:56 addshore: backport / deploy window done
* 18:55 addshore@deploy1001: Synchronized wmf-config: [[gerrit:569263]] [[phab:T241975|T241975]] Wikibase: Remove config option wmgUseEntitySourceBasedFederation (duration: 00m 58s)
* 18:54 addshore@deploy1001: sync-file aborted: [[gerrit:569263]]  (duration: 00m 00s)
* 18:51 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: [[gerrit:608944]] [[phab:T241975|T241975]] Wikibase: stop using wmgUseEntitySourceBasedFederation (duration: 00m 56s)
* 18:47 addshore@deploy1001: Synchronized dblists/wikidataclient.dblist: [[phab:T254315|T254315]] Wikidata client wikis: Define entity sources configuration (take 2) [[gerrit:608839]] (duration: 00m 56s)
* 18:45 addshore@deploy1001: Synchronized wmf-config: [[phab:T254315|T254315]] Wikidata client wikis: Define entity sources configuration (take 2) [[gerrit:608839]] (duration: 00m 58s)
* 18:38 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T256906|T256906]] [[phab:T256907|T256907]] [[phab:T256909|T256909]] [[phab:T254315|T254315]] [[gerrit:569260]] Commons: Define entity sources configuration (duration: 00m 56s)
* 18:30 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|adffbe6}}: Enable validation of new signatures ([[phab:T248632|T248632]]) (duration: 00m 57s)
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/abusefilter.php: {{Gerrit|8878c60}}: Add `abusefilter-view` as a default right for the CU log user ([[phab:T255506|T255506]]) (duration: 00m 55s)
* 18:22 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1398171}}: Add arbcom group to plwiki ([[phab:T256572|T256572]]) (duration: 00m 56s)
* 18:08 andrew@deploy1001: Finished deploy [horizon/deploy@bb176c2]: update proxy UI to support multiple pre-set domains (duration: 03m 39s)
* 18:04 andrew@deploy1001: Started deploy [horizon/deploy@bb176c2]: update proxy UI to support multiple pre-set domains
* 17:54 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate SearchSatisfaction from EventLogging to EventGate on all wikis - [[phab:T249261|T249261]] - take 2 (duration: 00m 56s)
* 17:50 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate SearchSatisfaction from EventLogging to EventGate on all wikis - [[phab:T249261|T249261]] (duration: 00m 56s)
* 16:09 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate SearchSatisfaction from EventLogging to EventGate on group1 - [[phab:T249261|T249261]] (duration: 00m 58s)
* 15:02 jynus: removing old snapshots for x1 on dbprov[12]002
* 14:50 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:46 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:44 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 14:42 moritzm: installing PHP 7.0 security updates
* 14:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1074', diff saved to https://phabricator.wikimedia.org/P11753 and previous config saved to /var/cache/conftool/dbconfig/20200706-143754-marostegui.json
* 14:36 godog: reboot ms-be2025 for hw raid software upgrade - [[phab:T257214|T257214]]
* 14:28 godog: powercycle ms-be2025, no ssh available - [[phab:T257214|T257214]]
* 14:14 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 marostegui: Stop MySQL and poweroff db1079 [[phab:T257216|T257216]]
* 14:08 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 14:02 jynus@cumin1001: dbctl commit (dc=all): 'depool db1136 from main traffic as it is the only s7 api host right now', diff saved to https://phabricator.wikimedia.org/P11752 and previous config saved to /var/cache/conftool/dbconfig/20200706-140217-jynus.json
* 13:56 marostegui: Downtime and reboot db1079 after BBU crash
* 13:54 jynus@cumin1001: dbctl commit (dc=all): 'depool db1079', diff saved to https://phabricator.wikimedia.org/P11751 and previous config saved to /var/cache/conftool/dbconfig/20200706-135430-jynus.json
* 13:30 marostegui: Deploy schema change on s5 codfw master [[phab:T253276|T253276]]
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Reduce es1024 weight in preparation for tomorrow's switchover [[phab:T255755|T255755]]', diff saved to https://phabricator.wikimedia.org/P11750 and previous config saved to /var/cache/conftool/dbconfig/20200706-132634-marostegui.json
* 13:03 elukey: force umount/mount of /mnt/hdfs on an-airflow1001 to unblock dpkg checks (fuse misbehaving, all checks hanging)
* 12:53 elukey: kill hanging lsof processes on an-airflow to reduce cpu load
* 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074', diff saved to https://phabricator.wikimedia.org/P11748 and previous config saved to /var/cache/conftool/dbconfig/20200706-124237-marostegui.json
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1129', diff saved to https://phabricator.wikimedia.org/P11747 and previous config saved to /var/cache/conftool/dbconfig/20200706-124105-marostegui.json
* 11:17 Urbanecm: EU B&C window was done
* 11:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|5d971dc}}: GrowthExperiments: Remove overrides to welcome survey privacy policy URL ([[phab:T252572|T252572]]) (duration: 00m 56s)
* 11:12 marostegui: Deploy schema changes on db1129
* 11:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P11746 and previous config saved to /var/cache/conftool/dbconfig/20200706-111221-marostegui.json
* 11:09 marostegui: Compress InnoDB on db1107 [[phab:T254462|T254462]]
* 11:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f4b5001}}: Add arxiv.org to commonswiki wgCopyUploadsDomains ([[phab:T257036|T257036]]) (duration: 00m 56s)
* 11:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1107 [[phab:T254462|T254462]]', diff saved to https://phabricator.wikimedia.org/P11745 and previous config saved to /var/cache/conftool/dbconfig/20200706-110723-marostegui.json
* 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1076', diff saved to https://phabricator.wikimedia.org/P11744 and previous config saved to /var/cache/conftool/dbconfig/20200706-110544-marostegui.json
* 11:05 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3bc1b46}}: Remove "Create a book" link from sidebar on Finnish Wikipedia ([[phab:T257073|T257073]]) (duration: 00m 56s)
* 10:52 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:609762{{!}} Bumping portals to master (609762)]] (duration: 00m 57s)
* 10:51 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:609762{{!}} Bumping portals to master (609762)]] (duration: 00m 56s)
* 10:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:28 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:28 moritzm: rebooting idp1001 for kernel update
* 09:35 urbanecm@deploy1001: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 00m 58s)
* 08:51 XioNoX: cr1-codfw> request vmhost snapshot routing-engine both - [[phab:T257153|T257153]]
* 08:44 XioNoX: cr3-ulsfo> request vmhost snapshot - [[phab:T257153|T257153]]
* 08:24 kormat: restarting all mariadb instances on sanitarium hosts [[phab:T256545|T256545]]
* 08:09 elukey: roll restart aqs on aqs100[4-9] to pick up new druid settings
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076', diff saved to https://phabricator.wikimedia.org/P11742 and previous config saved to /var/cache/conftool/dbconfig/20200706-080509-marostegui.json
* 07:58 qchris: Disable puppet on gerrit1002 (gerrit-test) to deploy Gerrit UI updates there to gather more feedback
* 07:51 elukey: enable binlog on matomo's database on matomo1002
* 07:46 XioNoX: repool eqsin - [[phab:T257154|T257154]]
* 07:11 XioNoX: reboot cr3-eqsin - [[phab:T257154|T257154]]
* 06:55 XioNoX: depool eqsin for cr3-eqsin reboot/investigation - [[phab:T257154|T257154]]
* 06:54 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1089', diff saved to https://phabricator.wikimedia.org/P11740 and previous config saved to /var/cache/conftool/dbconfig/20200706-065437-marostegui.json
* 06:54 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.change-distro (exit_code=99)
* 06:22 elukey@cumin1001: START - Cookbook sre.hadoop.change-distro
* 06:21 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 06:14 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 05:45 kart_: Updated cxserver to 2020-07-01-044435-production ([[phab:T254143|T254143]])
* 05:40 kartik@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:36 kartik@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:32 kartik@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P11739 and previous config saved to /var/cache/conftool/dbconfig/20200706-051333-marostegui.json
* 05:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P11738 and previous config saved to /var/cache/conftool/dbconfig/20200706-050347-marostegui.json
* 04:49 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P11737 and previous config saved to /var/cache/conftool/dbconfig/20200706-044908-marostegui.json
 
== 2020-07-05 ==
* 21:50 qchris: Restarting gerrit on gerrit1001 to pick up new war and jars.
* 21:50 qchris@deploy1001: Finished deploy [gerrit/gerrit@fbd0684]: Bump gerrit to 3.2.2-102-g3bbb138e13, zuul plugin to master-0-g7accc67, and gitiles to v3.2.2-1-g00c5ca0-with-0e3b533 on gerrit1001 (duration: 00m 07s)
* 21:50 qchris@deploy1001: Started deploy [gerrit/gerrit@fbd0684]: Bump gerrit to 3.2.2-102-g3bbb138e13, zuul plugin to master-0-g7accc67, and gitiles to v3.2.2-1-g00c5ca0-with-0e3b533 on gerrit1001
* 21:46 qchris: Restarting gerrit on gerrit2001 to pick up new war and jars.
* 21:45 qchris@deploy1001: Finished deploy [gerrit/gerrit@fbd0684]: Bump gerrit to 3.2.2-102-g3bbb138e13, zuul plugin to master-0-g7accc67, and gitiles to v3.2.2-1-g00c5ca0-with-0e3b533 on gerrit2001 (duration: 00m 10s)
* 21:45 qchris@deploy1001: Started deploy [gerrit/gerrit@fbd0684]: Bump gerrit to 3.2.2-102-g3bbb138e13, zuul plugin to master-0-g7accc67, and gitiles to v3.2.2-1-g00c5ca0-with-0e3b533 on gerrit2001
* 21:32 qchris: Restarting gerrit on gerrit1002 to pick up new wars and jars.
* 21:32 qchris@deploy1001: Finished deploy [gerrit/gerrit@fbd0684]: Bump gerrit to 3.2.2-102-g3bbb138e13 and zuul plugin to master-0-g7accc67 (duration: 00m 08s)
* 21:32 qchris@deploy1001: Started deploy [gerrit/gerrit@fbd0684]: Bump gerrit to 3.2.2-102-g3bbb138e13 and zuul plugin to master-0-g7accc67
* 21:20 qchris: Enable puppet on gerrit1002 (gerrit-test) again to let it catch up
* 16:01 gehel: restart elastic-psi on elastic1052 (high GC rate)
* 15:56 gehel: restart blazegraph + updater on wdqs1007 and depool to allow catching up on lag
 
== 2020-07-04 ==
* 19:23 qchris@deploy1001: Finished deploy [gerrit/gerrit@b78914b]: Bump gitiles to v3.2.2-1-g00c5ca0-with-0e3b533 on gerrit1002 (duration: 00m 08s)
* 19:23 qchris@deploy1001: Started deploy [gerrit/gerrit@b78914b]: Bump gitiles to v3.2.2-1-g00c5ca0-with-0e3b533 on gerrit1002
* 14:05 qchris: Disable puppet on gerrit1002 (gerrit-test) to deploy Gerrit UI updates there to gather feedback
* 12:42 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 24s)
* 02:28 reedy@deploy1001: Synchronized php-1.35.0-wmf.39/extensions/Score/includes/Score.php: Short circuit lilypond version check to allow usage of cached files [[phab:T257066|T257066]] (duration: 00m 55s)
 
== 2020-07-03 ==
* 21:49 reedy@deploy1001: Synchronized php-1.35.0-wmf.39/extensions/Score/: Sync maintenance script (duration: 00m 58s)
* 18:47 cdanis: ✔️ cdanis@an-coord1001.eqiad.wmnet ~ 🕒☕ sudo systemctl restart hive-server2.service
* 16:51 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|Ifa929b2ad4}} (duration: 00m 57s)
* 16:02 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Rename wgRestrictionMethod to wgShellRestrictionMethod (duration: 00m 58s)
* 15:46 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:43 jayme@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:43 jynus@cumin1001: dbctl commit (dc=all): 'Reduce db1118 weight to spread load more evenly', diff saved to https://phabricator.wikimedia.org/P11730 and previous config saved to /var/cache/conftool/dbconfig/20200703-154337-jynus.json
* 15:40 jayme@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:38 jayme@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:09 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0)
* 15:02 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 14:11 elukey@cumin1001: END (FAIL) - Cookbook sre.hadoop.stop-cluster (exit_code=99)
* 14:11 _joe_: restarted php-fpm on wtp1033, stuck in sigill
* 13:59 elukey@cumin1001: START - Cookbook sre.hadoop.stop-cluster
* 12:41 hashar: Restarting Zuul / CI
* 11:39 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:36 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:29 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:29 moritzm: rebooting urldownloader standby hosts for kernel updates (1002/2002)
* 10:59 moritzm: installing json-c security updates on jessie
* 10:51 moritzm: installing ruby-json security updates
* 10:25 moritzm: installing nss security updates on jessie
* 10:18 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:15 elukey: notebook1004 renamed to an-scheduler1001
* 10:15 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:07 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 09:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:58 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:56 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:55 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:43 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:43 moritzm: rebooting netflow* hosts for kernel security update
* 08:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:14 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:04 jayme: authdns-update for chartmuseum - [[phab:T256970|T256970]]
* 08:03 elukey@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:55 moritzm: installing mutt security updates for jessie (stretch/buster already fixed)
* 07:44 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 07:40 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 07:39 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 07:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:00 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 06:47 moritzm: installing php5 security updates
* 06:27 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:27 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 06:09 moritzm: rebooting mw1390-mw1419 for kernel security updates
* 05:46 XioNoX: remove chassis redundancy failover from fasw-c-eqiad for consistency with all other VCs
* 05:33 XioNoX: remove chassis redundancy failover from fasw-c-codfw for consistency with all other VCs
 
== 2020-07-02 ==
* 23:22 jhuneidi@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 23:16 jhuneidi@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 22:03 jhuneidi@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 21:56 mutante: gerrit1001 (prod gerrit) - restarting gerrit service
* 21:52 maryum: frwikibooks reindex successful, continuing on with remainder of french wikis
* 21:32 mutante: gerrit - deleted gerrit db_pass from prod private repo, running puppet
* 21:25 mutante: gerrit2001 - restarted gerrit
* 21:14 mutante: gerrit1002 restarted gerrit
* 20:20 maryum: reindexing frwikibooks to test https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/604221
* 19:52 mutante: gerrit2001 - restarting gerrit after removing db_pass from config
* 16:05 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:01 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:37 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:23 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:07 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:07 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:42 moritzm: rebooting mw1370-mw1389 for kernel security updates
* 14:33 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:33 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 14:03 kormat: stopped mariadb@s8 on dbstore1005 for data restoration [[phab:T256966|T256966]]
* 12:43 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:43 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 12:36 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:36 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 12:32 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:32 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 12:31 moritzm: rebooting mw1349-mw1369 for kernel security updates
* 12:28 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:27 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 12:27 vgutierrez: rolling restart of esams load balancers to catch up on kernel upgrades
* 12:12 XioNoX: pre-configure asw2-b-eqiad<->cloudsw1-c8-eqiad - [[phab:T251632|T251632]]
* 12:10 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:53 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:53 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:40 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:40 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:35 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:35 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:33 vgutierrez: rolling restart of codfw load balancers to catch up on kernel upgrades
* 11:18 akosiaris: proactively restart docker-registry on registry1001, registry1002 to force CA refresh
* 11:16 akosiaris: restart docker-registry on registry2002 for CA refresh
* 11:14 _joe_: restarting docker-registry on registry2001
* 10:34 godog: move "cluster overview" dashboard to Thanos - [[phab:T256954|T256954]]
* 09:35 XioNoX: advertise codfw prefixes from eqord
* 09:28 jayme: imported chartmuseum_0.12.0-2 to buster-wikimedia - [[phab:T253843|T253843]]
* 09:07 addshore: addshore@mwmaint1002:~$ mwscript maintenance/createAndPromote.php --wiki testwikidatawiki --force --custom-groups oversight "DCausse_(WMF)" # [[phab:T256949|T256949]]
* 09:07 addshore: addshore@mwmaint1002:~$ mwscript maintenance/createAndPromote.php --wiki testwikidatawiki --force --custom-groups oversight "Addshore" # [[phab:T256949|T256949]]
* 08:59 XioNoX: deploy flex flow for MX204s - [[phab:T248394|T248394]]
* 05:52 _joe_: removing all tags for envoy-tls-local-proxy
* 05:46 _joe_: upload docker-report 0.0.4 on buster-wikimedia [[phab:T242604|T242604]]
* 04:32 eileen: process-control config revision is {{Gerrit|b4655897b5}}
* 03:17 eileen: process-control config revision is {{Gerrit|12fe6b5151}}
* 03:15 eileen: tools revision changed from {{Gerrit|4ea8567819}} to {{Gerrit|e974147f27}}
* 02:32 eileen: tools revision changed from {{Gerrit|e38f7a83d4}} to {{Gerrit|4ea8567819}}
* 00:53 eileen: tools revision changed from {{Gerrit|806e2b4412}} to {{Gerrit|e38f7a83d4}}
 
== 2020-07-01 ==
* 23:53 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set $wgForceUIAsContentMsg for zhwikibooks, zhwikinews, zhwikiquote, zhwikisource, zhwikiversity, zhwiktionary ([[phab:T256521|T256521]]) (duration: 00m 55s)
* 23:35 ejegg: updated fundraising CiviCRM from {{Gerrit|391d0fdf75}} to {{Gerrit|a48caf0f37}}
* 23:32 catrope@deploy1001: Synchronized static/images/project-logos/: Change Simplified Chinese logo for zhwiki ([[phab:T256839|T256839]]) (duration: 00m 55s)
* 23:18 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|Ibb42db7fd1ee}} (duration: 00m 55s)
* 23:00 bstorm: set a short downtime on labstore1006/7 to prevent alert while disabling direct systemd monitoring
* 22:37 krinkle@deploy1001: Synchronized php-1.35.0-wmf.39/includes/Title.php: {{Gerrit|I8d5bad9c654c4ab}} (duration: 01m 00s)
* 21:00 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 20:58 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 20:56 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 20:56 Krinkle: krinkle@deploy1001 Ran `scap deploy --init` for /srv/deployment/performance/arc-lamp
* 20:55 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@d7476f5]: Update mobileapps to {{Gerrit|953fc41a}} (duration: 04m 08s)
* 20:51 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@d7476f5]: Update mobileapps to {{Gerrit|953fc41a}}
* 20:27 eileen: tools revision changed from {{Gerrit|6f38c14fe3}} to {{Gerrit|806e2b4412}}
* 20:11 eileen: tools revision changed from {{Gerrit|aab96444df}} to {{Gerrit|6f38c14fe3}}
* 19:23 twentyafterfour: 1.35.0-wmf.39 is now deployed to group2 wikis, everything appears to be normal.  refs [[phab:T254176|T254176]]
* 19:18 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group2 wikis to 1.35.0-wmf.39  refs [[phab:T254176|T254176]]
* 18:44 addshore@deploy1001: Synchronized wmf-config: REVERT [[phab:T254315|T254315]] Wikidata client wikis: Define entity sources configuration [[gerrit:569259]] (duration: 01m 04s)
* 18:41 addshore@deploy1001: sync-file aborted: [[phab:T254315|T254315]] Wikidata client wikis: Define entity sources configuration [[gerrit:569259]] (duration: 00m 38s)
* 18:38 joal@deploy1001: Finished deploy [analytics/refinery@8b7bddf] (thin): Regular analytics weekly train THIN [analytics/refinery@8b7bddf] (duration: 02m 19s)
* 18:36 joal@deploy1001: Started deploy [analytics/refinery@8b7bddf] (thin): Regular analytics weekly train THIN [analytics/refinery@8b7bddf]
* 18:35 joal@deploy1001: Finished deploy [analytics/refinery@8b7bddf]: Regular analytics weekly train [analytics/refinery@8b7bddf] (duration: 08m 09s)
* 18:27 joal@deploy1001: Started deploy [analytics/refinery@8b7bddf]: Regular analytics weekly train [analytics/refinery@8b7bddf]
* 18:25 joal@deploy1001: Finished deploy [analytics/refinery@114bfed]: Regular analytics weekly train [analytics/refinery@114bfed] (duration: 03m 41s)
* 18:21 joal@deploy1001: Started deploy [analytics/refinery@114bfed]: Regular analytics weekly train [analytics/refinery@114bfed]
* 18:18 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Enable kafka purges on wikitech gerrit:607590 IS-labs.php (duration: 01m 03s)
* 18:07 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Deploy MediaModeration on all production wikis gerrit:608753 (duration: 01m 07s)
* 17:14 XioNoX: set flex-flow-sizing to cr2-eqsin - [[phab:T248394|T248394]]
* 16:57 XioNoX: restart cr2-eqsin for software upgrade - [[phab:T243080|T243080]]
* 16:00 XioNoX: updating eqsin LVS BGP neighbors IPs - [[phab:T255766|T255766]]
* 15:16 XioNoX: re0.cr1-eqsin> request system power-off both-routing-engines - [[phab:T255766|T255766]]
* 15:15 XioNoX: disable BGP to pybal on cr1-eqsin - [[phab:T255766|T255766]]
* 15:13 XioNoX: disable cr1-eqsin transit/peering BGP - [[phab:T255766|T255766]]
* 15:09 XioNoX: bump eqsin-codfw ospf link cost - [[phab:T255766|T255766]]
* 15:03 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 15:03 XioNoX: move vrrp master to cr2-eqsin - [[phab:T255766|T255766]]
* 15:00 XioNoX: depool eqsin for routers work - [[phab:T255766|T255766]]
* 14:28 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:28 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 14:04 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:04 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:37 hashar: contint1001 stopped zuul-merger for a test. started it again
* 13:35 hashar: Restarting zuul-merger on contint2001 # [[phab:T252310|T252310]]
* 13:30 hashar@deploy1001: Finished deploy [zuul/deploy@00f69b3]: (no justification provided) (duration: 00m 08s)
* 13:30 hashar@deploy1001: Started deploy [zuul/deploy@00f69b3]: (no justification provided)
* 13:29 hashar@deploy1001: Finished deploy [zuul/deploy@00f69b3]: (no justification provided) (duration: 00m 32s)
* 13:28 hashar@deploy1001: Started deploy [zuul/deploy@00f69b3]: (no justification provided)
* 13:16 hashar@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.39 (duration: 01m 04s)
* 13:15 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.39
* 13:10 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:09 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:09 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:08 cdanis: ✔️ cdanis@netflow2001.codfw.wmnet ~ 🕘☕ sudo apt remove valgrind libc6-dbg
* 13:03 cdanis: [[phab:T256790|T256790]] ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕘☕ sudo cumin 'netflow[3-5]001*' 'systemctl restart nfacctd'
* 12:58 cdanis: [[phab:T256790|T256790]] ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕘☕ sudo debdeploy deploy -u 2020-07-01-pmacct.yaml -s netflow
* 12:55 cdanis: [[phab:T256790|T256790]] ✔️ cdanis@apt1001.wikimedia.org ~ 🕘☕ sudo -E reprepro -C main include buster-wikimedia pmacct_1.7.2-3+wmf1_amd64.changes
* 12:53 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:53 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:47 ema: A:cp upgrade librdkafka1 to 0.11.6-1.1wmf1 and restart purged, varnishkafka [[phab:T256444|T256444]]
* 11:46 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254315|T254315]] Wikidata: Define entity sources configuration [[gerrit:569258]] (duration: 01m 06s)
* 11:32 Lucas_WMDE: EU B&C window done
* 11:24 lucaswerkmeister-wmde@deploy1001: Synchronized w/touch.php: Config: [[gerrit:608713{{!}}Fully set MW_NO_SESSION for browser metadata endpoints]], 4/4 (duration: 01m 06s)
* 11:22 lucaswerkmeister-wmde@deploy1001: Synchronized w/robots.php: Config: [[gerrit:608713{{!}}Fully set MW_NO_SESSION for browser metadata endpoints]], 3/4 (duration: 01m 03s)
* 11:21 lucaswerkmeister-wmde@deploy1001: Synchronized w/favicon.php: Config: [[gerrit:608713{{!}}Fully set MW_NO_SESSION for browser metadata endpoints]], 2/4 (duration: 01m 04s)
* 11:19 lucaswerkmeister-wmde@deploy1001: Synchronized w/extract2.php: Config: [[gerrit:608713{{!}}Fully set MW_NO_SESSION for browser metadata endpoints]], 1/4 (duration: 01m 16s)
* 11:07 Amir1: Changing datatype of several properties with mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php ([[phab:T255241|T255241]])
* 11:07 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 11:02 ema: restbase2009 depooled [[phab:T256863|T256863]]
* 11:02 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2009.codfw.wmnet
* 10:50 ema: power on restbase2009
* 10:45 jayme: draining and docker restart (one at a time) kubernetes[1001-1004].eqiad.wmnet - [[phab:T256786|T256786]]
* 10:34 ema: power-cycle restbase2009
* 10:17 XioNoX: renumber NTT transit links - [[phab:T254877|T254877]]
* 10:16 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:16 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 10:14 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:14 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 10:09 jayme: draining and docker restart (one at a time) kubernetes[2001-2004].codfw.wmnet
* 09:52 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:52 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:46 jayme: cordoning kubernetes[2001-2004].codfw.wmnet,kubernetes[1001-1004].eqiad.wmnet - [[phab:T256786|T256786]]
* 09:42 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:42 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:34 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:34 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:23 jayme: restarting dockerd on kubestage1002.eqiad.wmnet - [[phab:T256786|T256786]]
* 09:15 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:15 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:08 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:08 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:53 jayme: draining kubernetes staging node kubestage1001.eqiad.wmnet - [[phab:T256786|T256786]]
* 08:52 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:52 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:44 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:44 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:29 XioNoX: disable BGP to nfacct in eqiad - [[phab:T256790|T256790]]
* 08:23 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:23 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:08 jayme@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 08:05 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:05 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:01 vgutierrez: rolling restart of esams cache nodes to catch up on kernel upgrades
* 07:42 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:42 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 07:40 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:40 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 07:39 ema: cp2041: restart purged, varnishkafka after librdkafka1 upgrade to 0.11.6-1.1wmf1 [[phab:T256444|T256444]]
* 05:47 _joe_: restarting nfacctd on netflow1001, it's segfaulting
* 04:01 krinkle@deploy1001: Synchronized php-1.35.0-wmf.39/maintenance/findBadBlobs.php: {{Gerrit|I47c11190b665}} (duration: 01m 08s)
* 00:14 krinkle@deploy1001: Synchronized private/PrivateSettings.php: [[phab:T254795|T254795]] - Set $wmgXhguiDBuser and $wmgXhguiDBpasswor (duration: 01m 06s)
 
== 2020-06-30 ==
* 21:48 crusnov@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 21:46 crusnov@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:45 crusnov@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 21:43 crusnov@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:42 crusnov@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 21:40 crusnov@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:40 crusnov@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 21:38 crusnov@cumin1001: START - Cookbook sre.hosts.reboot-single
* 21:38 crusnov@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99)
* 21:38 crusnov@cumin1001: START - Cookbook sre.hosts.reboot-single
* 19:19 hashar@deploy1001: rebuilt and synchronized wikiversions files: group 0 wikis to 1.35.0-wmf.39 # [[phab:T254176|T254176]]
* 18:31 cdanis: [[phab:T256790|T256790]] ✔️ cdanis@netflow2001.codfw.wmnet ~ 🕝☕ sudo apt install valgrind
* 18:27 tgr: Morning deploys done
* 18:23 tgr@deploy1001: Synchronized php-1.35.0-wmf.39/extensions/ElectronPdfService/src/ElectronPdfServiceHooks.php: Backport: [[gerrit:608485{{!}}Hotfix: "Undefined index: print" (T256761)]] (duration: 01m 05s)
* 18:11 shdubsh: restart varnishmtail,atsmtail,ncredirmtail on ncredir,cp hosts in codfw and eqsin
* 18:05 cdanis: installing libc6-dbg on netflow2001 [[phab:T256790|T256790]]
* 17:40 mdholloway: mobileapps deployments on k8s failing with timeouts; filed [[phab:T256786|T256786]]
* 17:37 cdanis: ✔️ cdanis@netflow2001.codfw.wmnet ~ 🕜☕ sudo systemctl restart nfacctd
* 17:33 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 17:18 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 17:17 papaul: unplugging msw-c3 power to relocate port on PDU
* 17:09 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@f9df1af]: Update mobileapps to {{Gerrit|5c7611b9}} (duration: 03m 33s)
* 17:05 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@f9df1af]: Update mobileapps to {{Gerrit|5c7611b9}}
* 16:57 cdanis: [[phab:T256444|T256444]] restarted purged on cp2030 and repooling
* 16:48 cdanis: [[phab:T256444|T256444]] ✔️ cdanis@cp2030.codfw.wmnet ~ 🕐☕ sudo depool
* 15:54 otto@deploy1001: Finished deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] - take 3 (duration: 00m 03s)
* 15:54 otto@deploy1001: Started deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] - take 3
* 15:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:35 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:30 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:24 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:20 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:16 otto@deploy1001: Finished deploy [analytics/refinery@1112749]: roll back to {{Gerrit|1112749}} on an-launcher1002, git-fat not pulling artifacts (duration: 01m 21s)
* 15:14 otto@deploy1001: Started deploy [analytics/refinery@1112749]: roll back to {{Gerrit|1112749}} on an-launcher1002, git-fat not pulling artifacts
* 15:14 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:10 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:10 moritzm: rebooting mwdebug* hosts for kernel security update
* 15:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:03 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:01 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:59 moritzm: rebooting failoid hosts for kernel update
* 14:49 otto@deploy1001: Finished deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] - take 3 (duration: 00m 03s)
* 14:49 otto@deploy1001: Started deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] - take 3
* 14:47 otto@deploy1001: Finished deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] - take 2 (duration: 00m 03s)
* 14:47 otto@deploy1001: Started deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] - take 2
* 14:44 hashar: Train blocked on Flow being broken: [[phab:T256761|T256761]]  # [[phab:T254176|T254176]]
* 14:38 hashar@deploy1001: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.35.0-wmf.39" - [[phab:T256759|T256759]]
* 14:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:32 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:28 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:26 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:25 hashar@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.39
* 14:21 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:21 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 14:21 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:18 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:18 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:15 moritzm: rebooting miscweb servers for kernel security update
* 14:15 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:13 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:11 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:10 otto@deploy1001: Finished deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]] (duration: 01m 56s)
* 14:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:09 hashar@deploy1001: Finished scap: testwikis wikis to 1.35.0-wmf.39 (duration: 62m 30s)
* 14:08 otto@deploy1001: Started deploy [analytics/refinery@d63944e]: Deploying new camus wmf10 jar to an-launcher1002 for [[phab:T256370|T256370]]
* 14:08 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:59 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:59 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:59 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:57 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:51 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:49 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:37 moritzm: rebooting LDAP replicas for kernel security update
* 13:32 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:32 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:31 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:30 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 13:07 hashar@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.39
* 12:03 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:35 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:35 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 11:33 awight: EU BACON cooked
* 11:32 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:608478{{!}}Configure TeWü survey on dewiki (take 2) (T253112)]] (duration: 00m 58s)
* 11:32 jayme: restarted docker-reporter-base-images and docker-reporter-releng-images on deneb - [[phab:T253396|T253396]]
* 11:31 jayme: pushed a scratch docker image as docker-registry.discovery.wmnet/envoy-tls-local-proxy:dontuseme - [[phab:T253396|T253396]]
* 11:28 awight@deploy1001: Synchronized php-1.35.0-wmf.38/extensions/QuickSurveys: BACON: [[gerrit:608477{{!}}Embedded surveys are hidden when no element is available (T256627)]] (duration: 00m 56s)
* 11:26 awight@deploy1001: Synchronized php-1.35.0-wmf.38/extensions/FileImporter: BACON: [[gerrit:608476{{!}}Set Status error if permission check returns false. (T256428)]] (duration: 00m 58s)
* 11:13 ema: deneb: systemctl restart docker-reporter-base-images.service
* 10:59 ema: upload librdkafka 0.11.6-1.1wmf1 to buster-wikimedia https://phabricator.wikimedia.org/P11703 [[phab:T256444|T256444]]
* 10:59 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:59 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1076', diff saved to https://phabricator.wikimedia.org/P11710 and previous config saved to /var/cache/conftool/dbconfig/20200630-105254-marostegui.json
* 10:45 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:45 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 10:41 ema: cp2040: restart purged and varnishkafka to use updated librdkafka1 [[phab:T256444|T256444]]
* 10:38 ema: cp2040: upgrade librdkafka1 to 0.11.6-1.1wmf1 https://phabricator.wikimedia.org/P11703 [[phab:T256444|T256444]]
* 10:37 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:37 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 10:30 hashar@deploy1001: Synchronized php-1.35.0-wmf.39/includes/specials/SpecialUndelete.php: Remove another use of PageArchive::getRevision - [[phab:T249982|T249982]] [[phab:T254176|T254176]] (duration: 00m 56s)
* 10:09 marostegui: Deploy schema change on db1076
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076', diff saved to https://phabricator.wikimedia.org/P11708 and previous config saved to /var/cache/conftool/dbconfig/20200630-100912-marostegui.json
* 10:04 vgutierrez: rolling restart of eqiad cache nodes to catch up on kernel upgrades
* 10:03 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 10:02 volker-e@deploy1001: Finished deploy [design/style-guide@e3fda83]: Deploy design/style-guide:  (duration: 00m 07s)
* 10:02 volker-e@deploy1001: Started deploy [design/style-guide@e3fda83]: Deploy design/style-guide:
* 09:47 hashar@deploy1001: Pruned MediaWiki: 1.35.0-wmf.37 (duration: 02m 20s)
* 09:40 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:40 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 09:21 hashar@deploy1001: Pruned MediaWiki: 1.35.0-wmf.36 (duration: 28m 11s)
* 08:54 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:53 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
* 08:53 hashar@deploy1001: clean aborted: Pruned MediaWiki: 1.35.0-wmf.36 (duration: 00m 00s)
* 08:51 hashar: Applied security patches to wmf/1.35.0-wmf.39 # [[phab:T254176|T254176]]
* 08:51 vgutierrez: rolling restart of codfw cp nodes after "re-formatting" nvme devices - [[phab:T256655|T256655]]
* 08:23 vgutierrez: repool cp3053 - [[phab:T256632|T256632]]
* 08:10 hashar: 1.35.0-wmf.39 was branched at {{Gerrit|e169e3dabcb2217809fc41ba44b43a39ae1a678e}} [[phab:T254176|T254176]]
* 08:05 marostegui: Stop MySQL on db1117:3322 to clone db1080 (this will trigger haproxy alerts) - [[phab:T256717|T256717]]
* 08:05 vgutierrez: powercycle cp3053 (unresponsive after reboot) - [[phab:T256632|T256632]]
* 08:01 jbond42: disable puppet to restart puppetmasters front ends
* 07:42 vgutierrez: reboot cp3053 - [[phab:T256632|T256632]]
* 05:51 jhuneidi@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 05:13 marostegui: Deploy schema change on s8 codfw - [[phab:T256680|T256680]]
* 04:58 marostegui: remove pl_from index from db1141, db1121, db1148 - [[phab:T256684|T256684]]
* 04:57 jhuneidi@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 04:56 marostegui: Remove pl_from index from db1096:3316 and db1098:3316 - [[phab:T256684|T256684]]
 
== 2020-06-29 ==
* 23:28 eileen: civicrm revision changed from {{Gerrit|52a32f2d66}} to {{Gerrit|391d0fdf75}}, config revision is {{Gerrit|f1b4bdb7b7}}
* 22:00 sbassett: Deployed patch for [[phab:T256171|T256171]]
* 21:56 sbassett: Deployed patch for [[phab:T255918|T255918]]
* 20:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1144:3315 [[phab:T256679|T256679]]', diff saved to https://phabricator.wikimedia.org/P11699 and previous config saved to /var/cache/conftool/dbconfig/20200629-200002-marostegui.json
* 19:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3315 [[phab:T256679|T256679]]', diff saved to https://phabricator.wikimedia.org/P11698 and previous config saved to /var/cache/conftool/dbconfig/20200629-194327-marostegui.json
* 18:55 shdubsh: test mtail rc35+wmf2 on cp5001 - [[phab:T255776|T255776]]
* 18:15 Urbanecm: Morning B&C done
* 18:15 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c86fcd4}}: Add HTTP proxy to MediaModeration ([[phab:T247943|T247943]]) (duration: 00m 58s)
* 18:10 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|aeb7b52}}: Setup rollbacker and mover on lijwiki ([[phab:T256109|T256109]]) (duration: 02m 05s)
* 17:30 sukhe: LDAP - added datn to groups wmde, nda - [[phab:T254442|T254442]]
* 15:43 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:43 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 15:37 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:37 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 15:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P11696 and previous config saved to /var/cache/conftool/dbconfig/20200629-153140-marostegui.json
* 15:20 gehel: repool wdqs1004 - caught up on lag
* 14:50 hnowlan@deploy1001: Finished deploy [restbase/deploy@900bcf6]: Redeploy to fix transient error in gom wiktionary deploy (duration: 00m 06s)
* 14:50 hnowlan@deploy1001: Started deploy [restbase/deploy@900bcf6]: Redeploy to fix transient error in gom wiktionary deploy
* 14:48 hnowlan@deploy1001: Finished deploy [restbase/deploy@900bcf6]: Enable gom wiktionary (duration: 13m 40s)
* 14:34 hnowlan@deploy1001: Started deploy [restbase/deploy@900bcf6]: Enable gom wiktionary
* 14:33 hnowlan@deploy1001: Finished deploy [restbase/deploy@900bcf6]: Enable gom wiktionary (duration: 17m 49s)
* 14:28 ema: A:cp rolling purged upgrade to 0.16 [[phab:T256479|T256479]]
* 14:22 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:608309{{!}}Add "E" as an alias of EntitySchema namespace on wikidata (T245529)]] (duration: 00m 57s)
* 14:20 ema: upload purged 0.16 to apt.wm.org [[phab:T256479|T256479]]
* 14:16 hnowlan@deploy1001: Started deploy [restbase/deploy@900bcf6]: Enable gom wiktionary
* 14:14 hnowlan@deploy1001: Finished deploy [restbase/deploy@ce5177e]: Enable gom wiktionary (duration: 20m 44s)
* 14:02 jforrester@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: Fix 'closed-labs' reading as 'closed' for static config (duration: 00m 56s)
* 13:54 jforrester@deploy1001: Synchronized dblists/: Drop nonbetafeatures dblist, unused (duration: 00m 57s)
* 13:54 hnowlan@deploy1001: Started deploy [restbase/deploy@ce5177e]: Enable gom wiktionary
* 13:50 jforrester@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: Drop 'nonbetafeatures' dblist from production reads (duration: 00m 56s)
* 13:49 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch uses from nonbetafeatures to lockeddown (duration: 00m 57s)
* 13:47 jforrester@deploy1001: Synchronized multiversion/MWConfigCacheGenerator.php: Add 'lockeddown' dblist to production reads (duration: 00m 57s)
* 13:43 jforrester@deploy1001: Synchronized dblists/lockeddown.dblist: Add lockeddown dblist (unused as yet) (duration: 00m 59s)
* 13:35 vgutierrez: depool cp3053 due to nvme hardware issues
* 13:02 XioNoX: test pfw3-codfw uplinks failover
* 13:00 elukey: move archiva.wikimedia.org to archiva1002 (new buster vm); create archiva-old.wikimedia.org to archiva1001
* 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312', diff saved to https://phabricator.wikimedia.org/P11693 and previous config saved to /var/cache/conftool/dbconfig/20200629-125824-marostegui.json
* 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1085', diff saved to https://phabricator.wikimedia.org/P11692 and previous config saved to /var/cache/conftool/dbconfig/20200629-125630-marostegui.json
* 12:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 12:32 jayme: deleted all tags for docker-registry.wikimedia.org/envoy-tls-local-proxy from docker registry - [[phab:T253396|T253396]]
* 12:20 marostegui: Stop MySQL on db2096 (codfw x1 master) for reimage [[phab:T254871|T254871]]
* 12:03 cdanis: re-pool eqiad [[phab:T256512|T256512]]
* 11:59 cdanis: deployed {{Gerrit|I132075ee}} on cr1-eqiad [[phab:T256512|T256512]]
* 11:58 cdanis: deployed {{Gerrit|I132075ee}} on cr2-eqiad [[phab:T256512|T256512]]
* 11:41 cdanis: depool eqiad  [[phab:T256512|T256512]]
* 11:15 awight: EU BACON cooked
* 11:08 marostegui: Deploy schema change on db1095:3312 (lag will show up)
* 10:41 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:608284{{!}} Bumping portals to master (608284)]] (duration: 00m 57s)
* 10:41 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:608284{{!}} Bumping portals to master (608284)]] (duration: 00m 58s)
* 10:29 gehel: restart blazegraph on wdqs1004 + depool to catch up on lag
* 09:59 ema: cp2040: upgrade purged to 0.16 [[phab:T256479|T256479]]
* 09:59 jbond42: switch idp to memcached
* 09:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:47 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:45 marostegui: Deploy schema change on dbstore1004:3312
* 09:11 jbond42: deploying shellcheck CI https://gerrit.wikimedia.org/r/c/operations/puppet/+/602693
* 08:59 marostegui: Compress InnoDB on db1089 (this will cause lag and will take a few days) - [[phab:T254462|T254462]]
* 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for InnoDB compression [[phab:T254462|T254462]]', diff saved to https://phabricator.wikimedia.org/P11690 and previous config saved to /var/cache/conftool/dbconfig/20200629-085854-marostegui.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Fully pool db1135 into s1 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11688 and previous config saved to /var/cache/conftool/dbconfig/20200629-084827-marostegui.json
* 08:40 ema: cp2034: restart purged [[phab:T256444|T256444]]
* 08:36 ema: cp4025: restart purged [[phab:T256444|T256444]]
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly pool db1135 into s1 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11687 and previous config saved to /var/cache/conftool/dbconfig/20200629-083631-marostegui.json
* 08:33 ema: cp1087, cp2033, cp2037, cp2039: repool after spending (way) more than 24h depooled [[phab:T256444|T256444]]
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly pool db1135 into s1 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11686 and previous config saved to /var/cache/conftool/dbconfig/20200629-082635-marostegui.json
* 08:24 marostegui: Deploy schema change on s2 codfw (lag will show up) [[phab:T253276|T253276]]
* 08:04 XioNoX: add term selected-paths to policy BGP_IXP_in on all routers
* 08:03 godog: prometheus eqiad -- lvextend --resizefs --size +200G vg-ssd/prometheus-ops
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly pool db1135 into s1 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11685 and previous config saved to /var/cache/conftool/dbconfig/20200629-080253-marostegui.json
* 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1135 (depooled) to s1 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11684 and previous config saved to /var/cache/conftool/dbconfig/20200629-074611-marostegui.json
* 07:16 XioNoX: push new pfw firewall rules - [[phab:T256170|T256170]]
* 07:13 marostegui: Deploy schema change on db1085 with replication to labs [[phab:T253276|T253276]]
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085', diff saved to https://phabricator.wikimedia.org/P11683 and previous config saved to /var/cache/conftool/dbconfig/20200629-071236-marostegui.json
* 06:53 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1080 from MW', diff saved to https://phabricator.wikimedia.org/P11682 and previous config saved to /var/cache/conftool/dbconfig/20200629-065335-marostegui.json
* 06:50 elukey: execute gnt-instance remove an-launcher1001.eqiad.wmnet on ganeti1011 - [[phab:T256363|T256363]]
* 06:47 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:46 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 06:45 marostegui: Deploy MCR schema change  on db1090:3312
* 06:35 elukey: force puppet run on ores* to overcome celery OOMs on some nodes
* 04:57 marostegui: Stop MySQL on db1080 to clone db1135 [[phab:T253217|T253217]]
* 04:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
 
== 2020-06-28 ==
* 21:43 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: no-op {{Gerrit|I56eb4a802}} (duration: 00m 58s)
* 21:38 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: beta-only {{Gerrit|I56eb4a802}} (duration: 01m 00s)
 
== 2020-06-27 ==
* 20:22 qchris: Gerrit upgrade done.
* 19:49 mutante: removed 2620:0:861:3:208:80:154:136 from /etc/network/interfaces on gerrit1001, rebooting
* 19:27 mutante: rebooting gerrit1001 one more time
* 19:24 mutante: restarted ferm on gerrit1001
* 19:19 mutante: rebooting gerrit1001 one more time
* 19:05 mutante: rebooting gerrit1001
* 18:58 mutante: rebooting gerrit2001
* 18:49 hashar: Enabling beta cluster update job (gerrit maintenance) https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/
* 18:35 qchris@deploy1001: Finished deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit2001 (duration: 00m 10s)
* 18:34 qchris@deploy1001: Started deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit2001
* 18:27 qchris@deploy1001: Finished deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit1001 (duration: 00m 08s)
* 18:27 qchris@deploy1001: Started deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit1001
* 17:25 hashar: Disabled beta cluster update job (gerrit maintenance) https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/
* 17:19 qchris: Stopping gerrit on gerrit1001 for the Gerrit upgrade
* 17:14 qchris: Duplicating reviewdb changes so we get a cheap and quick rollback
* 17:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:11 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:11 qchris: Disabling puppet on gerrit1001 for Gerrit upgrades + data migrations
* 17:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:11 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:07 qchris: Starting Gerrit upgrade to v3.2.2-98-g98d827eaa3
* 15:44 qchris@deploy1001: Finished deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit1002 (gerrit-test) (duration: 00m 08s)
* 15:44 qchris@deploy1001: Started deploy [gerrit/gerrit@da40615]: Gerrit to v3.2.2-98-g98d827eaa3 on gerrit1002 (gerrit-test)
* 13:03 qchris@deploy1001: Finished deploy [gerrit/gerrit@460e439]: Gerrit to v3.2.2-97-gcaf5020db1 on gerrit1002 (gerrit-test) (duration: 00m 08s)
* 13:03 qchris@deploy1001: Started deploy [gerrit/gerrit@460e439]: Gerrit to v3.2.2-97-gcaf5020db1 on gerrit1002 (gerrit-test)
 
== 2020-06-26 ==
* 18:42 robh: all ulsfo onsite work completed as of 30 minutes ago
* 17:52 robh: msw2-ulsfo work done, all mgmt items confirmed back online and icinga alerts cleared, moving onto msw1-ulsfo (rack 22) and will lose all mgmt in that rack for next 10-20 minutes [[phab:T256300|T256300]]
* 17:11 robh: msw work in ulsfo via [[phab:T256300|T256300]]
* 10:24 ema: pool cp5006 [[phab:T256449|T256449]]
* 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1085', diff saved to https://phabricator.wikimedia.org/P11677 and previous config saved to /var/cache/conftool/dbconfig/20200626-102248-marostegui.json
* 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1093', diff saved to https://phabricator.wikimedia.org/P11676 and previous config saved to /var/cache/conftool/dbconfig/20200626-102201-marostegui.json
* 10:03 ema: cp2039: restart purged [[phab:T256444|T256444]]
* 09:57 ema: cp2037: restart purged [[phab:T256444|T256444]]
* 09:55 ema: cp1087: restart purged [[phab:T256444|T256444]]
* 09:46 ema: cp2033: restart purged [[phab:T256444|T256444]]
* 09:38 akosiaris: move the sessionstore eqiad pods back to the dedicated sessionstore nodes
* 09:37 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 09:35 akosiaris: move the sessionstore codfw pods back to the dedicated sessionstore nodes
* 09:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093 for schema change', diff saved to https://phabricator.wikimedia.org/P11675 and previous config saved to /var/cache/conftool/dbconfig/20200626-090813-marostegui.json
* 08:58 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:56 jynus@cumin1001: START - Cookbook sre.hosts.downtime
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1088', diff saved to https://phabricator.wikimedia.org/P11674 and previous config saved to /var/cache/conftool/dbconfig/20200626-083319-marostegui.json
* 08:25 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1088 for schema change', diff saved to https://phabricator.wikimedia.org/P11673 and previous config saved to /var/cache/conftool/dbconfig/20200626-082242-marostegui.json
* 08:20 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:20 ayounsi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:05 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes.*.wmnet
* 08:04 akosiaris@cumin1001: conftool action : set/weight=10; selector: name=kubernetes.*.wmnet
* 08:04 akosiaris: pool all new kubernetes nodes in LVS [[phab:T252185|T252185]] [[phab:T256236|T256236]]
* 07:57 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 07:44 volans: force rebooted cp5006 that is unresponsive (after having depooled it) - [[phab:T256449|T256449]]
* 07:42 volans@cumin1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet
* 06:40 tstarling@deploy1001: Synchronized wmf-config/InitialiseSettings.php: add cache-cookies log channel (duration: 00m 59s)
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2088:3312, db2104', diff saved to https://phabricator.wikimedia.org/P11672 and previous config saved to /var/cache/conftool/dbconfig/20200626-051328-marostegui.json
* 05:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 04:01 cdanis: re-enable puppet on cps
* 03:54 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕛🍺 sudo cumin A:cp 'disable-puppet "I39e1c68a is broken"'
* 03:54 cdanis: https://gerrit.wikimedia.org/r/c/operations/puppet/+/607917
* 02:52 tstarling@deploy1001: Synchronized private/PrivateSettings.php: updating wgAuthenticationTokenVersion per my wikitech-l post (duration: 00m 57s)
* 02:19 cdanis: three more hosts not processing purges for multiple days ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕥🍺 sudo cumin 'cp2033*,cp2037*,cp2039*' 'depool'
* 02:17 cdanis: depooling cp1087 which has not been processing purges for 11.415 days
* 01:53 cdanis: {{Gerrit|I6cc5f3e6}} has been deployed to all cp text nodes [[phab:T256395|T256395]]
* 01:41 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕘🍺 sudo cumin A:cp 'enable-puppet "cdanis deploying {{Gerrit|I6cc5f3e6}} [[phab:T256395|T256395]]"'
* 01:13 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕘🍺 sudo cumin A:cp 'disable-puppet "cdanis deploying {{Gerrit|I6cc5f3e6}} [[phab:T256395|T256395]]"'
* 00:41 eileen: tools revision changed from {{Gerrit|c96813eda4}} to {{Gerrit|aab96444df}}
* 00:38 tstarling@deploy1001: Synchronized w/T256395-cookie-test.php: (no justification provided) (duration: 00m 56s)
* 00:36 tstarling@deploy1001: Synchronized w/T256395-cookie-test.php: (no justification provided) (duration: 00m 58s)
 
== 2020-06-25 ==
* 23:37 mutante: puppetmaster - signing certs and initial puppet run for logstash1030/logstash1031 - no prod role yet
* 22:25 mutante: puppetmaster - signing certs and initial run for logstash2030/2031 - no prod role yet
* 20:57 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 20:31 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:30 dcausse: repooling wdqs1007.eqiad.wmnet
* 19:05 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.38
* 18:58 mutante: LDAP - added qchris to archiva-deployers ([[phab:T256404|T256404]])
* 17:37 mutante: mwmaint1002 - restarted apache2 to add server_headers snippet for [[phab:T255629|T255629]] - but not working as expected yet
* 16:40 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 16:31 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:31 sukhe@cumin1001: START - Cookbook sre.hosts.downtime
* 16:28 krinkle@deploy1001: Synchronized wmf-config/logging.php: {{Gerrit|Ia6ef7617d378}} (duration: 01m 02s)
* 16:20 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 16:16 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 16:16 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 16:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 16:15 Krinkle: I've deleted a "saved object" visualisation in logstash called "Production Errors & Deployments" which seemed to be corrupt and redirect random logstash dashboards to a management page. Backed up at https://phabricator.wikimedia.org/P11666 (NDA)
* 16:15 moritzm: installing libxml2 security updates
* 16:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 16:09 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 16:06 moritzm: installing 4.9.210-1+deb9u1~deb8u1 on jessie hosts (fixed kernel for recent cacheoutattack CPU leaks)
* 16:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 16:03 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 15:55 krinkle@deploy1001: Synchronized wmf-config/logging.php: {{Gerrit|I4c519f88c613fc}} (duration: 01m 05s)
* 15:54 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:53 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:51 vgutierrez: upgrade ATS in eqiad to version 8.0.8
* 15:42 ppchelko@deploy1001: Finished deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], more groups (duration: 05m 09s)
* 15:37 ppchelko@deploy1001: Started deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], more groups
* 15:37 ppchelko@deploy1001: Finished deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], more groups (duration: 03m 38s)
* 15:33 ppchelko@deploy1001: Started deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], more groups
* 15:33 ppchelko@deploy1001: Finished deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], more groups (duration: 03m 24s)
* 15:30 vgutierrez: upgrade ATS in codfw to version 8.0.8
* 15:30 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:30 ppchelko@deploy1001: Started deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], more groups
* 15:29 ppchelko@deploy1001: Finished deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], take 2 (duration: 06m 38s)
* 15:29 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:25 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: structured logging for xff log, stop logging jobrunner requests (duration: 01m 05s)
* 15:23 ppchelko@deploy1001: Started deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]], take 2
* 15:20 ppchelko@deploy1001: Finished deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]] (duration: 01m 37s)
* 15:19 ppchelko@deploy1001: Started deploy [restbase/deploy@821e96b]: Only emit vary: accept-language for feeds when it matters [[phab:T256358|T256358]]
* 14:51 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:48 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 14:43 vgutierrez: upgrade ATS in esams to version 8.0.8
* 14:29 papaul: replacing mr1-codfw
* 14:24 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:19 vgutierrez: upgrade ATS in eqsin to version 8.0.8
* 14:19 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:17 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:05 marostegui: Stop MySQL on db2104 and db2088:3312
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2104', diff saved to https://phabricator.wikimedia.org/P11664 and previous config saved to /var/cache/conftool/dbconfig/20200625-140519-marostegui.json
* 14:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:04 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db2088:3312', diff saved to https://phabricator.wikimedia.org/P11663 and previous config saved to /var/cache/conftool/dbconfig/20200625-140421-marostegui.json
* 13:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:57 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T254301|T254301]] Remove OAuthReplaceMessage hook subscriber (duration: 01m 05s)
* 13:56 vgutierrez: upgrade ATS in ulsfo to version 8.0.8
* 13:51 vgutierrez: upload trafficserver 8.0.8 to apt.wm.o (buster)
* 13:51 reedy@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: Replace PasswordNotInLargeBlacklist with PasswordNotInCommonList (duration: 01m 05s)
* 13:49 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Replace PasswordNotInLargeBlacklist with PasswordNotInCommonList (duration: 01m 06s)
* 13:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:36 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:30 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:28 godog: bounce logstash on logstash1007
* 13:26 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:19 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:13 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:02 moritzm: installing 4.9.210-1+deb9u1~deb8u1 on jessie hosts (fixed kernel for recent cacheoutattack CPU leaks)
* 12:55 elukey: rename notebook1003 to an-launcher1002 - [[phab:T256363|T256363]]
* 12:51 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:47 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:45 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 12:44 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:44 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 12:42 moritzm: installing libmspack security updates
* 12:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:32 moritzm: installing libssh2 security updates
* 12:30 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:30 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:27 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:26 moritzm: installing libjpeg-turbo security updates
* 12:25 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 12:25 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 12:21 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:17 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:13 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:09 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:55 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:55 moritzm: installing python3.4 security updates
* 11:55 awight: EU BACON is cooked
* 11:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:50 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:607767{{!}}Enable QuickSurveys on metawiki (T253112)]] (duration: 01m 05s)
* 11:48 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:45 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:41 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:38 awight@deploy1001: Synchronized wmf-config/InitialiseSettings.php: BACON: [[gerrit:607763{{!}}Enable WMDE Tech Wishes survey configuration (T253112)]] (duration: 01m 09s)
* 11:36 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:27 moritzm: rolling reboot of ms-be[1044-1059].eqiad.wmnet
* 11:25 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:21 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:56 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:53 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:50 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:45 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:45 moritzm: rolling reboot of ms-be[2044-2056]
* 10:41 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:38 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:33 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:27 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:25 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:22 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:18 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:17 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 10:17 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 10:13 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:12 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 10:12 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 10:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 10:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 10:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 10:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 10:04 akosiaris: poweroff kubestagetcd1004 and ganeti1005 for [[phab:T244530|T244530]]
* 10:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:00 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:00 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:00 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:00 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 09:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:57 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 09:57 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 09:53 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:37 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:34 volans@cumin1001: START - Cookbook sre.dns.netbox
* 09:28 akosiaris: schedule downtime for eqiad wikifeeds as it's flapping too much without yet knowing why. [[phab:T256358|T256358]]
* 09:28 godog: extend lv on thanos-fe2001 and restart thanos-compact
* 09:21 vgutierrez: rolling restart of ncredir instances to catch up on kernel updates
* 09:13 joal@deploy1001: Finished deploy [analytics/refinery@4aba370] (thin): Analytics fix over weekly train THIN [analytics/refinery@4aba370] (duration: 00m 10s)
* 09:13 joal@deploy1001: Started deploy [analytics/refinery@4aba370] (thin): Analytics fix over weekly train THIN [analytics/refinery@4aba370]
* 09:13 joal@deploy1001: Finished deploy [analytics/refinery@4aba370]: Analytics fix over weekly train [analytics/refinery@4aba370] (duration: 16m 27s)
* 09:01 vgutierrez: restarting acme-chief instances to catch up on kernel updates
* 08:56 joal@deploy1001: Started deploy [analytics/refinery@4aba370]: Analytics fix over weekly train [analytics/refinery@4aba370]
* 08:42 hashar: releases2002: restarted bacula-fd to take into account the puppet-provided configuration # [[phab:T247652|T247652]]
* 08:14 jynus: restarting bacula-dir on backup1001
* 08:09 akosiaris: restart etherpad-lite on etherpad1002
* 08:03 marostegui: Failover m1 from db1135 to db1097 - [[phab:T254556|T254556]]
* 07:52 jynus: stop bacula-director on backup1001 for db maintenance [[phab:T254556|T254556]]
* 07:49 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 07:49 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 07:49 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 07:49 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 07:49 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 07:48 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 07:48 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 07:47 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 07:36 elukey: reboot an-launcher1001 for kernel upgrades
* 07:18 elukey: reboot kafkamon* vms for kernel upgrades
* 07:08 marostegui: Start pre switchover steps on m1 [[phab:T254556|T254556]]
* 06:40 elukey: reboot matomo1002 for kernel upgrades
* 06:35 elukey: reboot archiva1002 (new vm, not yet in service) for kernel upgrades
* 06:34 elukey: reboot archiva for kernel upgrades
* 06:31 elukey: force puppet run on ores1003/1005 to restore celery (killed by the oom)
* 06:24 elukey: reboot an-tool* vms for kernel upgrades
* 06:23 elukey: reboot analytics-tool1004 for kernel upgrades (Superset host)
* 06:22 elukey: reboot analytics-tool1001 for kernel upgrades
* 06:19 elukey: execute ip addr flush ens5 on an-airflow1001 to clear RTNETLINK answers: File exists (error from ifup@ens5.service)
* 06:03 elukey: reboot an-airflow1001 for kernel upgrades
* 04:26 marostegui: Remove triggers from db2095:3312 - [[phab:T238966|T238966]]
* 04:25 marostegui: Deploy schema change on s2 codfw - [[phab:T238966|T238966]]
* 00:48 twentyafterfour: restart php-fpm on phab1001 to fix [[phab:T256343|T256343]]
* 00:12 twentyafterfour: phabricator updated, all seems normal
* 00:11 twentyafterfour: updating phabricator to release/2020-06-25/1, momentary (<1 minute) downtime expected.
 
== 2020-06-24 ==
* 23:44 mutante: releases2002 - systemctl stop jenkins, kill 15244 (rogue jenkins process), start jenkins with systemctl start jenkins ([[phab:T247652|T247652]])
* 23:43 mutante: releases1002 - kill rogue jenkins process, start jenkins with systemctl start jenkins ([[phab:T247652|T247652]])
* 23:02 mutante: releases1002/2002 - disabling puppet, removing failing cron job to pull deployment_charts (because /srv/deployment-charts does not exist yet)
* 21:45 shdubsh: install mtail 3.0.0~rc35+wmf2 on logstash1007 - [[phab:T255776|T255776]]
* 20:42 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.38 (duration: 01m 06s)
* 20:41 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.38
* 20:41 brennen: train 1.35.0-wmf.38: attempting to roll forward to group1 after php-fpm restart on mw1287 ([[phab:T256305|T256305]], [[phab:T254175|T254175]])
* 20:32 cdanis: restarting php-fpm on mw1287 [[phab:T256305|T256305]]
* 20:32 bsitzmann@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 20:30 bsitzmann@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 20:28 bsitzmann@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 20:14 halfak@deploy1001: Finished deploy [ores/deploy@1b87365]: [[phab:T254505|T254505]] (duration: 14m 08s)
* 20:09 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@80c763d]: Update mobileapps to {{Gerrit|a413db4f}} (duration: 03m 37s)
* 20:06 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@80c763d]: Update mobileapps to {{Gerrit|a413db4f}}
* 20:00 halfak@deploy1001: Started deploy [ores/deploy@1b87365]: [[phab:T254505|T254505]]
* 19:38 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert Migrate SearchSatisfaction from EventLogging to EventGate on group1 - [[phab:T249261|T249261]] (duration: 01m 06s)
* 19:17 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.35.0-wmf.37
* 19:11 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.38 (duration: 01m 04s)
* 19:10 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.38
* 19:01 brennen: train 1.35.0-wmf.38: finished triage meeting, clear to proceed to group 1 ([[phab:T254175|T254175]])
* 18:53 joal@deploy1001: Finished deploy [analytics/refinery@1112749] (thin): Regular analytics weekly train THIN [analytics/refinery@1112749] (duration: 00m 09s)
* 18:53 joal@deploy1001: Started deploy [analytics/refinery@1112749] (thin): Regular analytics weekly train THIN [analytics/refinery@1112749]
* 18:53 joal@deploy1001: Finished deploy [analytics/refinery@1112749]: Regular analytics weekly train [analytics/refinery@1112749] (duration: 05m 50s)
* 18:49 Urbanecm: Morning B&C deploy window is done
* 18:48 cstone: payments-wiki revision changed from {{Gerrit|28ad76dcd7}} to {{Gerrit|91852dbc9b}}
* 18:47 Urbanecm: mwscript namespaceDupes.php --wiki=guwiki --fix ([[phab:T255358|T255358]])
* 18:47 joal@deploy1001: Started deploy [analytics/refinery@1112749]: Regular analytics weekly train [analytics/refinery@1112749]
* 18:46 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2a1dfc5}}: Set namespace aliases for guwiki ([[phab:T255358|T255358]]) (duration: 01m 05s)
* 18:42 Urbanecm: mwscript namespaceDupes.php --wiki=banwiki --add-prefix=[[phab:T255941|T255941]] --fix ([[phab:T255941|T255941]])
* 18:41 Urbanecm: Run mwscript namespaceDupes.php --wiki=banwiki --fix ([[phab:T255941|T255941]])
* 18:41 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c6d6c85}}: Set WP as a NS_PROJECT alias for banwiki ([[phab:T255941|T255941]]) (duration: 01m 06s)
* 18:38 Urbanecm: Run mwscript namespaceDupes.php dewiktionary --fix ([[phab:T256242|T256242]])
* 18:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2b93e0f}}: Define Rekonstruktion NS for dewiktionary ([[phab:T256242|T256242]]) (duration: 01m 05s)
* 18:29 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dea9214}}: Revert "IS: Cleanup some redundant rows." ([[phab:T256279|T256279]]) (duration: 01m 05s)
* 18:25 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventBus: Emit kafka purges for everything gerrit:607298 (duration: 01m 05s)
* 18:19 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable MediaModeration on group0 gerrit:607327 (duration: 01m 04s)
* 18:08 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable click tracking in Vector on beta cluster gerrit:607136 IS.php (duration: 01m 05s)
* 18:06 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Enable click tracking in Vector on beta cluster gerrit:607136 IS-labs.php (duration: 01m 07s)
* 17:31 elukey: update archiva-ci user's password in Jenkins credentials plugin
* 16:56 elukey: update archiva-deploy user's password in Jenkins credentials plugin
* 16:46 ppchelko@deploy1001: Finished deploy [restbase/deploy@5f08f32]: Release PCS endpoints updates, feeds timed out, redo (duration: 05m 11s)
* 16:41 ppchelko@deploy1001: Started deploy [restbase/deploy@5f08f32]: Release PCS endpoints updates, feeds timed out, redo
* 16:40 ppchelko@deploy1001: Finished deploy [restbase/deploy@5f08f32]: Release PCS endpoints updates, take 2 (duration: 14m 11s)
* 16:34 brennen@deploy1001: Finished scap: (no justification provided) (duration: 60m 22s)
* 16:30 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:27 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 16:26 ppchelko@deploy1001: Started deploy [restbase/deploy@5f08f32]: Release PCS endpoints updates, take 2
* 16:17 elukey: reimage db1108 to debian Buster - [[phab:T234826|T234826]]
* 15:53 ppchelko@deploy1001: Finished deploy [restbase/deploy@386b736]: Revert (duration: 27m 21s)
* 15:38 brennen: previous scap sync for [[phab:T256151|T256151]] - [[gerrit:607379]] and [[gerrit:607380]]
* 15:36 kormat@cumin1001: dbctl commit (dc=all): 'Pool db1088 @ 100% into s6 [[phab:T255927|T255927]]', diff saved to https://phabricator.wikimedia.org/P11652 and previous config saved to /var/cache/conftool/dbconfig/20200624-153604-kormat.json
* 15:34 brennen@deploy1001: Started scap: (no justification provided)
* 15:25 ppchelko@deploy1001: Started deploy [restbase/deploy@386b736]: Revert
* 15:24 ppchelko@deploy1001: deploy aborted: Release updates to PCS endpoints (duration: 05m 04s)
* 15:20 jayme: rolling restart of swift-proxy on thanos-fe[2001-2003].codfw.wmnet,thanos-fe[1001-1003].eqiad.wmnet - [[phab:T256020|T256020]]
* 15:19 ppchelko@deploy1001: Started deploy [restbase/deploy@9686627]: Release updates to PCS endpoints
* 15:06 brennen: merging backports and running a full scap sync for UBN at [[phab:T256151|T256151]]
* 15:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:57 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:57 moritzm: rebooting deneb for kernel update
* 14:57 ema: rmlist teampractices [[phab:T255525|T255525]]
* 14:42 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate SearchSatisfaction from EventLogging to EventGate on group0 - [[phab:T249261|T249261]] (duration: 01m 06s)
* 13:28 nikerabbit@deploy1001: Synchronized wmf-config/CommonSettings.php: [config] 603167 Remove TranslationNotifications user settings 1/2 (2nd attempt, now with correct file) (duration: 01m 06s)
* 13:23 marostegui: Deploy schema change on s6 eqiad primary master - [[phab:T238966|T238966]]
* 12:59 jbond42: update metamonitoring to use icinga-extmon.wikimedia.org
* 12:23 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes1005.eqiad.wmnet
* 12:23 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes1006.eqiad.wmnet
* 12:19 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=kubernetes1006.eqiad.wmnet
* 12:19 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=kubernetes1005.eqiad.wmnet
* 12:19 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes2005.codfw.wmnet
* 12:19 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes2006.codfw.wmnet
* 12:17 akosiaris: depool/drain/reboot/pool kubernetes1005,6 for CPU capacity increase [[phab:T256236|T256236]]
* 12:14 akosiaris: reboot kubernetes2005,6 for CPU capacity increase [[phab:T256236|T256236]]
* 12:11 akosiaris: depool kubernetes2005,kubernetes2006 for CPU capacity increase [[phab:T256236|T256236]]
* 12:10 akosiaris: depool kubernetes2005,kubernetes2006 for CPU capacity increase
* 12:05 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=kubernetes2006.codfw.wmnet
* 12:05 akosiaris@cumin1001: conftool action : set/pooled=no; selector: name=kubernetes2005.codfw.wmnet
* 12:04 awight: EU vegan BACON cooked
* 12:03 awight@deploy1001: Synchronized php-1.35.0-wmf.38/extensions/GrowthExperiments: BACON: [[gerrit:607453{{!}}Help panel home screen menu item fixes (T255254)]] (duration: 01m 06s)
* 11:40 nikerabbit@deploy1001: Synchronized private/PrivateSettings.php: Remove TranslationNotifications user settings 3/2 (duration: 01m 06s)
* 11:35 nikerabbit@deploy1001: Synchronized private/readme.php: [config] 607414 Remove TranslationNotifications user settings 2/2 (duration: 01m 04s)
* 11:28 nikerabbit@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [config] 603167 Remove TranslationNotifications user settings 1/2 (duration: 01m 03s)
* 11:09 awight@deploy1001: Synchronized wmf-config/CommonSettings.php: BACON: [[gerrit:605255{{!}}TwoColConflict: Talk page small deployment CommonSettings.php (T254458)]] (duration: 01m 17s)
* 10:45 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:38 marostegui: Stop haproxy on dbproxy1003 [[phab:T256216|T256216]]
* 10:36 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:36 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 10:01 volans: Production management IP allocation must be done from Netbox from now on, see https://wikitech.wikimedia.org/wiki/DNS/Netbox#Cutoff_dates
* 09:55 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:53 kormat@cumin1001: dbctl commit (dc=all): 'Pool db1088 @ 75% into s6 [[phab:T255927|T255927]]', diff saved to https://phabricator.wikimedia.org/P11648 and previous config saved to /var/cache/conftool/dbconfig/20200624-095338-kormat.json
* 09:50 volans@cumin1001: START - Cookbook sre.dns.netbox
* 09:36 kormat@cumin1001: dbctl commit (dc=all): 'Pool db1088 @ 50% into s6 [[phab:T255927|T255927]]', diff saved to https://phabricator.wikimedia.org/P11647 and previous config saved to /var/cache/conftool/dbconfig/20200624-093624-kormat.json
* 09:13 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:10 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 08:40 moritzm: prune remaining nginx packages on mw* servers [[phab:T255565|T255565]]
* 08:31 kormat@cumin1001: dbctl commit (dc=all): 'Pool db1088 @ 20% into s6 [[phab:T255927|T255927]]', diff saved to https://phabricator.wikimedia.org/P11645 and previous config saved to /var/cache/conftool/dbconfig/20200624-083120-kormat.json
* 08:06 moritzm: re-enable puppet in eqiad
* 08:04 marostegui@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 08:04 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 08:00 moritzm: disable puppet in eqiad to unblock puppetdb1002 VM migration
* 07:22 gehel: restarting blazegraph on wdqs1007
* 06:53 moritzm: draining ganeti1009 for eventual reboot
* 06:28 XioNoX: enable peering BGP sessions on AMS-IX - [[phab:T253970|T253970]]
* 05:59 XioNoX: disable peering BGP sessions on AMS-IX - [[phab:T253970|T253970]]
* 05:34 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 05:33 marostegui@cumin2001: START - Cookbook sre.hosts.decommission
* 05:14 marostegui: Remove grants from dbproxy1008 - [[phab:T231280|T231280]] [[phab:T255406|T255406]]
* 05:03 marostegui: Remove revision triggers from db1125:3316
* 05:02 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1085 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P11643 and previous config saved to /var/cache/conftool/dbconfig/20200624-050235-marostegui.json
* 04:53 marostegui: Reload haproxy on dbproxy1012 and dbproxy1014
* 00:35 ejegg: restarted fundraising jobs on main CiviCRM box
* 00:33 ejegg: updated Fundraising CiviCRM from {{Gerrit|f01b036128}} to {{Gerrit|52a32f2d66}}
 
== 2020-06-23 ==
* 23:16 wkandek: releases1002 is back after being moved to row D ([[phab:T255590|T255590]])
* 23:11 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 22:35 ejegg: disabled fundraising jobs on civi1001 for testing on civi2001
* 22:24 wkandek@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 22:13 AndyRussG: updated payments-wiki from {{Gerrit|5fd4eb1519}} to {{Gerrit|28ad76dcd7}}
* 22:06 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:23 wkandek@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:23 dzahn@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97)
* 21:23 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:22 wkandek@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 21:22 wkandek@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:22 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 21:22 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 21:15 wkandek@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)
* 21:14 wkandek@cumin1001: START - Cookbook sre.hosts.decommission
* 20:31 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on all wikis - take 2 - [[phab:T238230|T238230]] (duration: 01m 06s)
* 19:16 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on all wikis - [[phab:T238230|T238230]] (duration: 01m 05s)
* 19:06 brennen@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.38
* 18:55 mutante: gerrit1001 (prod) - restarting gerrit service to verify config changes
* 18:53 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Migrate TemplateWizard from EventLogging to EventGate on group0 - [[phab:T238230|T238230]] (duration: 01m 06s)
* 18:24 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T254925|T254925]] [[phab:T246489|T246489]] (duration: 01m 06s)
* 18:04 brennen@deploy1001: Finished scap: testwikis wikis to 1.35.0-wmf.38 (duration: 85m 53s)
* 16:39 brennen@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.38
* 16:01 brennen: 1.35.0-wmf.38 was branched at {{Gerrit|a35f7318}} for https://phabricator.wikimedia.org/T254175
* 15:47 moritzm: prune nginx packages on mwdebug hosts [[phab:T255565|T255565]]
* 15:37 moritzm: prune nginx packages on mw1380-mw1412 [[phab:T255565|T255565]]
* 15:28 moritzm: installing libvpx security updates
* 15:27 mutante: removing ganeti VM xhgui1001 from eqiad row_A, will recreate in another row for rebalancing VMs between rows ([[phab:T180761|T180761]] [[phab:T238098|T238098]])
* 15:26 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 15:18 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 15:12 mutante: removing ganeti VM releases1002 in eqiad row_A - will recreate in another row to re-balance ([[phab:T255590|T255590]])
* 15:12 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 15:10 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
* 14:56 moritzm: failover ganeti master in eqiad to ganeti1011
* 14:55 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:50 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:48 urbanecm@deploy1001: Synchronized private/PrivateSettings.php: [[phab:T250887|T250887]] (duration: 00m 58s)
* 14:08 mholloway-shell@deploy1001: Finished deploy [recommendation-api/deploy@db7fd80]: Update recommendation-api to {{Gerrit|7e00177}} (duration: 03m 13s)
* 14:05 mholloway-shell@deploy1001: Started deploy [recommendation-api/deploy@db7fd80]: Update recommendation-api to {{Gerrit|7e00177}}
* 13:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:54 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 13:54 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:54 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 13:34 moritzm: draining ganeti1012 for eventual reboot
* 13:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:28 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:56 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:54 jynus@cumin1001: START - Cookbook sre.hosts.downtime
* 12:45 moritzm: draining ganeti1011 for eventual reboot
* 12:45 marostegui: Deploy schema change on s6 codfw master (lag will appear on codfw) - [[phab:T253276|T253276]]
* 12:36 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:31 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:56 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 11:35 awight: EU BACON cooked
* 11:34 awight@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/TwoColConflict/: BACON: [[gerrit:607248{{!}}Fix broken copy link in JS mode (T253724)]] (duration: 00m 57s)
* 11:07 mlitn@deploy1001: Synchronized wmf-config/InitialiseSettings.php: test commons: Use the database name in the Wikibase entity source config (duration: 00m 59s)
* 11:04 moritzm: draining ganeti1008 for eventual reboot
* 10:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 10:55 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:42 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:42 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 10:40 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:38 moritzm: temporarily shut down xhgui1001/releases1002 to reshuffle Ganeti instances for reboots
* 10:38 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 10:35 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:35 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 10:22 kormat: reimaging db1088 to buster [[phab:T250666|T250666]]
* 10:03 jynus@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:01 jynus@cumin2001: START - Cookbook sre.hosts.downtime
* 09:48 jbond42: add new CI check for cloud yaml data https://gerrit.wikimedia.org/r/c/operations/puppet/+/606444/
* 09:46 jynus: stopping and reimaging db2101 into buster [[phab:T254871|T254871]]
* 09:32 marostegui: Reload haproxy on dbproxy1012 and dbproxy1014 to test db1097 as secondary for 24h [[phab:T254556|T254556]]
* 08:46 ema: mwmaint1002: add uid=abban,ou=people,dc=wikimedia,dc=org to group 'nda' [[phab:T255775|T255775]]
* 08:38 XioNoX: re-enable peering BGP sessions on AMS-IX - [[phab:T253970|T253970]]
* 08:03 moritzm: draining ganeti1007 for eventual reboot
* 07:58 XioNoX: restart scs-a8-eqiad - [[phab:T256101|T256101]]
* 07:51 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:49 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 07:42 marostegui: Deploy schema change on db1088
* 07:30 marostegui: Reimage db2133 (m2 codfw master) to Buster (this will trigger haproxy IRC alert) [[phab:T250666|T250666]]
* 07:01 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1118', diff saved to https://phabricator.wikimedia.org/P11637 and previous config saved to /var/cache/conftool/dbconfig/20200623-070120-marostegui.json
* 06:06 XioNoX: disable peering BGP sessions on AMS-IX - [[phab:T253970|T253970]]
* 05:24 marostegui: Compress InnoDB on db1080 [[phab:T254462|T254462]]
* 05:23 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1080 for InnoDB compression', diff saved to https://phabricator.wikimedia.org/P11636 and previous config saved to /var/cache/conftool/dbconfig/20200623-052350-marostegui.json
* 05:22 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P11635 and previous config saved to /var/cache/conftool/dbconfig/20200623-052254-marostegui.json
* 05:12 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P11634 and previous config saved to /var/cache/conftool/dbconfig/20200623-051159-marostegui.json
* 05:03 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P11633 and previous config saved to /var/cache/conftool/dbconfig/20200623-050314-marostegui.json
 
== 2020-06-22 ==
* 23:41 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: touch for [[phab:T247330|T247330]] (duration: 00m 56s)
* 23:36 catrope@deploy1001: Synchronized dblists/: Close trwikinews ([[phab:T247330|T247330]]) (duration: 00m 58s)
* 23:28 RoanKattouw: Synchronized wmf-config/InitialiseSettings.php: Create rollbacker group on elwiktionary ([[phab:T255569|T255569]])  (typoed the task number before)
* 23:26 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Create rollbacker group on elwiktionary ([[phab:T225569|T225569]]) (duration: 00m 56s)
* 23:21 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add localized sitename for bewikibooks ([[phab:T253962|T253962]]) (duration: 00m 57s)
* 23:16 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add domains to wgCopyUploadsDomains ([[phab:T255336|T255336]], [[phab:T255363|T255363]], [[phab:T255386|T255386]], [[phab:T255313|T255313]]) (duration: 01m 01s)
* 22:39 bstorm_: downtimed labstore1005 to prevent an alert during puppet merge [[phab:T253353|T253353]]
* 22:38 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:35 volans@cumin1001: START - Cookbook sre.dns.netbox
* 22:16 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@f2002c8]: bump glent jar to 0.2.2 (duration: 00m 56s)
* 22:15 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@f2002c8]: bump glent jar to 0.2.2
* 22:12 volans: cleanup interfaces and addresses in Netbox for offline servers - [[phab:T233183|T233183]]
* 21:59 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@6e7f9f7]: bump glent jar to 0.2.2 (duration: 00m 18s)
* 21:58 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@6e7f9f7]: bump glent jar to 0.2.2
* 17:19 mutante: gerrit1002 - let puppet remove [database] section from config; restart gerrit another time
* 17:14 mutante: gerrit1002 (gerrit-test): re-enabled puppet, restarted gerrit service
* 16:58 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:49 volans@cumin1001: START - Cookbook sre.dns.netbox
* 15:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 15:01 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:48 moritzm: installing mutt security updates
* 14:47 Amir1: creating shnwiktionary is done
* 14:44 ladsgroup@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 58s)
* 14:42 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:41 ladsgroup@deploy1001: Synchronized static/images/project-logos/: Creating shnwiktionary ([[phab:T253029|T253029]]) (duration: 00m 56s)
* 14:40 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Creating shnwiktionary ([[phab:T253029|T253029]]) (duration: 00m 56s)
* 14:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:37 ladsgroup@deploy1001: rebuilt and synchronized wikiversions files: Creating shnwiktionary ([[phab:T253029|T253029]])
* 14:36 ladsgroup@deploy1001: Synchronized dblists: Creating shnwiktionary ([[phab:T253029|T253029]]) (duration: 00m 58s)
* 14:16 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 14:10 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 14:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:59 moritzm: re-enabling Puppet in codfw
* 13:55 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:51 moritzm: disable Puppet in codfw to reduce puppetdb2002 memory activity, unblocking the migration of the Ganeti instance for a reboot
* 13:19 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump eventlogging_Test schema version to 1.1.0 to pick up client_dt and set wgEventLoggingServiceUri for all wikis - [[phab:T238230|T238230]] (duration: 00m 58s)
* 13:11 marostegui: Stop MySQL on db2078 instances
* 12:53 vgutierrez: upgrade to trafficserver 8.0.8~rc0-1wm1 on cp5006 and cp5012
* 12:45 moritzm: draining ganeti2007 for eventual reboot
* 12:42 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 12:38 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:31 akosiaris: failover logstash2023 from ganeti2007->ganeti2023 for migration_downtime change to apply
* 12:26 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster (duration: 01m 25s)
* 12:24 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster
* 12:22 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster (duration: 00m 03s)
* 12:22 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster
* 11:53 Urbanecm: EU B&C window done
* 11:50 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/VisualEditor/modules/: Backport: {{Gerrit|0a08066}}: Revert "Allow generic params to be passed to getWikitextFragment" ([[phab:T255785|T255785]]) (duration: 00m 58s)
* 11:45 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1094', diff saved to https://phabricator.wikimedia.org/P11627 and previous config saved to /var/cache/conftool/dbconfig/20200622-114554-marostegui.json
* 11:40 moritzm: draining ganeti2008 for eventual reboot
* 11:37 volans@deploy1001: Finished deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster (duration: 00m 28s)
* 11:37 volans@deploy1001: Started deploy [homer/deploy@e9acec8]: Release v0.2.3 on cumin1001 now on buster
* 11:34 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1094', diff saved to https://phabricator.wikimedia.org/P11625 and previous config saved to /var/cache/conftool/dbconfig/20200622-113401-marostegui.json
* 11:30 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|74e8295}}: IS: Cleanup some redundant rows (duration: 00m 56s)
* 11:29 Urbanecm: Run namespaceDupes.php for zh* projects ([[phab:T165593|T165593]])
* 11:24 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1094', diff saved to https://phabricator.wikimedia.org/P11623 and previous config saved to /var/cache/conftool/dbconfig/20200622-112451-marostegui.json
* 11:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|db952ba}}: Add zh-hans and zh-hant translation of Module and Module_talk aliases for all Zh Projects ([[phab:T165593|T165593]]) (duration: 00m 56s)
* 11:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1301fd4}}: Add import sources for gomwiktionary ([[phab:T255098|T255098]]) (duration: 00m 57s)
* 11:08 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1094', diff saved to https://phabricator.wikimedia.org/P11622 and previous config saved to /var/cache/conftool/dbconfig/20200622-110806-marostegui.json
* 11:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|defa81e}}: Disable NS_USER(_TALK) search engine indexing on trwiki ([[phab:T255538|T255538]]) (duration: 00m 58s)
* 10:35 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:606985{{!}} Bumping portals to master (606985)]] (duration: 00m 56s)
* 10:34 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:606985{{!}} Bumping portals to master (606985)]] (duration: 01m 12s)
* 09:58 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:56 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 09:33 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1094 for reimage', diff saved to https://phabricator.wikimedia.org/P11621 and previous config saved to /var/cache/conftool/dbconfig/20200622-093323-marostegui.json
* 09:31 godog: roll-restart logstash in codfw/eqiad to apply configuration change
* 08:59 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:56 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:33 moritzm: reimaging cumin1001 to buster [[phab:T245114|T245114]]
* 08:13 godog: extend prometheus codfw ops filesystem to 1TB
* 08:02 vgutierrez: upgrade to trafficserver 8.0.8~rc0-1wm1 on cp4026 and cp4032
* 08:02 vgutierrez: upload trafficserver 8.0.8~rc0-1wm1 to apt.wm.o (buster)
* 07:33 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:30 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 07:16 marostegui: Reimage db1117 (irc haproxy alerts will be triggered)
* 06:26 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:24 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:06 marostegui: Stop MySQL on dbstore1005 for reimage to Buster - [[phab:T254870|T254870]]
* 05:58 marostegui: Compress InnoDB on db1118 [[phab:T254462|T254462]]
* 05:51 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:49 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 05:43 marostegui: Stop haproxy on dbproxy1008 - [[phab:T255406|T255406]]
* 05:33 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1118 for reimage and InnoDB compression', diff saved to https://phabricator.wikimedia.org/P11617 and previous config saved to /var/cache/conftool/dbconfig/20200622-053334-marostegui.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1134', diff saved to https://phabricator.wikimedia.org/P11616 and previous config saved to /var/cache/conftool/dbconfig/20200622-053104-marostegui.json
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11615 and previous config saved to /var/cache/conftool/dbconfig/20200622-051730-marostegui.json
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11614 and previous config saved to /var/cache/conftool/dbconfig/20200622-051720-marostegui.json
* 05:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11613 and previous config saved to /var/cache/conftool/dbconfig/20200622-050259-marostegui.json
* 04:50 marostegui: Deploy schema change on s3 primary master with a big sleep between wikis - [[phab:T250066|T250066]]
* 04:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P11612 and previous config saved to /var/cache/conftool/dbconfig/20200622-044853-marostegui.json
 
== 2020-06-20 ==
* 22:56 cdanis@cumin2001: dbctl commit (dc=all): 'db1088 seems to have crashed', diff saved to https://phabricator.wikimedia.org/P11611 and previous config saved to /var/cache/conftool/dbconfig/20200620-225624-cdanis.json
* 07:42 elukey: powercycle an-worker1093 - kernel "BUG: soft lockup" CPU message shown in mgmt console
* 07:36 elukey: powercycle an-worker1091 - kernel "BUG: soft lockup" CPU message shown in mgmt console
 
== 2020-06-19 ==
* 18:10 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump eventlogging_Test schema version to 1.1.0 to pick up client_dt - [[phab:T238230|T238230]] (duration: 00m 59s)
* 16:07 mutante: ganeti4003 - rebooting install4001 - trying to bootstrap OS install from install2003
* 15:47 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:28 godog: roll-restart kibana to apply new settings
* 13:01 moritzm: installing cups security updates (client side libs/tools)
* 12:31 qchris: Disabling puppet on gerrit1002 (test instance) to do some more testing
* 12:14 godog: delete march indices from logstash 5 eqiad to free up space
* 12:12 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:10 marostegui@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:08 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 12:07 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:06 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 12:05 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 11:39 marostegui: Reimage db2116 db2119 db2130
* 10:55 moritzm: installing mesa security updates
* 10:49 godog: close april logstash indices on logstash 5 eqiad
* 10:45 moritzm: installing tomcat8 security updates
* 10:38 jayme: imported chartmuseum_0.12.0-1 to buster-wikimedia
* 10:24 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1093', diff saved to https://phabricator.wikimedia.org/P11604 and previous config saved to /var/cache/conftool/dbconfig/20200619-102447-marostegui.json
* 10:21 godog: start closing logstash indices for 2020.03 in elastic 5 eqiad
* 09:22 godog: restart elasticsearch on logstash1010
* 09:14 apergos: rsync dumps output files as root from dumpsdata1003 to labstore1007 to catch up, with --bwlimit=160000 (up from 80000)
* 08:45 volans: backup netbox and run one-time script to reserve first IPs on all infra prefixes on Netbox - [[phab:T233183|T233183]]
* 08:45 godog: roll restart elasticsearch_5@production-logstash-eqiad
* 08:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:21 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:18 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:15 godog: roll-restart logstash elk5 for "JVM GC Old generation-s runs" alert
* 08:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:59 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1093', diff saved to https://phabricator.wikimedia.org/P11601 and previous config saved to /var/cache/conftool/dbconfig/20200619-075907-marostegui.json
* 07:54 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:47 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:44 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P11600 and previous config saved to /var/cache/conftool/dbconfig/20200619-074420-marostegui.json
* 07:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:28 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:23 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:16 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:15 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:10 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:09 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:03 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:02 moritzm: rebooting ganeti nodes in eqiad for kernel security updates
* 06:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 06:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 06:47 moritzm: force reinstall of memcached 1.6 deb packages to ensure that the override is used in addition to the unmodified systemd unit from the deb [[phab:T233933|T233933]]
* 06:39 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:36 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:20 marostegui: Stop mysql on db2132 to reimage m1 codfw master - [[phab:T254556|T254556]]
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2075 db2111', diff saved to https://phabricator.wikimedia.org/P11599 and previous config saved to /var/cache/conftool/dbconfig/20200619-061922-marostegui.json
* 06:05 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:02 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:01 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 06:00 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1112', diff saved to https://phabricator.wikimedia.org/P11598 and previous config saved to /var/cache/conftool/dbconfig/20200619-055430-marostegui.json
* 05:41 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db2075 and db2111 for reimage', diff saved to https://phabricator.wikimedia.org/P11597 and previous config saved to /var/cache/conftool/dbconfig/20200619-054118-marostegui.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2108', diff saved to https://phabricator.wikimedia.org/P11596 and previous config saved to /var/cache/conftool/dbconfig/20200619-053402-marostegui.json
* 05:25 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:23 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 04:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2108 for reimage', diff saved to https://phabricator.wikimedia.org/P11595 and previous config saved to /var/cache/conftool/dbconfig/20200619-044440-marostegui.json
* 04:39 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1098:3316', diff saved to https://phabricator.wikimedia.org/P11594 and previous config saved to /var/cache/conftool/dbconfig/20200619-043956-marostegui.json
* 04:35 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P11593 and previous config saved to /var/cache/conftool/dbconfig/20200619-043554-marostegui.json
 
== 2020-06-18 ==
* 22:30 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on all wikis - [[phab:T249261|T249261]] (duration: 00m 56s)
* 21:14 volans: start check-homer-diff.service on cumin2001 after merging the fix r/606526
* 20:17 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on all wikis - [[phab:T249261|T249261]] (duration: 00m 57s)
* 19:44 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on group1 wikis - [[phab:T249261|T249261]] (duration: 00m 57s)
* 18:53 wkandek@cumin1001: conftool action : set/pooled=yes; selector: name=mw2339.codfw.wmnet
* 18:35 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:32 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 17:16 wkandek@cumin1001: conftool action : set/pooled=no; selector: name=mw2339.codfw.wmnet
* 17:14 wkandek@cumin1001: conftool action : set/pooled=yes; selector: name=mw2339.codfw.wmnet
* 17:12 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2339.codfw.wmnet
* 16:51 maryum: reindex suspended until deployment of code
* 16:49 hnowlan: Shut off non-dockerised deployment-prep instance of changeprop
* 16:15 maryum: reindexing French wiki in Elasticsearch
* 15:37 Reedy: created bot_passwords tables on officewiki and otrs_wikiwiki [[phab:T254925|T254925]] [[phab:T246489|T246489]]
* 15:34 moritzm: installing harfbuzz security updates
* 15:23 moritzm: installing Ruby 2.1 security updates
* 15:15 moritzm: installing python-django security updates (packaged buster version)
* 15:04 moritzm: installing bind updates on jessie (client side tools/libs)
* 14:19 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1078', diff saved to https://phabricator.wikimedia.org/P11591 and previous config saved to /var/cache/conftool/dbconfig/20200618-141941-marostegui.json
* 14:14 moritzm: failover ganeti master in codfw to ganeti2021
* 14:03 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1078 for schema change', diff saved to https://phabricator.wikimedia.org/P11590 and previous config saved to /var/cache/conftool/dbconfig/20200618-140352-marostegui.json
* 14:02 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1075', diff saved to https://phabricator.wikimedia.org/P11589 and previous config saved to /var/cache/conftool/dbconfig/20200618-140203-marostegui.json
* 13:53 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:53 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 13:52 akosiaris: restart logstash2005 for applying an increased ganeti migration_downtime of 10k
* 13:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:43 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:34 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 13:11 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 13:05 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 12:52 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1075 for schema change', diff saved to https://phabricator.wikimedia.org/P11586 and previous config saved to /var/cache/conftool/dbconfig/20200618-125216-marostegui.json
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es5 master as es1024 is fully repooled now', diff saved to https://phabricator.wikimedia.org/P11585 and previous config saved to /var/cache/conftool/dbconfig/20200618-124801-marostegui.json
* 12:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:20 kormat@cumin1001: START - Cookbook sre.hosts.downtime
* 12:05 kormat: reimaging db1077 for final test [[phab:T251768|T251768]]
* 11:51 jbond@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: (no justification provided) (duration: 01m 00s)
* 11:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 11:00 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 10:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:34 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:54 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:48 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2076', diff saved to https://phabricator.wikimedia.org/P11583 and previous config saved to /var/cache/conftool/dbconfig/20200618-094001-marostegui.json
* 09:39 akosiaris: update wikifeeds to latest chart version in codfw
* 09:39 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 09:38 marostegui@cumin2001: dbctl commit (dc=all): 'Repool es2022', diff saved to https://phabricator.wikimedia.org/P11582 and previous config saved to /var/cache/conftool/dbconfig/20200618-093803-marostegui.json
* 09:38 akosiaris: uncordon kubernetes20<nowiki>{</nowiki>07..14<nowiki>}</nowiki> and kubernetes10<nowiki>{</nowiki>07..14<nowiki>}</nowiki>. Nodes are now fully put in rotation and ready to receive production traffic
* 09:34 marostegui: Deploy schema change on s3 codfw master (this will create lag on codfw) - [[phab:T250066|T250066]]
* 09:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:30 godog: temp stop logstash on elk7 to test 8 pipeline workers - [[phab:T255243|T255243]]
* 09:25 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:15 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 09:09 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:08 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 09:06 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 09:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:59 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool es1025', diff saved to https://phabricator.wikimedia.org/P11581 and previous config saved to /var/cache/conftool/dbconfig/20200618-085927-marostegui.json
* 08:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:50 ayounsi@cumin2001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1)
* 08:49 ayounsi@cumin2001: START - Cookbook sre.network.prepare-upgrade
* 08:49 ayounsi@cumin2001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
* 08:49 ayounsi@cumin2001: START - Cookbook sre.network.prepare-upgrade
* 08:49 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool es1025', diff saved to https://phabricator.wikimedia.org/P11580 and previous config saved to /var/cache/conftool/dbconfig/20200618-084929-marostegui.json
* 08:47 marostegui@cumin2001: dbctl commit (dc=all): 'Depool es2022 for reimage', diff saved to https://phabricator.wikimedia.org/P11578 and previous config saved to /var/cache/conftool/dbconfig/20200618-084720-marostegui.json
* 08:47 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:40 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:37 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool es1025', diff saved to https://phabricator.wikimedia.org/P11577 and previous config saved to /var/cache/conftool/dbconfig/20200618-083749-marostegui.json
* 08:25 elukey: change archiva-ci password in archiva
* 08:24 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool es1025', diff saved to https://phabricator.wikimedia.org/P11576 and previous config saved to /var/cache/conftool/dbconfig/20200618-082432-marostegui.json
* 08:12 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:10 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:08 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 08:06 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 08:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 08:00 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:57 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:51 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:50 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:45 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:41 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:41 marostegui: Reimage es1025
* 07:35 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:34 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1136', diff saved to https://phabricator.wikimedia.org/P11574 and previous config saved to /var/cache/conftool/dbconfig/20200618-073414-marostegui.json
* 07:33 ayounsi@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:31 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0)
* 07:25 ayounsi@cumin2001: START - Cookbook sre.dns.netbox
* 07:24 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single
* 07:22 moritzm: rolling reboot of ganeti servers in codfw
* 07:10 ayounsi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:07 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 04:50 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1136', diff saved to https://phabricator.wikimedia.org/P11573 and previous config saved to /var/cache/conftool/dbconfig/20200618-045047-marostegui.json
 
== 2020-06-17 ==
* 23:25 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0e7079d}}: Install DiscussionTools on all wikis (attempt 2) ([[phab:T252264|T252264]]; [[phab:T253943|T253943]]) (duration: 00m 56s)
* 23:23 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/DiscussionTools/includes/Hooks.php: {{Gerrit|ff01083}}: Use $wgLocaltimezone global instead of request context ([[phab:T255704|T255704]]) (duration: 00m 57s)
* 23:21 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/DiscussionTools/includes/Hooks.php: {{Gerrit|4551d29}}: Use $wgLocaltimezone global instead of request context ([[phab:T252264|T252264]]; [[phab:T253943|T253943]]; [[phab:T255704|T255704]]) (duration: 00m 58s)
* 23:01 ryankemper@deploy1001: Finished deploy [wdqs/wdqs@79fb82f]: 0.3.39 (duration: 14m 38s)
* 22:47 ryankemper@deploy1001: Started deploy [wdqs/wdqs@79fb82f]: 0.3.39
* 21:01 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:32 hashar: Fixed up zuul-merger on contint1001 due to some faulty hotfix
* 20:08 hashar: Stopped zuul-merger on contint1001
* 19:21 marostegui: Deploy schema change on s6 codfw master [[phab:T238966|T238966]]
* 19:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1094', diff saved to https://phabricator.wikimedia.org/P11572 and previous config saved to /var/cache/conftool/dbconfig/20200617-191723-marostegui.json
* 19:11 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 19:08 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 19:05 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 18:57 milimetric@deploy1001: Finished deploy [analytics/refinery@6640d6f] (thin): Quick fix for data quality bundles (THIN) (duration: 00m 10s)
* 18:57 milimetric@deploy1001: Started deploy [analytics/refinery@6640d6f] (thin): Quick fix for data quality bundles (THIN)
* 18:52 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:44 milimetric@deploy1001: Finished deploy [analytics/refinery@6640d6f]: Quick fix for data quality bundles (duration: 27m 55s)
* 18:41 Urbanecm: Morning B&C window done
* 18:31 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|96153f9}}: Add temporary logging for mediamoderation ([[phab:T247943|T247943]]) (duration: 00m 56s)
* 18:24 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: REVERT: {{Gerrit|ae76450}}: Install DiscussionTools on all wikis ([[phab:T252264|T252264]]; [[phab:T253943|T253943]]) (duration: 00m 34s)
* 18:22 urbanecm@deploy1001: scap failed: average error rate on 3/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
* 18:21 urbanecm@deploy1001: scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/e474f13ffac6b8c3bf919c4aeafc8c9b for details)
* 18:16 milimetric@deploy1001: Started deploy [analytics/refinery@6640d6f]: Quick fix for data quality bundles
* 18:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c9f6452}}: Set DiscussionToolsEnableVisual to true by default ([[phab:T251654|T251654]]) (duration: 00m 56s)
* 18:05 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 18:04 elukey@cumin1001: START - Cookbook sre.hosts.decommission
* 16:57 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on group0 wikis - [[phab:T249261|T249261]] (duration: 00m 56s)
* 16:00 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1094', diff saved to https://phabricator.wikimedia.org/P11571 and previous config saved to /var/cache/conftool/dbconfig/20200617-160013-marostegui.json
* 15:28 godog: temp bump logstash7 workers to 8 and temp stop logstash - [[phab:T255243|T255243]]
* 15:17 jforrester@deploy1001: Synchronized private/PrivateSettings.php: [[phab:T247943|T247943]] Add API key and recipient config for MediaModeration (duration: 00m 55s)
* 15:17 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2338.codfw.wmnet
* 15:11 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw233[5-9].codfw.wmnet
* 15:11 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T247943|T247943]] Install MediaModeration extension - III: Install where enabled (duration: 00m 56s)
* 15:10 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2335.codfw.wmnet
* 15:09 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2336.codfw.wmnet
* 15:09 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2337.codfw.wmnet
* 15:09 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2339.codfw.wmnet
* 15:08 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw233[5-9].codfw.wmnet
* 14:58 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/GrowthExperiments/modules/help/ext.growthExperiments.HelpPanelProcessDialog.js: [[phab:T255607|T255607]] Fix help panel sizing logic (duration: 00m 56s)
* 14:54 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:52 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:52 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 14:49 mdholloway: rolled back recommendation-api deployment due to canary endpoint check failure ([[phab:T255683|T255683]])
* 14:44 mholloway-shell@deploy1001: Finished deploy [recommendation-api/deploy@c39d567]: Update recommendation-api to {{Gerrit|db97742}} (duration: 01m 16s)
* 14:43 mholloway-shell@deploy1001: Started deploy [recommendation-api/deploy@c39d567]: Update recommendation-api to {{Gerrit|db97742}}
* 14:30 akosiaris: redrain kubernetes1007-14
* 14:27 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 14:27 mutante: disabling puppet on icinga to avoid alert spam when adding new appservers
* 14:25 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:22 akosiaris: uncordon kubernetes10<nowiki>{</nowiki>07..14<nowiki>}</nowiki> again
* 14:13 mutante: generating new mcrouter certs for mw2335 - mw2339 ([[phab:T247021|T247021]])
* 14:02 mutante: rebooting mw2335 through mw2339 (not in service)
* 13:51 XioNoX: cleanup msw1-codfw interfaces
* 13:44 akosiaris: redrain kubernetes1007-14
* 13:37 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 13:35 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 13:31 otto@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EventLogging to EventGate: - SearchSatisfaction on testwiki version 1.1.0 - [[phab:T249261|T249261]] (duration: 00m 58s)
* 13:30 moritzm: upgrade remaining parsoid nodes to PHP 7.2.31
* 13:21 jbond42: re-enable puppet on C:memcached nodes
* 13:04 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:04 marostegui: The above db1129 depool was meant to be a repool, wrong commit message
* 13:03 liw@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.37
* 13:03 jbond42: disable puppet on C:memcached to deploy a new change
* 13:02 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P11567 and previous config saved to /var/cache/conftool/dbconfig/20200617-130236-marostegui.json
* 13:02 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:00 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 13:00 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:00 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 13:00 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:00 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 13:00 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 12:59 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 12:59 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 12:59 akosiaris@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 12:59 akosiaris@cumin2001: START - Cookbook sre.hosts.downtime
* 12:54 hnowlan: upgraded cpjobqueue to newer container image, rolled back
* 12:40 marostegui@cumin2001: dbctl commit (dc=all): 'Add db2091 to s8 [[phab:T253217|T253217]]', diff saved to https://phabricator.wikimedia.org/P11566 and previous config saved to /var/cache/conftool/dbconfig/20200617-124034-marostegui.json
* 12:32 hnowlan: Removed remaining changeprop systemd components from scb
* 12:06 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db2076 to remove triggers from sanitarium [[phab:T238966|T238966]]', diff saved to https://phabricator.wikimedia.org/P11565 and previous config saved to /var/cache/conftool/dbconfig/20200617-120622-marostegui.json
* 11:59 Amir1: not today, just EU noon
* 11:59 Amir1: B&C is done for today
* 11:58 ladsgroup@deploy1001: Synchronized wmf-config/config/trwikisource.yaml: [[gerrit:605656{{!}}Change sidebar upload link destination for tr.wikisource (T253490)]] (duration: 01m 03s)
* 11:55 ladsgroup@deploy1001: Synchronized dblists/commonsuploads.dblist: [[gerrit:605656{{!}}Change sidebar upload link destination for tr.wikisource (T253490)]] (duration: 01m 04s)
* 11:48 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 11:47 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:605652{{!}}Add extended-confirmed group and restriction level for rowiki (T254471)]] (duration: 01m 04s)
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1025 for reimage, give weight to es1023 (es5 master)', diff saved to https://phabricator.wikimedia.org/P11563 and previous config saved to /var/cache/conftool/dbconfig/20200617-113026-marostegui.json
* 11:23 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/GrowthExperiments/extension.json: [[gerrit:606122{{!}}Fix NewcomerTask schema (T255597)]] (duration: 01m 04s)
* 11:18 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.36/extensions/GrowthExperiments/extension.json: [[gerrit:606121{{!}}Fix NewcomerTask schema (T255597)]] (duration: 01m 06s)
* 11:07 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:606075{{!}}Set hiwiktionary timezone to Asia/Kolkata (T255531)]] (duration: 01m 05s)
* 10:48 marostegui@cumin2001: dbctl commit (dc=all): 'Remove db2091 from dbctl in s2 and s4', diff saved to https://phabricator.wikimedia.org/P11562 and previous config saved to /var/cache/conftool/dbconfig/20200617-104816-marostegui.json
* 10:40 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:38 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 10:31 liw@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.37 (duration: 01m 04s)
* 10:30 liw@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.37
* 09:44 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:42 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 09:40 hnowlan: killing stale changeprop instances running on scb hosts
* 09:16 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/Flow/: [[phab:T255608|T255608]] Revert 'Hooks: Use PageMoveComplete instead of TitleMoveCompleting' (duration: 01m 05s)
* 09:15 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11558 and previous config saved to /var/cache/conftool/dbconfig/20200617-091509-marostegui.json
* 09:11 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/includes/HookContainer/DeprecatedHooks.php: [[phab:T255608|T255608]] Revert 'Hard deprecate the  hook' (duration: 01m 05s)
* 09:02 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T247943|T247943]] Install MediaModeration extension - II: Add flag to IS (duration: 01m 05s)
* 08:56 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:56 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 08:52 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:49 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 08:47 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11557 and previous config saved to /var/cache/conftool/dbconfig/20200617-084751-marostegui.json
* 08:44 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11556 and previous config saved to /var/cache/conftool/dbconfig/20200617-084402-marostegui.json
* 08:43 jforrester@deploy1001: Synchronized php-1.35.0-wmf.37/includes/EditPage.php: [[phab:T255177|T255177]] [[phab:T255614|T255614]] Do not return internal edit status from EditPage (duration: 01m 08s)
* 08:31 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1113:3315, db1113:3316', diff saved to https://phabricator.wikimedia.org/P11554 and previous config saved to /var/cache/conftool/dbconfig/20200617-083120-marostegui.json
* 08:30 godog: start logstash on logstash7 - [[phab:T255243|T255243]]
* 08:29 moritzm: prune nginx from remaining mw* servers in codfw [[phab:T255565|T255565]]
* 08:23 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:20 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 08:10 godog: stop logstash temporarily on logstash7 hosts to test increased es shards - [[phab:T255243|T255243]]
* 08:05 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1113:3315 db1113:3316', diff saved to https://phabricator.wikimedia.org/P11553 and previous config saved to /var/cache/conftool/dbconfig/20200617-080511-marostegui.json
* 07:53 elukey: reboot kafka-jumbo1009 for kernel upgrades
* 06:40 elukey: reboot krb1001 for kernel upgrades
* 06:24 elukey: reboot an-master100[1,2] for kernel upgrades
* 06:23 XioNoX: set lacp active on cr2-esams:ae2 - [[phab:T253970|T253970]]
* 06:15 tstarling@deploy1001: Synchronized wmf-config/PoolCounterSettings.php: test fast stale mode on testwiki [[phab:T250248|T250248]] (duration: 01m 17s)
* 06:03 elukey: reboot an-conf100[1-3] for kernel upgrades
* 05:45 elukey: reboot stat1007/8 for kernel upgrades
* 05:45 elukey: clean up old systemd timer config on an-coord1001 (came up after the last reboot)
* 05:42 volker-e@deploy1001: Finished deploy [design/style-guide@37c67dd]: Deploy design/style-guide:  (duration: 00m 05s)
* 05:42 volker-e@deploy1001: Started deploy [design/style-guide@37c67dd]: Deploy design/style-guide:
* 05:34 marostegui@cumin2001: dbctl commit (dc=all): 'Fully repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11552 and previous config saved to /var/cache/conftool/dbconfig/20200617-053421-marostegui.json
* 05:29 marostegui: Deploy schema change on s7 codfw (lag will appear) - [[phab:T250066|T250066]]
* 05:28 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11551 and previous config saved to /var/cache/conftool/dbconfig/20200617-052809-marostegui.json
* 05:22 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11550 and previous config saved to /var/cache/conftool/dbconfig/20200617-052202-marostegui.json
* 05:19 marostegui@cumin2001: dbctl commit (dc=all): 'Slowly repool db1090:3312, db1090:3317', diff saved to https://phabricator.wikimedia.org/P11549 and previous config saved to /var/cache/conftool/dbconfig/20200617-051916-marostegui.json
* 05:10 marostegui@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:08 marostegui@cumin2001: START - Cookbook sre.hosts.downtime
* 04:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1090:3312, db1090:3317 for reimage', diff saved to https://phabricator.wikimedia.org/P11548 and previous config saved to /var/cache/conftool/dbconfig/20200617-045105-marostegui.json
* 04:44 marostegui: Reload pt-kill on labsdb analytics host to pick up new config
* 04:38 marostegui@cumin2001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P11547 and previous config saved to /var/cache/conftool/dbconfig/20200617-043826-marostegui.json
* 01:43 shdubsh: restart elasticsearch on logstash1011
 
== 2020-06-16 ==
* 23:43 crusnov@deploy1001: Finished deploy [netbox/deploy@5251cf1]: Deploying Netbox to netbox-dev [[phab:T253140|T253140]] (duration: 00m 05s)
* 23:43 crusnov@deploy1001: Started deploy [netbox/deploy@5251cf1]: Deploying Netbox to netbox-dev [[phab:T253140|T253140]]
* 23:35 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrus: update ML models for ko and zh, drop ja (duration: 01m 00s)
* 23:34 ebernhardson@deploy1001: sync-file aborted: cirrus: update ML models for ko and zh, drop ja (duration: 00m 04s)
* 22:40 krinkle@deploy1001: Synchronized src/Noc/: (no justification provided) (duration: 01m 04s)
* 22:31 krinkle@deploy1001: Synchronized docroot/noc: (no justification provided) (duration: 01m 05s)
* 21:12 krinkle@deploy1001: Synchronized php-1.35.0-wmf.37/extensions/WikimediaEvents/modules/: {{Gerrit|I67794c6c7192571}} (duration: 01m 04s)
* 20:42 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.35.0-wmf.37
* 20:41 foks: reset email and pw for CactusJack
* 20:32 brennen: rolling 1.35.0-wmf.37 back to group0
* 20:29 mutante: signing puppet cert requests for releases1002 and releases2002 - [[phab:T255590|T255590]]
* 19:24 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.37 (duration: 01m 04s)
* 19:23 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.37
* 19:18 otto@deploy1001: Started deploy [analytics/refinery@8b8ce6e]: deploying refinery source 0.0.127 for eventlogging -> eventgate migration - [[phab:T249261|T249261]]
* 19:15 brennen@deploy1001: Synchronized php-1.35.0-wmf.37/skins/Vector/resources/skins.vector.styles/: [[gerrit:605975{{!}}Restore Watchlist star]] (duration: 01m 05s)
* 19:03 brennen: CORRECTION: holding _1.35.0-wmf.37_ deploy to group1 for a few minutes while merging & testing fix for [[phab:T255574|T255574]]
* 19:01 brennen: holding 1.35.0-wmf.27 deploy to group1 for a few minutes while merging & testing fix for [[phab:T255574|T255574]]
* 18:59 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:52 qchris: Turning on puppet again on gerrit1002 to avoid having it lag too far behind.
* 18:32 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 18:18 mutante: mw2293 - scap pull (because Icinga reports mismatched MW versions)
* 18:01 crusnov@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 17:55 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:52 crusnov@cumin2001: START - Cookbook sre.ganeti.makevm
* 17:44 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@f4f5d7b]: airflow: adjust glent legal cutoff (duration: 01m 35s)
* 17:42 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@f4f5d7b]: airflow: adjust glent legal cutoff
* 17:32 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 17:03 herron: performing rolling reboots of kafka-main hosts for security updates [[phab:T254990|T254990]]
* 16:27 hnowlan@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' .
* 16:26 hnowlan: Updating changeprop to new container version with updated dependencies
* 16:07 hnowlan@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace '