Server Admin Log: Difference between revisions

imported>Stashbot
(ema: cp1083: restart ats-tls and varnish-fe after crashes - T241593)
imported>Nhatminh01
(archive)
== 2019-12-30 ==
* 16:00 ema: cp1083: restart ats-tls and varnish-fe after crashes - [[phab:T241593|T241593]]
* 11:45 arturo: importing more packages into stretch-wikimedia/thirdparty/openstack-pike-stretch ([[phab:T241347|T241347]])
== 2019-12-29 ==
* 22:29 gehel: repooling wdqs1007
* 17:58 sbassett@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/EventBus/includes/EventFactory.php: Security fix for [[phab:T241410|T241410]] (duration: 00m 56s)
* 11:13 ema: restarted wikibugs to fix phab irc notifications
* 10:57 ema: repool cp3061 [[phab:T238305|T238305]]
* 10:07 elukey: powercycle cp3061 - mgmt serial console not showing a working tty, no ssh
* 10:06 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3061.esams.wmnet
== 2019-12-28 ==
* 22:54 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrus: re-sync move more_like traffic to codfw (duration: 00m 54s)
* 21:54 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrus: Move more_like traffic to codfw (duration: 00m 55s)
* 21:35 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: Reduce query load on cirrus elastic clusters (duration: 00m 55s)
* 21:13 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: Reduce query load on cirrus elastic clusters (duration: 00m 58s)
== 2019-12-27 ==
* 19:51 gehel: restarted blazegraph on wdqs1007
* 19:45 gehel: depool wdqs1007 - checking what is wrong with updater
* 19:26 gehel: restarting elasticsearch on elastic1038 - high load
* 19:23 gehel: restarting elasticsearch on elastic1036 - high load
* 15:35 jynus: upgrade and restart dbprov1002
* 12:16 jynus: upgrade and restart dbprov1001
* 11:05 arturo: import openstack pike packages into thirdparty/openstack-pike-stretch in install1002 ([[phab:T241347|T241347]])
* 10:55 arturo: import gpg key 0x56056AB2FEE4EECB in install1002 for openstack packages ([[phab:T241347|T241347]])
* 10:51 jynus: upgrade and restart dbprov2002
* 09:30 jynus: upgrade and restart dbprov2001
== 2019-12-25 ==
* 10:12 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:12 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 10:12 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:11 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
== 2019-12-24 ==
* 15:33 gehel@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:13 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:13 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 14:38 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:38 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 14:13 gehel: data reload from wdqs1008 to wdqs1006 - [[phab:T241418|T241418]]
* 14:11 gehel@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:58 jynus: tune2fs -m 0 /dev/mapper/wdqs1006--vg-data [[phab:T241418|T241418]]
* 12:21 volans@deploy1001: Finished deploy [debmonitor/deploy@39ad186]: Release v0.2.2 - [[phab:T241206|T241206]] (duration: 00m 40s)
* 12:20 volans@deploy1001: Started deploy [debmonitor/deploy@39ad186]: Release v0.2.2 - [[phab:T241206|T241206]]
* 11:43 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: use TiffInfo again [[phab:T240455|T240455]] (duration: 01m 07s)
* 02:45 volker-e@deploy1001: Finished deploy [design/style-guide@8b2eda6]: Deploy design/style-guide:  (duration: 00m 07s)
* 02:45 volker-e@deploy1001: Started deploy [design/style-guide@8b2eda6]: Deploy design/style-guide:
* 00:13 volker-e@deploy1001: Finished deploy [design/style-guide@efc240b]: Deploy design/style-guide:  (duration: 00m 07s)
* 00:13 volker-e@deploy1001: Started deploy [design/style-guide@efc240b]: Deploy design/style-guide:
== 2019-12-23 ==
* 18:54 chaomodus: Deleting mgmt IP addresses from Netbox that are connected to offline devices. [[phab:T228387|T228387]]
* 15:57 papaul: shut down ms-fe2007 for NIC replacement
* 15:10 fdans@deploy1001: Finished deploy [analytics/refinery@531752b]: deploying refinery (duration: 08m 09s)
* 15:02 fdans@deploy1001: Started deploy [analytics/refinery@531752b]: deploying refinery
* 15:00 jynus@cumin1001: dbctl commit (dc=all): 'Reducing db1126 main s8 weight, seems flapping', diff saved to https://phabricator.wikimedia.org/P10012 and previous config saved to /var/cache/conftool/dbconfig/20191223-150052-jynus.json
* 13:42 moritzm: installing cpio security updates on jessie
* 13:31 moritzm: installing NSS security updates on jessie
* 13:23 moritzm: installing libvorbis security updates
* 13:01 moritzm: uploaded libvpx 1.7.0-3+wmf2 to component/vp9
* 11:06 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 11:05 jmm@cumin1001: START - Cookbook sre.hosts.decommission
* 11:05 arturo: import python-pyasn1 0.4.2-3~bpo9+1~wmf1 into stretch-wikimedia/component/python-ldap-bpo ([[phab:T229227|T229227]])
* 11:04 moritzm: removing dubnium in ganeti [[phab:T224557|T224557]]
* 10:47 arturo: import python-ldap 3.1.0-2~bpo9+1~wmf1 into stretch-wikimedia/component/python-ldap-bpo ([[phab:T229227|T229227]])
* 10:06 moritzm: removing elastic1019 from puppetdb [[phab:T239821|T239821]]
* 09:17 _joe_: running docker-report for base images, as a test, on boron
* 09:11 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2023.codfw.wmnet,service=ats-be
* 09:10 ema: cp2023: wipe ats-be cache and repool after normalize-path.lua experiment [[phab:T241232|T241232]]
* 08:58 moritzm: restarting slapd on serpens/seaborgium to pick up SASL security update
* 08:50 moritzm: restarting slapd on LDAP replicas to pick up SASL security update
* 08:31 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2023.codfw.wmnet,service=ats-be
* 08:31 ema: cp2023: depool ats-be for Lua path normalization experiment [[phab:T241232|T241232]]
* 08:17 moritzm: installing cyrus-sasl2 security updates
== 2019-12-22 ==
* 16:48 bd808: Mastodon feed moved to https://mastodon.social/@wikimedia_sal ([[phab:T52109|T52109]])
* 16:44 bd808: Trying another Mastodon account. Hopefully running a bot at @wikimedia_sal@mastodon.social is acceptable use for that server.
* 05:58 volker-e@deploy1001: Finished deploy [design/style-guide@9aa0b3d]: Deploy design/style-guide:  (duration: 00m 07s)
* 05:58 volker-e@deploy1001: Started deploy [design/style-guide@9aa0b3d]: Deploy design/style-guide:
* 00:10 bd808: Wikimedia SAL messages now available on Mastodon! https://fosstodon.org/@wikimedia_sal
== 2019-12-21 ==
* 23:40 bd808: Testing Mastodon logging of SAL messages ([[phab:T52109|T52109]])
* 23:12 volans: powercycle cp3055 - [[phab:T240425|T240425]]
* 22:45 volans: powercycle cp3051 - [[phab:T241306|T241306]]
* 13:39 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=testwikidatawiki --sleep 2 --batch-size=50 ([[phab:T241209|T241209]])
* 04:01 mutante: LDAP - added ifried to wmf ([[phab:T240988|T240988]]) for Turnilo / Superset access
* 01:44 reedy@deploy1001: update-interwiki-cache aborted: Update interwiki cache (duration: 00m 07s)
* 01:39 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T188369|T188369]] (duration: 00m 53s)
* 01:05 volker-e@deploy1001: Finished deploy [design/style-guide@b61669a]: Deploy design/style-guide:  (duration: 00m 07s)
* 01:05 volker-e@deploy1001: Started deploy [design/style-guide@b61669a]: Deploy design/style-guide:
== 2019-12-20 ==
* 23:10 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=testwikidatawiki --sleep 2 --batch-size=10 ([[phab:T241209|T241209]])
* 23:05 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@8373b0d]: Correct upload metadata usage (duration: 06m 41s)
* 22:59 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@8373b0d]: Correct upload metadata usage
* 22:52 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@d99bebf]: force utf-8 encoding when not detected (duration: 05m 22s)
* 22:47 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@d99bebf]: force utf-8 encoding when not detected
* 21:58 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@0ad64e7]: properly decode unicode on ltr model upload (duration: 06m 04s)
* 21:52 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@0ad64e7]: properly decode unicode on ltr model upload
* 18:44 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikimediaEditorTasks: Fix: Get RevisionRecord directly from the Revision in onPageContentSaveComplete ([[phab:T241014|T241014]]) (duration: 00m 54s)
* 17:48 ejegg: updated fundraising internal dashboard from {{Gerrit|b75f9074de}} to {{Gerrit|852f6871cf}}
* 17:29 bd808: Getting cloudweb2001-dev caught up with the MW train via a manual `scap pull` ([[phab:T241251|T241251]])
* 17:19 volans: re-generating sre-bot ro-token for Netbox; it was accidentally leaked to gerrit and revoked within a minute of the leak
* 16:28 cdanis: ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕦☕ homer 'cr*' commit 'templatize BGP sessions with pmacct netflow collector {{Gerrit|cb096f509}} {{Gerrit|0f56b2233}} 2e050ad33'
* 16:27 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|Ia9190a4e5}}, [[phab:T240691|T240691]]: Disable wgExtractsExtendOpenSearchXml (duration: 00m 55s)
* 14:38 volans: temporarily disable puppet on netbox[12]001 to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/555715 - [[phab:T233183|T233183]]
* 14:14 cdanis: homer 'cr*esams*' commit 'I022c62120 enable netflow collection in esams'
* 13:16 bblack: [correction] esams+eqsin edges switching back to digicert-2019a unified TLS cert
* 13:15 bblack: esams+eqiad edges switching back to digicert-2019a unified TLS cert
* 12:43 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:43 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 12:43 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 12:00 moritzm: installing glib2.0 security updates on stretch
* 09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust main traffic weight for db1096:3316 and db1098:3316', diff saved to https://phabricator.wikimedia.org/P9998 and previous config saved to /var/cache/conftool/dbconfig/20191220-094016-marostegui.json
* 09:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:32 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:32 moritzm: applied Ganeti cluster setting to pass through CPU flags for MDS/SSBD to esams/ulsfo clusters [[phab:T226444|T226444]] [[phab:T236216|T236216]]
* 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9997 and previous config saved to /var/cache/conftool/dbconfig/20191220-092805-marostegui.json
* 09:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:26 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9996 and previous config saved to /var/cache/conftool/dbconfig/20191220-091934-marostegui.json
* 09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9995 and previous config saved to /var/cache/conftool/dbconfig/20191220-091050-marostegui.json
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1096:3315 db1096:3316', diff saved to https://phabricator.wikimedia.org/P9994 and previous config saved to /var/cache/conftool/dbconfig/20191220-090204-marostegui.json
* 08:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315, db1096:3316 for upgrade', diff saved to https://phabricator.wikimedia.org/P9993 and previous config saved to /var/cache/conftool/dbconfig/20191220-085300-marostegui.json
* 08:51 addshore: addshore@graphite1004&2003:~$ sudo -u _graphite find /var/lib/carbon/whisper/daily/wikidata/api/actions -delete # [[phab:T227594|T227594]]
* 08:45 moritzm: failover ganeti master on the new esams cluster
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9992 and previous config saved to /var/cache/conftool/dbconfig/20191220-080400-marostegui.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9991 and previous config saved to /var/cache/conftool/dbconfig/20191220-075434-marostegui.json
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9990 and previous config saved to /var/cache/conftool/dbconfig/20191220-074504-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1103:3312, db1103:3314', diff saved to https://phabricator.wikimedia.org/P9989 and previous config saved to /var/cache/conftool/dbconfig/20191220-073017-marostegui.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312, db1103:3314 for upgrade', diff saved to https://phabricator.wikimedia.org/P9988 and previous config saved to /var/cache/conftool/dbconfig/20191220-065753-marostegui.json
* 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1089', diff saved to https://phabricator.wikimedia.org/P9987 and previous config saved to /var/cache/conftool/dbconfig/20191220-065628-marostegui.json
* 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P9986 and previous config saved to /var/cache/conftool/dbconfig/20191220-065022-marostegui.json
* 06:41 mutante: netflow3001 - rebooting
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P9985 and previous config saved to /var/cache/conftool/dbconfig/20191220-064129-marostegui.json
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089', diff saved to https://phabricator.wikimedia.org/P9984 and previous config saved to /var/cache/conftool/dbconfig/20191220-063248-marostegui.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for upgrade', diff saved to https://phabricator.wikimedia.org/P9983 and previous config saved to /var/cache/conftool/dbconfig/20191220-062530-marostegui.json
* 06:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1009 after upgrade (duration: 00m 54s)
* 06:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1009 for upgrade (duration: 00m 55s)
* 04:55 mutante: phab1003 - rm /etc/ssh/sshd_config.phabricator ; kill 26085 (secondary sshd for phab) ; systemctl start sshd (fixes regular sshd) ([[phab:T238957|T238957]])
* 01:26 eileen: civicrm revision changed from {{Gerrit|6ecdccd240}} to {{Gerrit|f6f4aa1d86}}, config revision is {{Gerrit|2e9bf6308b}}
* 00:52 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MachineVision: Fix: Allow a single period in $basePath in maintenance scripts (duration: 00m 54s)
* 00:46 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MachineVision: Fix: Ignore duplicate key errors when inserting data from annotation jobs (duration: 00m 53s)
* 00:05 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikimediaEditorTasks: Fix: Pass a RevisionRecord to Counter::onRevert from onArticleRollbackComplete ([[phab:T241013|T241013]]) (duration: 00m 54s)
== 2019-12-19 ==
* 23:54 volker-e@deploy1001: Finished deploy [design/style-guide@e9bf493]: Deploy design/style-guide:  (duration: 00m 09s)
* 23:53 volker-e@deploy1001: Started deploy [design/style-guide@e9bf493]: Deploy design/style-guide:
* 22:54 sbassett@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MobileFrontend/extension.json: Deploy security patch for [[phab:T240502|T240502]] (pushed through gerrit) (duration: 00m 55s)
* 21:49 eileen: civicrm revision changed from {{Gerrit|6062da3ab5}} to {{Gerrit|6ecdccd240}}, config revision is {{Gerrit|2e9bf6308b}}
* 19:06 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@99a25c0]: glent: Remove unused esbulk cli parameters (duration: 01m 20s)
* 19:04 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@99a25c0]: glent: Remove unused esbulk cli parameters
* 18:26 mforns@deploy1001: Finished deploy [analytics/refinery@e7200d2] (thin): deploying analytics-refinery together with refinery-source v0.0.109 (thin) (duration: 00m 06s)
* 18:26 mforns@deploy1001: Started deploy [analytics/refinery@e7200d2] (thin): deploying analytics-refinery together with refinery-source v0.0.109 (thin)
* 18:24 mforns@deploy1001: Finished deploy [analytics/refinery@e7200d2]: deploying analytics-refinery together with refinery-source v0.0.109 (duration: 07m 58s)
* 18:16 mforns@deploy1001: Started deploy [analytics/refinery@e7200d2]: deploying analytics-refinery together with refinery-source v0.0.109
* 18:03 James_F: Running `foreachwiki maintenance/deleteTag.php --batch-size 500 HHVM` on mwmaint1002 for [[phab:T75181|T75181]]
* 18:02 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Remove new upload labeling job delay (duration: 00m 57s)
* 17:38 James_F: Running `mwscript deleteTag.php --wiki=testwiki --batch-size 100 HHVM` for [[phab:T75181|T75181]] final testing
* 16:57 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=10 # [[phab:T237984|T237984]], Full pass (33 rows missing currently)
* 16:54 addshore@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseTermIdsCleaner.php: [[phab:T237984|T237984]] [[gerrit:559529]] Fix incorrect deletion of rows in DatabaseTermIdsCleaner (duration: 00m 56s)
* 16:22 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=2 --from-id=36185524 # [[phab:T237984|T237984]] (For `P4155`, then will stop)
* 16:21 addshore: Processed up to page 36567013 (P4152)
* 16:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:10 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:53 XioNoX: enable netflow sampling in ulsfo
* 15:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:34 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:12 ejegg: updated fundraising internal dashboard from {{Gerrit|913d690621}} to {{Gerrit|b75f9074de}}
* 15:10 ejegg: updated payments-wiki from {{Gerrit|7131303dba}} to {{Gerrit|827e3235dc}}
* 14:50 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Mark REL1_34 stable in ExtensionDistributor (duration: 00m 53s)
* 14:47 _joe_: restart pybal on the primary low-traffic balancers in eqiad, codfw
* 14:41 ema: cp1075, cp4028: ats-backend-restart to disable xdebug plugin [[phab:T241001|T241001]]
* 14:40 _joe_: restarting pybal on the backup low-traffic balancers in eqiad, codfw
* 14:12 phamhi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:10 phamhi@cumin1001: START - Cookbook sre.hosts.downtime
* 13:48 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:45 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:41 phamhi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:41 phamhi@cumin1001: START - Cookbook sre.hosts.downtime
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1136', diff saved to https://phabricator.wikimedia.org/P9979 and previous config saved to /var/cache/conftool/dbconfig/20191219-133525-marostegui.json
* 13:34 ema: pool cp1089 with ATS backend [[phab:T227432|T227432]]
* 13:33 ema: pool cp2023 with ATS backend [[phab:T227432|T227432]]
* 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136', diff saved to https://phabricator.wikimedia.org/P9978 and previous config saved to /var/cache/conftool/dbconfig/20191219-131832-marostegui.json
* 13:17 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:15 James_F: mwscript emptyUserGroup.php --wiki=mediawikiwiki oauthadmin [[phab:T241142|T241142]]
* 13:15 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:12 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 13:12 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136', diff saved to https://phabricator.wikimedia.org/P9977 and previous config saved to /var/cache/conftool/dbconfig/20191219-130832-marostegui.json
* 13:02 jforrester@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.11
* 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136', diff saved to https://phabricator.wikimedia.org/P9976 and previous config saved to /var/cache/conftool/dbconfig/20191219-125748-marostegui.json
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1136, s7 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9975 and previous config saved to /var/cache/conftool/dbconfig/20191219-125314-marostegui.json
* 12:52 moritzm: failover ganeti master in ulsfo to ganeti4002 for a test
* 12:52 jforrester@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikiLove/includes/ApiWikiLove.php: [[phab:T241094|T241094]] ApiWikiLove: Don't pass null to implode(), but fall back to [] (duration: 01m 02s)
* 12:52 ema: depool cp2023 and cp1089 for ATS reimages [[phab:T227432|T227432]]. Reimaged together because of [[phab:T238817|T238817]]
* 12:37 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Cleanup: Remove CLI 'display_errors=stderr' setting (duration: 01m 01s)
* 12:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:35 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:32 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Cleanup: Remove very old 'error_append_string' INI override (duration: 01m 02s)
* 12:30 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Cleanup: Move core DB/SQL-related config closer together (duration: 01m 02s)
* 12:03 moritzm: installing netflow4001
* 11:43 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=5 --from-id=4089887 # [[phab:T237984|T237984]] - Will stop after P433
* 11:28 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:26 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:05 moritzm: removing kubestagetcd1001-1003 from debmonitor [[phab:T224568|T224568]]
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1104', diff saved to https://phabricator.wikimedia.org/P9974 and previous config saved to /var/cache/conftool/dbconfig/20191219-110404-marostegui.json
* 10:50 ema: pool cp3064 with ATS backend [[phab:T227432|T227432]]
* 10:40 effie: pool mw1320
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P9973 and previous config saved to /var/cache/conftool/dbconfig/20191219-102316-marostegui.json
* 10:21 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:18 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:10 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P9972 and previous config saved to /var/cache/conftool/dbconfig/20191219-101024-marostegui.json
* 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1093', diff saved to https://phabricator.wikimedia.org/P9971 and previous config saved to /var/cache/conftool/dbconfig/20191219-100938-marostegui.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1104', diff saved to https://phabricator.wikimedia.org/P9970 and previous config saved to /var/cache/conftool/dbconfig/20191219-095959-marostegui.json
* 09:56 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P9969 and previous config saved to /var/cache/conftool/dbconfig/20191219-095559-marostegui.json
* 09:53 effie: depool mw1320
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104, s8 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9968 and previous config saved to /var/cache/conftool/dbconfig/20191219-095158-marostegui.json
* 09:50 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P9967 and previous config saved to /var/cache/conftool/dbconfig/20191219-094945-marostegui.json
* 09:49 jmm@deploy1001: Finished deploy [debmonitor/deploy@c056c3c]: debmonitor release v0.2.1 - [[phab:T237978|T237978]] (duration: 02m 24s)
* 09:48 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 09:47 jmm@deploy1001: Started deploy [debmonitor/deploy@c056c3c]: debmonitor release v0.2.1 - [[phab:T237978|T237978]]
* 09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2084:3314, db2084:3315 [[phab:T241103|T241103]]', diff saved to https://phabricator.wikimedia.org/P9966 and previous config saved to /var/cache/conftool/dbconfig/20191219-094135-marostegui.json
* 09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1093', diff saved to https://phabricator.wikimedia.org/P9965 and previous config saved to /var/cache/conftool/dbconfig/20191219-093959-marostegui.json
* 09:39 ema: pool cp1087 with ATS backend [[phab:T227432|T227432]]
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1093, s6 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9964 and previous config saved to /var/cache/conftool/dbconfig/20191219-093116-marostegui.json
* 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1081', diff saved to https://phabricator.wikimedia.org/P9963 and previous config saved to /var/cache/conftool/dbconfig/20191219-092920-marostegui.json
* 09:23 ema: depool cp3064 and reimage as text_ats [[phab:T227432|T227432]]
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P9962 and previous config saved to /var/cache/conftool/dbconfig/20191219-092257-marostegui.json
* 09:15 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:15 onimisionipe: running maps osm-replicate process manually on maps1004 - [[phab:T239728|T239728]]
* 09:13 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P9961 and previous config saved to /var/cache/conftool/dbconfig/20191219-091205-marostegui.json
* 09:06 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1081', diff saved to https://phabricator.wikimedia.org/P9960 and previous config saved to /var/cache/conftool/dbconfig/20191219-090455-marostegui.json
* 09:03 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:55 ema: depool cp1087 and reimage as text_ats [[phab:T227432|T227432]]
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081, s4 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9959 and previous config saved to /var/cache/conftool/dbconfig/20191219-084544-marostegui.json
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084 - depooled by mistake', diff saved to https://phabricator.wikimedia.org/P9958 and previous config saved to /var/cache/conftool/dbconfig/20191219-084518-marostegui.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081, s4 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9957 and previous config saved to /var/cache/conftool/dbconfig/20191219-084346-marostegui.json
* 08:31 ema: pool cp1085 with ATS backend [[phab:T227432|T227432]]
* 08:27 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=5 --from-id=30398836 # [[phab:T237984|T237984]]
* 08:21 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:18 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:17 marostegui: Restart mysql on labsdb1010 (after depooling it)
* 08:11 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:09 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1014', diff saved to https://phabricator.wikimedia.org/P9956 and previous config saved to /var/cache/conftool/dbconfig/20191219-080519-marostegui.json
* 07:53 ema: depool cp1085 and reimage as text_ats [[phab:T227432|T227432]]
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1014', diff saved to https://phabricator.wikimedia.org/P9955 and previous config saved to /var/cache/conftool/dbconfig/20191219-074840-marostegui.json
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1076', diff saved to https://phabricator.wikimedia.org/P9954 and previous config saved to /var/cache/conftool/dbconfig/20191219-074800-marostegui.json
* 07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1014', diff saved to https://phabricator.wikimedia.org/P9953 and previous config saved to /var/cache/conftool/dbconfig/20191219-074122-marostegui.json
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P9952 and previous config saved to /var/cache/conftool/dbconfig/20191219-073907-marostegui.json
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1014', diff saved to https://phabricator.wikimedia.org/P9951 and previous config saved to /var/cache/conftool/dbconfig/20191219-073430-marostegui.json
* 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P9950 and previous config saved to /var/cache/conftool/dbconfig/20191219-073151-marostegui.json
* 07:30 ema: cp: rolling ats-backend-restart to disable compress plugin everywhere [[phab:T238494|T238494]]
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1014 for upgrade', diff saved to https://phabricator.wikimedia.org/P9949 and previous config saved to /var/cache/conftool/dbconfig/20191219-072728-marostegui.json
* 07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1076', diff saved to https://phabricator.wikimedia.org/P9948 and previous config saved to /var/cache/conftool/dbconfig/20191219-072413-marostegui.json
* 07:19 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:17 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076, s2 candidate master, for upgrade', diff saved to https://phabricator.wikimedia.org/P9947 and previous config saved to /var/cache/conftool/dbconfig/20191219-071514-marostegui.json
* 07:10 marostegui: Upgrade db1115 (this will make dbtree fail for a few minutes)
* 06:52 XioNoX: cr2-eqdfw:delete chassis alarm management-ethernet - [[phab:T241105|T241105]]
* 06:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1008 after upgrade (duration: 01m 02s)
* 06:37 marostegui: Upgrade pc1008
* 06:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1008 for upgrade (duration: 01m 03s)
* 06:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:13 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 06:13 marostegui: Upgrade db2122, db2084
* 05:09 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:07 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 05:02 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:00 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:59 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 04:58 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 04:29 kart_: Update cxserver to 2019-12-11-144337-production ([[phab:T233405|T233405]], [[phab:T238118|T238118]])
* 04:25 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 04:21 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 04:19 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 04:12 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 04:10 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 04:03 mutante: LDAP - added mstyles to archiva-deployers ([[phab:T240865|T240865]])
* 03:48 volker-e@deploy1001: Finished deploy [design/style-guide@5cecb37]: Deploy design/style-guide:  (duration: 00m 07s)
* 03:48 volker-e@deploy1001: Started deploy [design/style-guide@5cecb37]: Deploy design/style-guide:
* 01:58 eileen: - civicrm revision changed from {{Gerrit|93037d6e35}} to {{Gerrit|6062da3ab5}}, config revision is {{Gerrit|2e9bf6308b}} mostly eoy_summary stuff
* 01:57 twentyafterfour: phabricator update completed
* 01:44 twentyafterfour: deploying phabricator update (tagged release/2019-12-19/1)
* 00:36 twentyafterfour@deploy1001: Synchronized php-1.35.0-wmf.11/skins/MinervaNeue/resources/skins.minerva.scripts/menu/MainMenu.js: sync https://gerrit.wikimedia.org/r/c/mediawiki/skins/MinervaNeue/+/559226 for SWAT (duration: 01m 02s)
== 2019-12-18 ==
* 23:44 krinkle@deploy1001: Synchronized php-1.35.0-wmf.11/includes/libs/objectcache/MemcachedPeclBagOStuff.php: {{Gerrit|Iacbc9ebda681}} (duration: 01m 01s)
* 23:27 krinkle@deploy1001: Synchronized php-1.35.0-wmf.11/includes/resourceloader/ResourceLoaderFileModule.php: {{Gerrit|I3fe9f0a9ddc}} (duration: 01m 02s)
* 23:17 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: {{Gerrit|If465c0ef}} cleanup (duration: 01m 01s)
* 21:33 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@9dd8227]: Update mobileapps to {{Gerrit|cf2bb3b}} (duration: 05m 51s)
* 21:27 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@9dd8227]: Update mobileapps to {{Gerrit|cf2bb3b}}
* 21:22 halfak@deploy1001: Finished deploy [ores/deploy@80b1e62]: [[phab:T240725|T240725]] (duration: 22m 05s)
* 21:01 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@7e68510]: Return low_traffic_jobs concurrency to normal after [[phab:T240518|T240518]] (duration: 01m 03s)
* 21:00 halfak@deploy1001: Started deploy [ores/deploy@80b1e62]: [[phab:T240725|T240725]]
* 21:00 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@7e68510]: Return low_traffic_jobs concurrency to normal after [[phab:T240518|T240518]]
* 20:50 ppchelko@deploy1001: Finished deploy [restbase/deploy@6e24349]: Disable all parsoid-php vs parsoid-js special cases [[phab:T229015|T229015]] (duration: 13m 56s)
* 20:36 ppchelko@deploy1001: Started deploy [restbase/deploy@6e24349]: Disable all parsoid-php vs parsoid-js special cases [[phab:T229015|T229015]]
* 20:04 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable banner on Special:Block for selected wikis - [[phab:T240300|T240300]] (duration: 01m 01s)
* 19:59 niharika29@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/WikimediaMessages/includes/WikimediaMessagesHooks.php: Remove messagebox class from partial block banner - [[phab:T240300|T240300]] (duration: 01m 02s)
* 19:47 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@4b174fd]: glent: Explicitly pass previous partition to m2run (duration: 00m 10s)
* 19:47 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@4b174fd]: glent: Explicitly pass previous partition to m2run
* 19:28 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump Parsoid/PHP cluster memory_limit - [[phab:T239806|T239806]], [[phab:T236833|T236833]] (duration: 01m 01s)
* 18:58 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
* 18:56 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
* 18:51 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@20e4c16]: Update glent to spark 2.4.4 (duration: 00m 29s)
* 18:51 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@20e4c16]: Update glent to spark 2.4.4
* 18:18 jynus: disable puppet on backup1001, dbprov1001 to test special backup recovery
* 18:15 cdanis@deploy1001: Synchronized wmf-config/db-eqiad.php: remove dbctl-obsoleted hostsByName entries 🔧 {{Gerrit|7d20965f5}} [[phab:T240991|T240991]] [[phab:T229676|T229676]] (duration: 01m 01s)
* 18:14 cdanis@deploy1001: Synchronized wmf-config/db-codfw.php: remove dbctl-obsoleted hostsByName entries 🔧 {{Gerrit|7d20965f5}} [[phab:T240991|T240991]] [[phab:T229676|T229676]] (duration: 01m 01s)
* 18:00 cdanis@deploy1001: Synchronized wmf-config/etcd.php: use hostsByName from etcd {{Gerrit|96df9c004}} [[phab:T229676|T229676]] [[phab:T240991|T240991]] (duration: 01m 01s)
* 17:59 cdanis@deploy1001: Synchronized wmf-config/CommonSettings.php: use hostsByName from etcd {{Gerrit|96df9c004}} [[phab:T229676|T229676]] [[phab:T240991|T240991]] (duration: 01m 01s)
* 17:52 jynus: reload dbproxy1017 dbproxy1021
* 17:34 jforrester@deploy1001: Finished scap: Full scap for extra AHT i18n (duration: 14m 09s)
* 17:34 ppchelko@deploy1001: Finished deploy [restbase/deploy@c9d8ef1]: Parsoid-PHP: mirror 100% of all traffic [[phab:T229015|T229015]] (duration: 16m 00s)
* 17:20 jforrester@deploy1001: Started scap: Full scap for extra AHT i18n
* 17:18 ppchelko@deploy1001: Started deploy [restbase/deploy@c9d8ef1]: Parsoid-PHP: mirror 100% of all traffic [[phab:T229015|T229015]]
* 16:30 akosiaris@cumin1001: conftool action : set/weight=10; selector: dc=codfw,service=echostore
* 16:29 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
* 16:25 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 16:21 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 16:21 akosiaris: remove kubestagetcd100<nowiki>{</nowiki>1,2,3<nowiki>}</nowiki> from the fleet [[phab:T239835|T239835]]
* 16:19 akosiaris@cumin1001: conftool action : set/weight=10; selector: name=kubernetes2002.codfw.wmnet,service=echostore
* 16:18 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 15:58 jforrester@deploy1001: rebuilt and synchronized wikiversions files: train: Rolling Wikitech forward to 1.35.0-wmf.11 [[phab:T233859|T233859]] [[phab:T241059|T241059]]
* 15:55 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=1 --from-id=14546856 # [[phab:T237984|T237984]] (will run to completion)
* 15:42 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:34 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:31 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:21 bblack: cr[23]-esams: ns2 authdns static routing: route to both of dns300[12] w/ ECMP (was just dns3001)
* 15:07 akosiaris: repool all codfw k8s services. [[phab:T239835|T239835]]
* 15:06 akosiaris@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=(eventgate.*{{!}}mathoid{{!}}citoid{{!}}restrouter{{!}}sessionstore{{!}}echostore{{!}}zotero{{!}}termbox{{!}}wikifeeds{{!}}cxserver{{!}}blubberoid)
* 14:49 jforrester@deploy1001: rebuilt and synchronized wikiversions files: train: Rolling Commons forward to 1.35.0-wmf.11 [[phab:T233859|T233859]] [[phab:T241057|T241057]]
* 14:45 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Wikibase: [[gerrit:559064{{!}}Fix DatabaseEntityInfoBuilder on federated repos (T241057)]] (duration: 01m 07s)
* 14:38 jbond42: restart apache on phab
* 14:37 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 14:37 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 14:28 moritzm: removing pollux from Ganeti (obsoleted by ldap-corp2001 for a while now)
* 14:06 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 14:06 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:05 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 14:05 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'sessionstore' for release 'production' .
* 14:04 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'restrouter' for release 'production' .
* 14:02 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 14:02 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 14:02 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-main' for release 'main' .
* 14:01 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:00 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'eventgate-analytics' for release 'analytics' .
* 13:59 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'echostore' for release 'production' .
* 13:58 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 13:58 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 13:57 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 13:38 moritzm: installing dbus security updates for stretch
* 13:30 jforrester@deploy1001: rebuilt and synchronized wikiversions files: train: Rolling Wikitech back to 1.35.0-wmf.10 [[phab:T233859|T233859]]
* 13:11 jforrester@deploy1001: rebuilt and synchronized wikiversions files: train: Rolling Commons back to 1.35.0-wmf.10 [[phab:T233859|T233859]]
* 13:04 jforrester@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.11 (duration: 01m 01s)
* 13:03 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.11
* 13:02 moritzm: installing ruby2.5 security updates
* 12:56 jbond42: enable puppet fleet wide
* 12:52 jbond42: disable puppet fleet wide to restart apache on puppetmasters
* 12:52 akosiaris: pool wikifeeds eqiad. For some reason it was depooled
* 12:50 akosiaris@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=(wikifeeds)
* 12:48 wmde-fisch@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Popups: SWAT: [[gerrit:559010{{!}}Fix initial preferences for newly created user accounts (T240947)]] (duration: 01m 02s)
* 12:35 wmde-fisch@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Popups: SWAT: [[gerrit:559010{{!}}Fix initial preferences for newly created user accounts (T240947)]] (duration: 01m 03s)
* 12:21 ladsgroup@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:558239{{!}}Add a bit for forcing LC caching backend in cli mode (T105683)]] (duration: 01m 03s)
* 11:28 moritzm: installing ruby2.3 security updates
* 11:27 jbond42: installing apache update on bastion servers
* 11:26 moritzm: installing spamassassin security updates on fermium/lists
* 11:16 marostegui: Upgrade db2081, db2082
* 11:14 moritzm: installing spamassassin security updates on mendelevium/OTRS
* 10:47 jforrester@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/VisualEditor/includes/ApiVisualEditor.php: [[phab:T240961|T240961]]: Fix unchecked array access in ApiVisualEditor (duration: 01m 02s)
* 10:16 akosiaris: depooling (eventgate.*{{!}}mathoid{{!}}citoid{{!}}restrouter{{!}}sessionstore{{!}}echostore{{!}}zotero{{!}}termbox{{!}}wikifeeds{{!}}cxserver{{!}}blubberoid) from codfw kubernetes
* 10:16 akosiaris@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=(eventgate.*{{!}}mathoid{{!}}citoid{{!}}restrouter{{!}}sessionstore{{!}}echostore{{!}}zotero{{!}}termbox{{!}}wikifeeds{{!}}cxserver{{!}}blubberoid)
* 10:10 marostegui: Upgrade db2083
* 10:09 jforrester@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/GrowthExperiments/includes: [[phab:T240444|T240444]] Make PageViewInfo a soft dependency (duration: 01m 04s)
* 10:06 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 10:06 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:05 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:05 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:05 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:05 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:05 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 10:05 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:04 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:04 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:04 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:04 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime
* 10:00 akosiaris: populate new calico stores for codfw [[phab:T239835|T239835]]
* 09:57 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1011', diff saved to https://phabricator.wikimedia.org/P9937 and previous config saved to /var/cache/conftool/dbconfig/20191218-095710-marostegui.json
* 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1011', diff saved to https://phabricator.wikimedia.org/P9936 and previous config saved to /var/cache/conftool/dbconfig/20191218-094540-marostegui.json
* 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1011', diff saved to https://phabricator.wikimedia.org/P9935 and previous config saved to /var/cache/conftool/dbconfig/20191218-093720-marostegui.json
* 09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1011', diff saved to https://phabricator.wikimedia.org/P9934 and previous config saved to /var/cache/conftool/dbconfig/20191218-092625-marostegui.json
* 09:24 elukey: execute 'megacli -LDSetProp WT -LAll -aAll' on analytics1057 - [[phab:T239045|T239045]]
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1011 for upgrade', diff saved to https://phabricator.wikimedia.org/P9933 and previous config saved to /var/cache/conftool/dbconfig/20191218-092228-marostegui.json
* 09:18 ema: repool cp3050 after ats-be restart
* 08:59 vgutierrez: restarting ats-be on cp3050
* 08:58 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool es1016', diff saved to https://phabricator.wikimedia.org/P9931 and previous config saved to /var/cache/conftool/dbconfig/20191218-082226-marostegui.json
* 08:21 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:20 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1016', diff saved to https://phabricator.wikimedia.org/P9930 and previous config saved to /var/cache/conftool/dbconfig/20191218-081256-marostegui.json
* 08:12 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:12 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:05 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:04 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:01 marostegui: Upgrade db2109
* 08:01 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 07:59 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1016', diff saved to https://phabricator.wikimedia.org/P9929 and previous config saved to /var/cache/conftool/dbconfig/20191218-075919-marostegui.json
* 07:59 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 07:58 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 07:58 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1105:3311, db1105:3312', diff saved to https://phabricator.wikimedia.org/P9928 and previous config saved to /var/cache/conftool/dbconfig/20191218-075828-marostegui.json
* 07:58 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'sessionstore' for release 'staging' .
* 07:57 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'restrouter' for release 'staging' .
* 07:57 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 07:57 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 07:56 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-main' for release 'main' .
* 07:55 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 07:54 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-analytics' for release 'analytics' .
* 07:54 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'echostore' for release 'staging' .
* 07:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 07:53 akosiaris: run helmfile sync for all staging deployments [[phab:T239835|T239835]]
* 07:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 07:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311, db1105:3312', diff saved to https://phabricator.wikimedia.org/P9927 and previous config saved to /var/cache/conftool/dbconfig/20191218-074642-marostegui.json
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool es1016', diff saved to https://phabricator.wikimedia.org/P9926 and previous config saved to /var/cache/conftool/dbconfig/20191218-074032-marostegui.json
* 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1016 for upgrade', diff saved to https://phabricator.wikimedia.org/P9925 and previous config saved to /var/cache/conftool/dbconfig/20191218-073002-marostegui.json
* 07:18 andrew@deploy1001: Finished deploy [horizon/deploy@f77e91b]: Fix for [[phab:T240979|T240979]] (duration: 03m 24s)
* 07:14 andrew@deploy1001: Started deploy [horizon/deploy@f77e91b]: Fix for [[phab:T240979|T240979]]
* 06:53 marostegui: Upgrade db2132, db2133, db2134
* 06:51 marostegui: Upgrade db2135
* 06:48 volker-e@deploy1001: Finished deploy [design/style-guide@d13b55d]: Deploy design/style-guide:  (duration: 00m 07s)
* 06:48 volker-e@deploy1001: Started deploy [design/style-guide@d13b55d]: Deploy design/style-guide:
* 06:45 onimisionipe: running replicate-osm on maps1004 after failed osm sync - [[phab:T239728|T239728]]
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311, db1105:3312', diff saved to https://phabricator.wikimedia.org/P9924 and previous config saved to /var/cache/conftool/dbconfig/20191218-064510-marostegui.json
* 06:38 moritzm: upgrading debmonitor-client to 0.2.0 fleet-wide
* 06:37 moritzm: upgrading debdeploy-client to 0.2.0 fleet-wide
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1105:3311, db1105:3312', diff saved to https://phabricator.wikimedia.org/P9923 and previous config saved to /var/cache/conftool/dbconfig/20191218-063652-marostegui.json
* 06:30 marostegui: Upgrade db1105
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311, db1105:3312', diff saved to https://phabricator.wikimedia.org/P9922 and previous config saved to /var/cache/conftool/dbconfig/20191218-062759-marostegui.json
* 06:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1007 after upgrade (duration: 01m 00s)
* 06:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1007 for upgrade (duration: 01m 11s)
* 06:03 marostegui: Upgrade db2112 db2116 db2130
* 05:59 marostegui: Upgrade db2088, db2092
* 05:55 marostegui: Upgrade db2071 and db2072
* 05:31 marostegui: Deploy schema change on commonswiki.image on s4 primary master (db1138) - [[phab:T233135|T233135]]
* 04:59 XioNoX: advertise 185.15.57.0/24 from [co{{!}}eq]dfw - [[phab:T239347|T239347]]
* 04:54 XioNoX: add static routes for cloud's 185.15.57.0/29 on cr1/2-codfw - [[phab:T239347|T239347]]
* 02:14 eileen: civicrm revision changed from {{Gerrit|b2d0b5d66d}} to {{Gerrit|93037d6e35}}, config revision is {{Gerrit|2e9bf6308b}}
* 00:22 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable Glent M0 for dewiki, enwiki and frwiki - [[phab:T237365|T237365]] (duration: 01m 02s)
* 00:15 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable Glent M0 A/B test - [[phab:T237363|T237363]] (duration: 01m 02s)
* 00:09 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable  on test wikis - [[phab:T240736|T240736]] (duration: 01m 02s)
== 2019-12-17 ==
* 22:55 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@0dd9f6b]: Ship kerberos configuration for oozie jobs (duration: 15m 50s)
* 22:39 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@0dd9f6b]: Ship kerberos configuration for oozie jobs
* 22:19 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 22:18 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 22:10 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@100bf96]: Ship kerberos configuration for oozie jobs (duration: 00m 34s)
* 22:09 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@100bf96]: Ship kerberos configuration for oozie jobs
* 21:52 cdanis: ✔️ cdanis@cp3050.esams.wmnet ~ 🕔🍵 sudo depool
* 21:04 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Add more info to MachineVisionEntitySaveException message, take 2 (duration: 01m 04s)
* 21:02 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MachineVision: Add more info to MachineVisionEntitySaveException message, take 2 (duration: 01m 04s)
* 19:49 hashar: Restarting CI Jenkins for plugins upgrades
* 19:41 mholloway-shell@deploy1001: Synchronized wmf-config/CommonSettings.php: Disable MachineVision email notifications ([[phab:T240878|T240878]]) (duration: 01m 07s)
* 19:00 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Add more info to MachineVisionEntitySaveException message (duration: 01m 08s)
* 18:58 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/MachineVision: Add more info to MachineVisionEntitySaveException message (duration: 01m 07s)
* 18:12 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@c30d801]: Update mobileapps to {{Gerrit|5551575}} (duration: 06m 38s)
* 18:06 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@c30d801]: Update mobileapps to {{Gerrit|5551575}}
* 17:06 moritzm: uploading debmonitor-client 0.2.0 to apt.wikimedia.org (jessie/stretch/buster) [[phab:T237978|T237978]]
* 16:49 marostegui: Deploy schema change on commonswiki.logging on db1138 (s4 primary master) - [[phab:T233135|T233135]]
* 16:14 jforrester@deploy1001: Synchronized php-1.35.0-wmf.10/includes/specials/pagers/NewPagesPager.php: [[phab:T240924|T240924]] NewPagesPager: Fix namespace query conditions (duration: 01m 03s)
* 16:07 jforrester@deploy1001: Synchronized php-1.35.0-wmf.11/includes/specials/pagers/NewPagesPager.php: [[phab:T240924|T240924]] NewPagesPager: Fix namespace query conditions (duration: 01m 02s)
* 15:57 jmm@deploy1001: Finished deploy [debmonitor/deploy@bb99a23]: Debmonitor release v0.2.0 - [[phab:T237978|T237978]] (duration: 06m 25s)
* 15:52 ema: text@ulsfo rolling ats-backend-restart to disable compress plugin https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/558089/ [[phab:T238495|T238495]]
* 15:50 jmm@deploy1001: Started deploy [debmonitor/deploy@bb99a23]: Debmonitor release v0.2.0 - [[phab:T237978|T237978]]
* 15:24 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 15:17 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=10 --from-id=3929876 # [[phab:T237984|T237984]] (I stopped it at Processed up to page {{Gerrit|14546856}} (P501))
* 15:14 onimisionipe: pool maps2003 after postgres init - [[phab:T239728|T239728]]
* 15:13 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=10 --from-id=3929876 # [[phab:T237984|T237984]]
* 14:46 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 14:45 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
* 14:45 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 14:42 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 14:42 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
* 14:42 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 14:40 cdanis: tearing down tmux sessions generating nic_saturation kludge on some memcached hosts
* 14:36 gehel: restarting elastic1054 for config change (new master)
* 14:15 ema: cp: rolling ats-tls-restart to clear issues caused by [[phab:T240950|T240950]]
* 14:10 marostegui: Upgrade db2091
* 14:05 andrew@deploy1001: Finished deploy [horizon/deploy@b95d700]: Deploying a fix to the puppet prefix tab (duration: 03m 26s)
* 14:02 andrew@deploy1001: Started deploy [horizon/deploy@b95d700]: Deploying a fix to the puppet prefix tab
* 13:53 gehel@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 13:51 gehel@cumin1001: START - Cookbook sre.hosts.decommission
* 13:49 gehel@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 13:48 gehel@cumin1001: START - Cookbook sre.hosts.decommission
* 13:39 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.11 [[phab:T233859|T233859]]
* 13:34 jforrester@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.11 and rebuild l10n cache [[phab:T233859|T233859]] (duration: 36m 46s)
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1091', diff saved to https://phabricator.wikimedia.org/P9915 and previous config saved to /var/cache/conftool/dbconfig/20191217-132554-marostegui.json
* 13:02 moritzm: installing intel-microcode updates to 3.20191115.1 on buster/stretch (with one Xeon type rolled back due to a regression causing failing reboots, see DSA 4562-2)
* 13:01 ayounsi@deploy1001: Finished deploy [homer/deploy@359de04]: Homer release v0.1.1 - [[phab:T228388|T228388]] (duration: 00m 30s)
* 13:01 ayounsi@deploy1001: Started deploy [homer/deploy@359de04]: Homer release v0.1.1 - [[phab:T228388|T228388]]
* 12:58 jforrester@deploy1001: Started scap: testwiki to php-1.35.0-wmf.11 and rebuild l10n cache [[phab:T233859|T233859]]
* 12:50 jforrester@deploy1001: Pruned MediaWiki: 1.35.0-wmf.8 (duration: 09m 25s)
* 12:41 James_F: Running scap clean for wmf.8 [[phab:T233859|T233859]]
* 11:37 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 11:37 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
* 11:37 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 11:34 addshore: finished my last maint script run early after https://phabricator.wikimedia.org/P9914
* 11:27 addshore: addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size=1 # [[phab:T237984|T237984]]
* 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1091', diff saved to https://phabricator.wikimedia.org/P9913 and previous config saved to /var/cache/conftool/dbconfig/20191217-112134-marostegui.json
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1091', diff saved to https://phabricator.wikimedia.org/P9912 and previous config saved to /var/cache/conftool/dbconfig/20191217-111400-marostegui.json
* 11:06 dcausse@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: (no justification provided) (duration: 00m 25s)
* 11:06 dcausse@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: (no justification provided)
* 11:05 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T234318|T234318]] ContentTranslation: Make available by default for new Wikipedias (duration: 00m 58s)
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1134', diff saved to https://phabricator.wikimedia.org/P9911 and previous config saved to /var/cache/conftool/dbconfig/20191217-110322-marostegui.json
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P9910 and previous config saved to /var/cache/conftool/dbconfig/20191217-105425-marostegui.json
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P9908 and previous config saved to /var/cache/conftool/dbconfig/20191217-104530-marostegui.json
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1134', diff saved to https://phabricator.wikimedia.org/P9907 and previous config saved to /var/cache/conftool/dbconfig/20191217-103810-marostegui.json
* 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134 - s1 candidate master for upgrade', diff saved to https://phabricator.wikimedia.org/P9905 and previous config saved to /var/cache/conftool/dbconfig/20191217-102907-marostegui.json
* 10:13 volans@deploy1001: Finished deploy [homer/deploy@996f7be]: Homer release v0.1.0 - [[phab:T228388|T228388]] (duration: 00m 32s)
* 10:13 volans@deploy1001: Started deploy [homer/deploy@996f7be]: Homer release v0.1.0 - [[phab:T228388|T228388]]
* 10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1075', diff saved to https://phabricator.wikimedia.org/P9902 and previous config saved to /var/cache/conftool/dbconfig/20191217-101125-marostegui.json
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1075', diff saved to https://phabricator.wikimedia.org/P9901 and previous config saved to /var/cache/conftool/dbconfig/20191217-100250-marostegui.json
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1075', diff saved to https://phabricator.wikimedia.org/P9900 and previous config saved to /var/cache/conftool/dbconfig/20191217-095152-marostegui.json
* 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1075', diff saved to https://phabricator.wikimedia.org/P9899 and previous config saved to /var/cache/conftool/dbconfig/20191217-094418-marostegui.json
* 09:30 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 09:25 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 09:15 akosiaris: delete all namespaces in kubernetes staging cluster for initialization with etcd3 backing datastore. [[phab:T239835|T239835]]
* 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075 s3 candidate master for upgrade', diff saved to https://phabricator.wikimedia.org/P9896 and previous config saved to /var/cache/conftool/dbconfig/20191217-091316-marostegui.json
* 08:56 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 08:45 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 08:45 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1252.eqiad.wmnet
* 08:45 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1251.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1250.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1249.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1248.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1247.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1246.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1245.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1244.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1243.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1242.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1241.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1240.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1239.eqiad.wmnet
* 08:44 jiji@cumin1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=appserver,service=nginx,name=mw1238.eqiad.wmnet
* 08:31 marostegui: Repool labsdb1010 [[phab:T238399|T238399]]
* 07:29 andrew@deploy1001: Finished deploy [horizon/deploy@ff67a19]: Updating to Horizon version 'train' again (with fresh venvs) (duration: 03m 43s)
* 07:25 andrew@deploy1001: Started deploy [horizon/deploy@ff67a19]: Updating to Horizon version 'train' again (with fresh venvs)
* 07:24 andrew@deploy1001: Finished deploy [horizon/deploy@ff67a19]: Updating to Horizon version 'train' again (with fresh venvs) (duration: 00m 10s)
* 07:24 andrew@deploy1001: Started deploy [horizon/deploy@ff67a19]: Updating to Horizon version 'train' again (with fresh venvs)
* 07:23 marostegui: Upgrade candidate masters in codfw db2080 db2103 db2104 db2110 db2113 db2121 db2127
* 07:21 andrew@deploy1001: Finished deploy [horizon/deploy@ff67a19]: Updating to Horizon version 'train' (duration: 03m 46s)
* 07:18 andrew@deploy1001: Started deploy [horizon/deploy@ff67a19]: Updating to Horizon version 'train'
* 07:15 marostegui: Upgrade db2100
* 07:11 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1130 [[phab:T240823|T240823]]', diff saved to https://phabricator.wikimedia.org/P9895 and previous config saved to /var/cache/conftool/dbconfig/20191217-070709-marostegui.json
* 07:02 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 06:50 onimisionipe: depool maps2003 for postgres init - [[phab:T239728|T239728]]
* 06:48 onimisionipe: pool maps2002. Postgres init is complete - [[phab:T239728|T239728]]
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1130 [[phab:T240823|T240823]]', diff saved to https://phabricator.wikimedia.org/P9894 and previous config saved to /var/cache/conftool/dbconfig/20191217-064030-marostegui.json
* 06:36 volker-e@deploy1001: Finished deploy [design/style-guide@73d51f0]: Deploy design/style-guide:  (duration: 00m 07s)
* 06:36 volker-e@deploy1001: Started deploy [design/style-guide@73d51f0]: Deploy design/style-guide:
* 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1130 [[phab:T240823|T240823]]', diff saved to https://phabricator.wikimedia.org/P9893 and previous config saved to /var/cache/conftool/dbconfig/20191217-063121-marostegui.json
* 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1130 [[phab:T240823|T240823]]', diff saved to https://phabricator.wikimedia.org/P9892 and previous config saved to /var/cache/conftool/dbconfig/20191217-062136-marostegui.json
* 06:20 XioNoX: remove BGP session to AS32934 in ulsfo - [[phab:T239896|T239896]]
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on es3 slaves in eqiad and codfw - [[phab:T231018|T231018]]', diff saved to https://phabricator.wikimedia.org/P9891 and previous config saved to /var/cache/conftool/dbconfig/20191217-061959-marostegui.json
* 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on es2 slaves in eqiad and codfw - [[phab:T231018|T231018]]', diff saved to https://phabricator.wikimedia.org/P9890 and previous config saved to /var/cache/conftool/dbconfig/20191217-061707-marostegui.json
* 06:06 marostegui: Upgrade db1130 [[phab:T240823|T240823]]
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1091 schema change', diff saved to https://phabricator.wikimedia.org/P9889 and previous config saved to /var/cache/conftool/dbconfig/20191217-055848-marostegui.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1081 after schema change', diff saved to https://phabricator.wikimedia.org/P9888 and previous config saved to /var/cache/conftool/dbconfig/20191217-055407-marostegui.json
== 2019-12-16 ==
* 23:28 ejegg: updated fundraising civicrm from {{Gerrit|505c653da1}} to {{Gerrit|b2d0b5d66d}}
* 22:57 arlolra: Updated Parsoid to {{Gerrit|8ccc085}} ([[phab:T240091|T240091]], [[phab:T236912|T236912]], [[phab:T236415|T236415]], [[phab:T239929|T239929]], [[phab:T214649|T214649]], [[phab:T239830|T239830]])
* 22:47 arlolra@deploy1001: Finished deploy [parsoid/deploy@26ee446]: Updating Parsoid to {{Gerrit|8ccc085}} (duration: 06m 54s)
* 22:40 arlolra@deploy1001: Started deploy [parsoid/deploy@26ee446]: Updating Parsoid to {{Gerrit|8ccc085}}
* 22:24 cdanis: ✔️ cdanis@mwdebug2001.codfw.wmnet /srv/mediawiki 🕔🍺 scap pull
* 22:07 herron: increasing mx exim log verbosity by adding smtp_connection to log_selector list [[phab:T240906|T240906]]
* 22:02 arlolra@deploy1001: Finished deploy [parsoid/deploy@a42ca13]: Updating Parsoid to {{Gerrit|56a64ef}} (duration: 08m 16s)
* 21:53 arlolra@deploy1001: Started deploy [parsoid/deploy@a42ca13]: Updating Parsoid to {{Gerrit|56a64ef}}
* 21:53 cdanis: taking over mwdebug2001 to do some testing
* 20:52 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@4e72559]: Update mobileapps to {{Gerrit|9118b44}} (duration: 07m 06s)
* 20:45 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@4e72559]: Update mobileapps to {{Gerrit|9118b44}}
* 20:37 cstone: civicrm revision changed from {{Gerrit|ad2303ef72}} to {{Gerrit|505c653da1}}
* 20:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081 schema change', diff saved to https://phabricator.wikimedia.org/P9884 and previous config saved to /var/cache/conftool/dbconfig/20191216-203202-marostegui.json
* 20:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1084 after schema change', diff saved to https://phabricator.wikimedia.org/P9883 and previous config saved to /var/cache/conftool/dbconfig/20191216-202902-marostegui.json
* 20:25 effie: restart php on mw1330
* 20:18 effie: restart php on mw1326
* 20:10 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix: Restore suggestion randomization (duration: 01m 00s)
* 20:03 Urbanecm: Morning SWAT done
* 20:03 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|9d7530e}}: Remove custom protection level for ptwikinews (duration: 00m 57s)
* 19:55 Urbanecm: mwscript renameRestrictions.php --wiki=ptwiki 'autoreviewer' 'editautoreviewprotected' ([[phab:T230103|T230103]])
* 19:55 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@9423e7e]: Increase concurrency for low traffic jobs even further -- [[phab:T240518|T240518]] (duration: 00m 49s)
* 19:54 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@9423e7e]: Increase concurrency for low traffic jobs even further -- [[phab:T240518|T240518]]
* 19:54 Urbanecm: mwscript renameRestrictions.php --wiki=dewiktionary 'autoreviewprotected' 'editautoreviewprotected' ([[phab:T230103|T230103]])
* 19:53 Urbanecm: mwscript renameRestrictions.php --wiki=arwiki 'autoreview' 'editautoreviewprotected' ([[phab:T230103|T230103]])
* 19:48 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|f4cd6d0}}: Use editautoreviewprotected for autoreview protection level only ([[phab:T230103|T230103]]) (duration: 00m 57s)
* 19:32 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@1efbc29]: Increase concurrency for low traffic jobs -- [[phab:T240518|T240518]] (duration: 00m 46s)
* 19:31 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@1efbc29]: Increase concurrency for low traffic jobs -- [[phab:T240518|T240518]]
* 19:27 mlitn@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/WikibaseMediaInfo: Override getSitelink in mediainfo table, instead of removing it (duration: 00m 56s)
* 19:21 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|6e518be}}: Add additional import sources for zhwikisource ([[phab:T240814|T240814]]) (duration: 00m 56s)
* 19:18 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@0047875]: Do not consume the fetchGoogleCloudVisionAnnotations topic -- [[phab:T240518|T240518]] (duration: 01m 00s)
* 19:17 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@0047875]: Do not consume the fetchGoogleCloudVisionAnnotations topic -- [[phab:T240518|T240518]]
* 19:17 Urbanecm: mwscript namespaceDupes.php --wiki=zhwikiquote --fix ([[phab:T240428|T240428]])
* 19:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|ced7842}}: Add namespace aliases for zhwikiquote ([[phab:T240428|T240428]]) (duration: 00m 56s)
* 19:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|4541c4a}}: Remove Wiktionary and Wikiquote from $wgInterlanguageLinkCodeMap for now ([[phab:T174160|T174160]]) (duration: 00m 57s)
* 19:04 onimisionipe: depool maps2002 for postgres init - [[phab:T239728|T239728]]
* 19:00 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build - redeploy to fix issue on wdqs1007 (duration: 02m 09s)
* 18:58 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build - redeploy to fix issue on wdqs1007
* 18:54 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@deafe56]: (no justification provided)
* 18:33 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Catch DB duplicate key errors, cont. ([[phab:T240518|T240518]]) (duration: 00m 55s)
* 18:31 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 00m 53s)
* 18:30 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 18:30 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 01m 01s)
* 18:29 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 18:28 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@deafe56]: (no justification provided)
* 18:21 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 01m 38s)
* 18:20 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision:  ([[phab:T240518|T240518]]) (duration: 00m 57s)
* 18:19 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 18:18 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@665d9d3]: New WDQS build (duration: 13m 44s)
* 18:04 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@665d9d3]: New WDQS build
* 18:03 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@deafe56]: Rolling restart of CP4JQ -- [[phab:T240518|T240518]]
* 17:56 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix: Ignore duplicate entry errors on insertLabels ([[phab:T240518|T240518]]) (duration: 00m 57s)
* 17:46 mdholloway: disabled enqueuing new MachineVision label request jobs
* 17:46 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 56s)
* 17:22 anomie@deploy1001: Synchronized php-1.35.0-wmf.10/includes/api/ApiQueryUserContribs.php: Backporting fix for [[phab:T240808|T240808]] (duration: 00m 59s)
* 17:14 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 17:01 ebernhardson: start batch indexing of minwiktionary into cirrussearch
* 17:01 hashar: Restarting CI Jenkins for plugins updates
* 16:45 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers
* 16:42 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on es1 slaves in eqiad and codfw - [[phab:T231018|T231018]]', diff saved to https://phabricator.wikimedia.org/P9881 and previous config saved to /var/cache/conftool/dbconfig/20191216-163712-marostegui.json
* 16:27 hashar: Jenkins CI: upgrading collapsing console section to 1.8.0 # [[phab:T236222|T236222]] / [[phab:T239985|T239985]]
* 16:24 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix: Bail out of label fetching job if local file not found ([[phab:T240733|T240733]]) (duration: 00m 59s)
* 16:18 hashar: Restarting CI Jenkins
* 16:14 hashar: Upgrading https://releases-jenkins.wikimedia.org/
* 16:12 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers
* 16:05 moritzm: installing spamassassin security updates
* 16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Change weights from 1 to 100 on x1 slaves in eqiad and codfw - [[phab:T231018|T231018]]', diff saved to https://phabricator.wikimedia.org/P9880 and previous config saved to /var/cache/conftool/dbconfig/20191216-160346-marostegui.json
* 15:41 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 15:28 mforns@deploy1001: Finished deploy [analytics/refinery@1c72a71]: deploying analytics refinery for kerberos migration (duration: 07m 57s)
* 15:20 mforns@deploy1001: Started deploy [analytics/refinery@1c72a71]: deploying analytics refinery for kerberos migration
* 15:15 elukey@cumin1001: START - Cookbook sre.druid.roll-restart-workers
* 14:58 cdanis: ✔️ cdanis@mwdebug2001.codfw.wmnet ~ 🕤☕ scap pull
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 schema change', diff saved to https://phabricator.wikimedia.org/P9877 and previous config saved to /var/cache/conftool/dbconfig/20191216-145520-marostegui.json
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1121 after schema change', diff saved to https://phabricator.wikimedia.org/P9876 and previous config saved to /var/cache/conftool/dbconfig/20191216-144902-marostegui.json
* 14:46 cdanis@deploy1001: Synchronized wmf-config/db-eqiad.php: db-eqiad: remove dbctl-obsoleted externalLoads section {{Gerrit|5413a6d73}} [[phab:T229686|T229686]] (duration: 00m 54s)
* 14:45 cdanis@deploy1001: Synchronized wmf-config/db-codfw.php: db-codfw: remove dbctl-obsoleted externalLoads section {{Gerrit|519e37461}} [[phab:T229686|T229686]] (duration: 00m 54s)
* 14:39 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 14:39 cdanis@deploy1001: Synchronized wmf-config/etcd.php: db-codfw: remove dbctl-obsoleted externalLoads section {{Gerrit|519e37461}} [[phab:T229686|T229686]] (duration: 00m 53s)
* 14:38 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 14:36 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 14:35 XioNoX: delete virtual chassis ID on asw-a-codfw
* 14:34 XioNoX: delete virtual chassis ID on asw-b-codfw
* 14:32 XioNoX: delete virtual chassis ID on asw-c-codfw
* 14:30 cdanis: manual testing of {{Gerrit|I219711eb}} on mwdebug2001
* 14:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1127 after testing', diff saved to https://phabricator.wikimedia.org/P9875 and previous config saved to /var/cache/conftool/dbconfig/20191216-141141-marostegui.json
* 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127 from x1 for testing', diff saved to https://phabricator.wikimedia.org/P9874 and previous config saved to /var/cache/conftool/dbconfig/20191216-140951-marostegui.json
* 14:03 cdanis@deploy1001: Synchronized wmf-config/etcd.php: enable dbctl for externalLoads {{Gerrit|6dfb30c76}} [[phab:T229686|T229686]] (duration: 00m 53s)
* 13:50 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:50 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 13:33 ema: cp-ats: rolling ats-backend-restart to apply ram cache size changes [[phab:T238494|T238494]]
* 13:33 moritzm: restarting systemd-timesyncd on stat1005
* 12:52 elukey: shutdown of the Analytics Hadoop cluster to enable Kerberos
* 12:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:15 elukey@cumin1001: START - Cookbook sre.hosts.downtime
* 12:12 Urbanecm: EU SWAT done
* 12:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|026913d}}: Add no=>nb in $wgInterlanguageLinkCodeMap ([[phab:T174160|T174160]]) (duration: 00m 53s)
* 11:58 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P9873 and previous config saved to /var/cache/conftool/dbconfig/20191216-115841-jynus.json
* 11:55 hashar: Restarting Jenkins completely to flush out stale Gearman functions in Zuul
* 11:41 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (T128546)]] (duration: 00m 52s)
* 11:40 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017{{!}} Bumping portals to master (T128546)]] (duration: 00m 56s)
* 10:57 elukey: disable puppet on labstore100[6,7] and stop analytics-related systemd timers - prep step for Kerberos
* 10:41 XioNoX: delete virtual chassis ID on asw-d-codfw
* 10:14 hashar: Restarting CI Jenkins due to out of sync state between Zuul Gearman and what is actually running (some jobs got lost)
* 09:50 marostegui: Stop replication in the same position in labsdb1010 and labsdb1012 - [[phab:T238399|T238399]]
* 09:24 hashar: Reloading Jenkins CI
* 09:14 godog: upgrade hw raid firmware on ms-be2016 and reboot - [[phab:T240798|T240798]]
* 09:14 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:13 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 09:04 Urbanecm: mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T240825 ([[phab:T240825|T240825]])
* 08:54 ema: cp1077: ats-backend-restart to increase RAM cache size [[phab:T238494|T238494]]
* 08:53 moritzm: powercycling ms-be2016 [[phab:T240798|T240798]]
* 08:36 ema: cp1075: repool all services [[phab:T240826|T240826]]
* 08:12 ema: cp1075: wipe varnish-fe and ats-be caches due to missed purges [[phab:T240826|T240826]]
* 08:08 ema: cp1075: manually start vhtcpd.service [[phab:T240826|T240826]]
* 07:52 ema: cp1075: depool, vhtcpd not running
* 07:38 marostegui: Disable auto-learn on db21[03-35] [[phab:T240823|T240823]]
* 07:27 marostegui: Disable auto-learn on db[1126-1138].eqiad.wmnet [[phab:T240823|T240823]]
* 07:13 _joe_: restarting cpjobqueue on scb1001 to check if processing rate of recentChanges recovers [[phab:T240518|T240518]]
* 07:11 marostegui: Stop replication in the same position in labsdb1010 and labsdb1012 - [[phab:T238399|T238399]]
* 07:09 onimisionipe: depool maps2001 for postgres reinit - [[phab:T239728|T239728]]
* 06:59 onimisionipe: pool maps2004. osm import is complete - [[phab:T239728|T239728]]
* 06:58 _joe_: clearing apcu across multiple api servers to allow metrics to be collected again (task coming soon)
* 06:56 marostegui: Force re-learn cycle on db1130
* 06:42 marostegui: Depool labsdb1010 - [[phab:T238399|T238399]]
* 06:39 marostegui: Recreate views on commonswiki,testcommonswiki for protected_titles on all labsdb hosts - [[phab:T233135|T233135]]
* 06:29 marostegui: Remove triggers for ar_comment on db1125:3314 [[phab:T234704|T234704]]
* 06:28 marostegui: Stop replication on db1121 for schema change
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121 for schema change', diff saved to https://phabricator.wikimedia.org/P9871 and previous config saved to /var/cache/conftool/dbconfig/20191216-062809-marostegui.json
* 03:52 tstarling@deploy1001: Synchronized docroot/mediawiki.org/keys/keys.html: (no justification provided) (duration: 00m 57s)
* 03:49 tstarling@deploy1001: Synchronized docroot/mediawiki.org/keys/keys.txt: (no justification provided) (duration: 01m 01s)
== 2019-12-14 ==
* 22:50 hashar: Restarted Gerrit on gerrit2001 # [[phab:T240763|T240763]]
== 2019-12-13 ==
* 22:35 cdanis: [[phab:T229686|T229686]] ✔️ cdanis@mwdebug1001.eqiad.wmnet /srv/mediawiki 🕠🍺 scap pull
* 21:58 cdanis: testing {{Gerrit|I0e0de86d}} by hand on mwdebug1001 [[phab:T229686|T229686]]
* 21:57 cdanis: testing {{Gerrit|I0e0de86d}} by hand on mwdebug1001
* 21:50 otto@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@06e5f42]: (no justification provided) (duration: 00m 03s)
* 21:50 otto@deploy1001: Started deploy [analytics/hdfs-tools/deploy@06e5f42]: (no justification provided)
* 21:31 onimisionipe: depool maps2004 for osm initial import - [[phab:T239728|T239728]]
* 21:29 otto@deploy1001: Finished deploy [hdfs-tools-deploy@c71e63a]: (no justification provided) (duration: 00m 08s)
* 21:29 onimisionipe: disabled tilerator on maps200[1-3] - [[phab:T239728|T239728]]
* 21:29 otto@deploy1001: Started deploy [hdfs-tools-deploy@c71e63a]: (no justification provided)
* 20:31 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 20:29 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 20:26 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 20:07 sbassett: Deployed security patch (via gerrit 557097) for [[phab:T240487|T240487]] to wmf.10
* 19:33 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix typo in 'wgMachineVisionShowUploadWizardCallToAction' (duration: 01m 00s)
* 14:51 onimisionipe: depool maps1003 after postgres init - [[phab:T239728|T239728]]
* 14:37 onimisionipe: pool maps1002 after postgres init - [[phab:T239728|T239728]]
* 11:46 moritzm: installing tiff security updates
* 10:52 moritzm: rebooting mw2164 for microcode tests
* 10:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:52 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 10:30 moritzm: uploaded doxygen 1.8.16-1~exp4~deb10+wmf1 to buster-wikimedia/component/ci [[phab:T239482|T239482]]
* 10:17 ema: cp4028: restart ats-be to enable xdebug plugin
* 09:55 _joe_: restarting pybal on lvs in esams (3007, then 3006 and 3005)
* 09:50 rlazarus: rzl@conf1006:~$ sudo systemctl restart etcd.service
* 08:48 andrewbogott: rebooting cloudvirt1023 to investigate some nova things
* 08:10 elukey: rm /var/log user.log.1 messages.1 daemon.log.1 kafkatee.log.1 syslog.1 on netflow2001 to free space (logs spammed with the same error message over and over)
* 08:07 elukey: restart kafkatee-webrequest.service on netflow1001 (spamming logs about not being able to bind to address:port)
* 08:07 elukey: restart fastmon on netflow2001 as an attempt to stop spamming logs (failed)
* 08:06 elukey: restart kafkatee-webrequest.service on netflow2001 (spamming logs about not being able to bind to address:port)
* 07:56 onimisionipe: depool maps1002 for postgres init. - [[phab:T239728|T239728]]
* 07:55 elukey: execute clear bfd session address fe80::ee38:7300:17e8:a04e on cr3-knams to restore BFD session with eqdfw (OSPF3 status ok on cr3-knams)
* 06:30 moritzm: installing libice security updates
* 00:32 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Begin "initiation test" for suggested edits ([[phab:T238888|T238888]]) (duration: 00m 55s)
* 00:21 catrope@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/GrowthExperiments/: GrowthExperiments: record suggestededits pre-activation as a preference ([[phab:T238888|T238888]]) (duration: 00m 55s)
* 00:10 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Align help panel new account enabling with homepage ([[phab:T232396|T232396]]) (duration: 00m 56s)
== 2019-12-12 ==
* 22:48 eileen: process-control config revision is {{Gerrit|d195531033}} jobs temporarily disabled
* 22:33 eileen: civicrm revision changed from {{Gerrit|2043c27a0e}} to {{Gerrit|ad2303ef72}}, config revision is {{Gerrit|4d25b656e2}}
* 21:31 arlolra@deploy1001: Finished deploy [parsoid/deploy@75d72e8]: Updating Parsoid to {{Gerrit|28d7c21}} (duration: 07m 41s)
* 21:23 arlolra@deploy1001: Started deploy [parsoid/deploy@75d72e8]: Updating Parsoid to {{Gerrit|28d7c21}}
* 20:54 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Drop wgMediaInfoEnableOtherStatements, wgDepictsQualifierProperties, and wgDisableRollbackConfirmationFeature (duration: 00m 58s)
* 20:52 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Drop wgSpamBlacklistEventLogging, no longer read (duration: 00m 58s)
* 20:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T240546|T240546]] Enable the Wikisource extension on all Wikisources except old Wikisource (duration: 00m 57s)
* 20:46 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki.png: [[phab:T240578|T240578]] Change wikimaniawiki logo back to general version, 1x (duration: 00m 56s)
* 20:45 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki-1.5x.png: [[phab:T240578|T240578]] Change wikimaniawiki logo back to general version, 1.5x (duration: 00m 55s)
* 20:43 jforrester@deploy1001: Synchronized static/images/project-logos/wikimaniawiki-2x.png: [[phab:T240578|T240578]] Change wikimaniawiki logo back to general version, 2x (duration: 00m 56s)
* 20:43 volker-e@deploy1001: Finished deploy [design/style-guide@311d22e]: Deploy design/style-guide:  (duration: 00m 07s)
* 20:43 volker-e@deploy1001: Started deploy [design/style-guide@311d22e]: Deploy design/style-guide:
* 20:22 eileen: civicrm revision changed from {{Gerrit|8c8aa0e6d3}} to {{Gerrit|2043c27a0e}}, config revision is {{Gerrit|4d25b656e2}}
* 20:20 cdanis@cumin2001: dbctl commit (dc=all): '[[phab:T229686|T229686]] add sections es1/es2/es3/x1 and their instances', diff saved to https://phabricator.wikimedia.org/P9866 and previous config saved to /var/cache/conftool/dbconfig/20191212-202023-cdanis.json
* 20:18 cdanis: [[phab:T229686|T229686]] adding sections es1/es2/es3/x1 to dbctl's section data
* 20:18 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.10
* 20:17 cdanis: [[phab:T229686|T229686]] adding instances backing es1/es2/es3/x1 to dbctl's instance data
* 20:14 ejegg: updated fundraising internal dashboard from {{Gerrit|cc6d5cdde7}} to {{Gerrit|1105bf1796}}
* 20:02 onimisionipe: pool maps1001 - postgres re-init is complete - [[phab:T239728|T239728]]
* 19:57 Urbanecm: Morning SWAT done
* 19:55 mlitn@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/WikibaseMediaInfo/WikibaseMediaInfo.entitytypes.php: Revert: Register mediainfo-specific EntityIdLookup (duration: 01m 01s)
* 19:44 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|ffe365e}}: Make Parsoid/PHP cluster read-write to record lints ([[phab:T237326|T237326]], [[phab:T240057|T240057]]) (duration: 01m 02s)
* 19:34 mlitn@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/WikibaseMediaInfo/WikibaseMediaInfo.entitytypes.php: Register mediainfo-specific EntityIdLookup (duration: 01m 04s)
* 18:25 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: (no justification provided) (duration: 01m 47s)
* 18:23 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: (no justification provided)
* 18:13 ejegg: updated fundraising internal dashboard from {{Gerrit|c1ded3c473}} to {{Gerrit|cc6d5cdde7}}
* 16:09 moritzm: installing libvorbis security updates
* 16:02 cdanis: [[phab:T229686|T229686]] upgrade python3-conftool and python3-conftool-dbctl on cumin hosts
* 16:01 cdanis: sudo -E reprepro -C main include buster-wikimedia conftool_1.3.0-1+deb10u1_amd64.changes
* 15:59 cdanis: [[phab:T229686|T229686]] ✔️ cdanis@install1002.wikimedia.org ~ 🕚☕ sudo -E reprepro -C main include stretch-wikimedia conftool_1.3.0-1_amd64.changes
* 15:18 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 15:18 jmm@cumin1001: START - Cookbook sre.hosts.decommission
* 15:13 moritzm: deleting puppetdb1001 in Ganeti [[phab:T228657|T228657]]
* 14:56 reedy@deploy1001: Synchronized php-1.35.0-wmf.10/includes/specials/SpecialUserrights.php: [[phab:T240574|T240574]] (duration: 01m 02s)
* 14:44 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7dc11d4]: Update mobileapps to {{Gerrit|65272a6}} (duration: 06m 12s)
* 14:41 onimisionipe: depool maps1001 for postgres reinitialization - [[phab:T239728|T239728]]
* 14:38 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7dc11d4]: Update mobileapps to {{Gerrit|65272a6}}
* 14:30 onimisionipe: pool maps1004 osm-import is complete - [[phab:T239728|T239728]]
* 14:21 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:18 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:18 elukey@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)
* 14:16 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:08 marostegui: Upgrade db2085 and db2086
* 14:02 jbond42: merge puppet-merge refactor
* 13:38 hashar: contint1001 / contint2001 : upgraded Zuul to 2.5.1-wmf11 # [[phab:T203846|T203846]]
* 12:58 elukey@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers
* 12:39 Urbanecm: EU SWAT done
* 12:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|07652a6}}: Add 2020: Wikimania namespace ([[phab:T240339|T240339]]) (duration: 01m 02s)
* 12:37 moritzm: installing NSS security updates on buster
* 12:34 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|1c58f09}}: Enable SandboxLink extension on hywwiki ([[phab:T239387|T239387]]) (duration: 01m 03s)
* 11:49 moritzm: removing puppetdb2001 from Ganeti
* 11:46 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
* 11:45 jmm@cumin2001: START - Cookbook sre.hosts.decommission
* 11:41 hashar: Removing zuul package from Jessie CI instances # [[phab:T240551|T240551]]
* 11:17 addshore@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase: BACKPORTS: wikibase tainted refs https://gerrit.wikimedia.org/r/#/q/topic:backports-wd-tainted-1 (duration: 01m 08s)
* 09:46 moritzm: upgrading recently reimaged stretch hosts back to puppet 5 / facter 3 [[phab:T239832|T239832]]
* 09:37 marostegui: Retroactive: deploy schema change on db1102:3314
* 08:40 eileen: process-control config revision is {{Gerrit|4d25b656e2}}
* 08:34 godog: cleanup puppetmaster1001:/run/confd-template
* 06:11 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 06:11 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 05:58 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 05:57 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:57 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:57 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:57 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:56 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 05:56 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:56 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 05:56 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
* 05:47 marostegui: Deploy schema change on db1102:3314
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3314 after schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9861 and previous config saved to /var/cache/conftool/dbconfig/20191212-054708-marostegui.json
* 03:25 ejegg: updated fundraising internal dashboard from {{Gerrit|3917f7d9dc}} to {{Gerrit|c1ded3c473}}
* 01:42 volker-e@deploy1001: Finished deploy [design/style-guide@481eaf6]: Deploy design/style-guide:  (duration: 00m 07s)
* 01:41 volker-e@deploy1001: Started deploy [design/style-guide@481eaf6]: Deploy design/style-guide:
== 2019-12-11 ==
* 23:42 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Add ability to load the DiscussionTools extension, disabled everywhere [[phab:T240468|T240468]] (duration: 01m 02s)
* 23:30 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wmgUseDiscussionTools false everywhere [[phab:T240468|T240468]] (duration: 01m 03s)
* 23:19 jforrester@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.10 (duration: 01m 02s)
* 23:18 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.10
* 23:13 jforrester@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/CentralNotice/includes/CentralNoticeHooks.php: [[phab:T240505|T240505]] Remove CentralNotice's use of deprecated jquery.ui module aliases (duration: 01m 25s)
* 22:59 cstone: civicrm revision changed from {{Gerrit|7b971ac58c}} to {{Gerrit|8c8aa0e6d3}}
* 22:08 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: (no justification provided) (duration: 01m 51s)
* 22:07 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: (no justification provided)
* 22:04 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: (no justification provided) (duration: 07m 13s)
* 21:57 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: (no justification provided)
* 21:40 arlolra: Updated Parsoid to {{Gerrit|af576d5}} ([[phab:T237693|T237693]], [[phab:T238777|T238777]], [[phab:T237306|T237306]], [[phab:T239875|T239875]], [[phab:T240053|T240053]])
* 21:31 arlolra@deploy1001: Finished deploy [parsoid/deploy@5ba7506]: Updating Parsoid to {{Gerrit|af576d5}} (duration: 09m 12s)
* 21:21 arlolra@deploy1001: Started deploy [parsoid/deploy@5ba7506]: Updating Parsoid to {{Gerrit|af576d5}}
* 21:17 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 back to 1.34.0-wmf.8
* 21:16 jforrester@deploy1001: sync-wikiversions aborted: group0 to 1.34.0-wmf.0 (duration: 00m 00s)
* 20:39 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Remove testing group restriction on commonswiki (duration: 01m 04s)
* 20:36 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Show UploadWizard CTA on commonswiki (duration: 01m 03s)
* 20:32 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Update labeling job delay to 48 hours (duration: 01m 05s)
* 20:11 jforrester@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.10 (duration: 01m 02s)
* 20:10 jforrester@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.10
* 19:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314 for schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9858 and previous config saved to /var/cache/conftool/dbconfig/20191211-195306-marostegui.json
* 19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3314 after schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9857 and previous config saved to /var/cache/conftool/dbconfig/20191211-195130-marostegui.json
* 19:43 awight: Morning SWAT complete
* 19:43 eileen: re-enabled dedupe (off since last night's Benevity import attempt)
* 19:43 eileen: process-control config revision is {{Gerrit|8c073ae64a}}
* 19:34 awight@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556372{{!}}Lazily fetch user interface language to prevent cache split (take 2) (T240426, T239988)]] (duration: 00m 40s)
* 19:33 awight: Overriding scap canaries for [[phab:T240426|T240426]]
* 19:24 awight@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 19:18 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|eaa4c2c}}: Remove unused wmgCheckUserForceSummary (duration: 01m 01s)
* 19:15 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|c8fe811}}: Use wgCheckUserForceSummary instead of wmgCheckUserForceSummary (duration: 01m 02s)
* 19:11 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|c8fe811}}: Use wgCheckUserForceSummary instead of wmgCheckUserForceSummary (duration: 01m 02s)
* 19:10 urbanecm@deploy1001: sync-file aborted: SWAT: {{Gerrit|c8fe811}}: Use wgCheckUserForceSummary instead of wmgCheckUserForceSummary ([[phab:T239936|T239936]]) (duration: 00m 02s)
* 19:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|9ca31f4}}: Enable CheckUser Special:Investigate page on testwiki ([[phab:T239936|T239936]]) (duration: 01m 02s)
* 18:53 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/MachineVision: Fix no-JS warning message ([[phab:T240210|T240210]]) (duration: 01m 02s)
* 16:43 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: Consider any type of empty value as uncached in SqlEntityInfoBuilder ([[phab:T237984|T237984]]) (duration: 01m 03s)
* 16:39 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: Consider any type of empty value as uncached in SqlEntityInfoBuilder ([[phab:T237984|T237984]]) (duration: 01m 03s)
* 16:28 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 16:28 ema: cp1075 ats-be: temporarily switch to plain HTTP for api and appservers (apache directly instead of nginx)
* 16:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 15:23 rzl@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 15:21 rzl@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 15:14 rzl@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 14:45 rlazarus: updating envoyproxy to 1.12.2 on all eqiad [[phab:T238050|T238050]]
* 14:43 rlazarus: updating envoyproxy to 1.12.2 on all codfw [[phab:T238050|T238050]]
* 14:19 rlazarus: updating envoyproxy to 1.12.2 on mwmaint, restbase [[phab:T238050|T238050]]
* 14:00 rlazarus: uploaded envoyproxy-1.12.2 to reprepro
* 13:37 awight: EU SWAT complete
* 13:25 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556367{{!}}Revert "Lazily fetch user interface language to prevent cache split" ()]] (duration: 01m 02s)
* 12:54 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Cite: SWAT: [[gerrit:556351{{!}}Use messagelocalizer in CiteErrorReporter (T239988)]] (duration: 01m 04s)
* 12:38 andrew-wmde@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 12:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|7651c1a}}: GrowthExperiments: Configure testwiki to use local search & config ([[phab:T235717|T235717]]) (duration: 01m 02s)
* 12:03 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.10/extensions/Wikibase/data-access: [[gerrit:556353{{!}}Fix idlookup dropping pageids (T236691 T240410)]] (duration: 01m 03s)
* 12:00 moritzm: installing git security updates
* 11:57 jbond42: draining kubernetes2003 to restart calico-node
* 11:55 jbond42: draining kubernetes2002 to restart calico-node
* 11:52 jbond42: draining kubernetes2001 to restart calico-node
* 11:36 jbond42: draining kubernetes1004.eqiad.wmnet to restart calico-node
* 11:31 jbond42: draining kubernetes1005.eqiad.wmnet to restart calico-node
* 11:27 jbond42: draining kubernetes1006.eqiad.wmnet to restart calico-node
* 10:51 jbond42: draining kubernetes1003.eqiad.wmnet to restart calico-node
* 10:48 jbond42: draining kubernetes1002.eqiad.wmnet to restart calico-node
* 10:45 marostegui: Deploy schema change on db1103:3314
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3314 for schema change [[phab:T233135|T233135]]', diff saved to https://phabricator.wikimedia.org/P9851 and previous config saved to /var/cache/conftool/dbconfig/20191211-104506-marostegui.json
* 10:39 jbond42: draining kubernetes1001.eqiad.wmnet to restart calico-node
* 10:34 Nikerabbit: Finished running Translate/refresh-translatable-pages.php --jobqueue for Translate wikis - [[phab:T235027|T235027]] [[phab:T235188|T235188]]
* 10:03 ema: cp-ats: apply set_server_resp_no_store patch https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/556201/ to all hosts [[phab:T227432|T227432]]
* 09:45 ema@cumin1001: conftool action : set/pooled=yes; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 09:45 ema: cp1075: repool ats-be after successful set_server_resp_no_store test P9849 [[phab:T227432|T227432]]
* 09:33 godog: roll-restart logstash in codfw/eqiad after https://gerrit.wikimedia.org/r/c/operations/puppet/+/556173
* 09:25 ema@cumin1001: conftool action : set/pooled=no; selector: name=cp1075.eqiad.wmnet,service=ats-be
* 09:25 ema: cp1075: depool ats-be to test set_server_resp_no_store https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/556201/ [[phab:T227432|T227432]]
* 09:14 ema: repool cp3055 [[phab:T238305|T238305]]
* 09:04 Nikerabbit: running Translate/refresh-translatable-pages.php --jobqueue for Translate wikis - [[phab:T235027|T235027]] [[phab:T235188|T235188]]
* 08:34 marostegui: Compress cx_corpora on x1 master (db1120) - [[phab:T240325|T240325]]
* 08:34 marostegui: Upgrade db1140
* 08:10 Urbanecm: Clear signup throttle for IP 195.113.183.5
* 08:10 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: {{Gerrit|f62edfe}}: Add throttle rule for Czech student workshop (duration: 01m 02s)
* 08:04 elukey: powercycle cp3055 - down for hours, no ssh, no usable mgmt serial console
* 08:02 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3055.esams.wmnet
* 07:54 marostegui: Compress cx_corpora on db1140:3320 [[phab:T240325|T240325]]
* 07:51 marostegui: Upgrade db2096 (x1 codfw master)
* 06:59 marostegui: Compress cx_corpora on db2096 [[phab:T240325|T240325]]
* 06:57 marostegui: Upgrade x1 codfw
* 06:55 eileen: process-control config revision is {{Gerrit|f34450e3ba}} - turn off dedupe to do Benevity import
* 06:46 effie: restart graphoid on scb1001
* 06:44 marostegui: Stop mysql on db1124 for upgrade
* 06:28 marostegui: Stop MySQL on db2070 - [[phab:T239684|T239684]]
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2070 from config as it will be decommissioned [[phab:T239684|T239684]]', diff saved to https://phabricator.wikimedia.org/P9848 and previous config saved to /var/cache/conftool/dbconfig/20191211-062700-marostegui.json
* 06:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db2070 from config [[phab:T239684|T239684]] (duration: 01m 08s)
* 06:24 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db2070 from config [[phab:T239684|T239684]] (duration: 01m 18s)
* 06:22 marostegui: Remove db2070 from tendril and zarcillo [[phab:T239684|T239684]]
* 06:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:07 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:00 marostegui: Compress cx_corpora on db2131 [[phab:T240325|T240325]]
* 05:45 marostegui: Deploy schema change on dbstore1004:3314
* 00:54 eileen: process-control config revision is {{Gerrit|3f60e8fe9e}}
* 00:46 eileen: civicrm revision changed from {{Gerrit|b519d4fb73}} to {{Gerrit|7b971ac58c}}, config revision is {{Gerrit|9fb34fd93a}}
* 00:39 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:546894{{!}}Add growthexperiments dblist, for puppet usage (T208369)]] (duration: 01m 00s)
* 00:37 tgr@deploy1001: Synchronized wmf-config/config: SWAT: [[gerrit:546894{{!}}Add growthexperiments dblist, for puppet usage (T208369)]] (duration: 01m 01s)
* 00:35 tgr@deploy1001: Synchronized dblists/growthexperiments.dblist: SWAT: [[gerrit:546894{{!}}Add growthexperiments dblist, for puppet usage (T208369)]] (duration: 01m 02s)
== 2019-12-10 ==
* 22:33 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7c8cb9d]: Update mobileapps to {{Gerrit|3b1ba07}} (duration: 05m 58s)
* 22:27 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7c8cb9d]: Update mobileapps to {{Gerrit|3b1ba07}}
* 21:25 marxarelli: promoted group0 to 1.35.0-wmf.10 cc: [[phab:T233858|T233858]]
* 21:23 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.10
* 21:16 dduvall@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.10 and rebuild l10n cache (duration: 37m 20s)
* 20:39 dduvall@deploy1001: Started scap: testwiki to php-1.35.0-wmf.10 and rebuild l10n cache
* 20:38 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.5 (duration: 01m 36s)
* 20:37 cdanis: ✔️ cdanis@mw1323.eqiad.wmnet ~ 🕞🍵 sudo renice -n -19 `pidof mcrouter`
* 20:36 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.3 (duration: 01m 52s)
* 20:33 dduvall@deploy1001: Pruned MediaWiki: 1.35.0-wmf.4 (duration: 06m 40s)
* 20:31 cdanis@cumin2001: conftool action : set/weight=20; selector: cluster=appserver,dc=eqiad,service=nginx,name=mw132[34].*
* 20:31 cdanis@cumin2001: conftool action : set/weight=20; selector: cluster=appserver,dc=eqiad,service=apache2,name=mw132[34].*
* 19:45 _joe_: restarting php-fpm on mw1332,1319 (high latency)
* 19:01 marxarelli: cutting branch for 1.35.0-wmf.10 cc: [[phab:T233858|T233858]]
* 18:22 rlazarus: restarted php7.2-fpm on mw1328
* 18:19 bblack: cp2007: restart traffic-manager.service, seems to have been left in a bad state?
* 18:09 jeh: imported ceph nautilus debian packages into buster-wikimedia/thirdparty/ceph-nautilus-buster [[phab:T239917|T239917]]
* 18:08 rlazarus: restarting php7.2-fpm on all remaining slow hosts except 1328, held back for investigation: mw[1333,1331,1322,1327,1325]
* 17:54 _joe_: repooled mw1322, just depooling solved the issue
* 17:48 _joe_: depool mw1322 for debugging
* 17:44 rlazarus: mw1322$ php7adm /apcu-free
* 17:22 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556218{{!}}Catch one last undefined index (T240248)]] (duration: 01m 02s)
* 17:05 bblack: lvs100<nowiki>{</nowiki>14,16<nowiki>}</nowiki> - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 17:00 bblack: lvs200[25] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 16:50 bblack: lvs500[23] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 16:46 bblack: lvs300[67] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 16:41 bblack: lvs400[67] - restarting pybal on high-traffic2 + backup, cleaning old entries for recdns
* 16:37 bblack: lvs* + dns*: puppet disabled for lvs recdns decom work - [[phab:T239993|T239993]]
* 16:31 andrew-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: SWAT: [[gerrit:556186{{!}}Fix incomplete cloning of the Parser::$extCite instance (T240248)]] (duration: 01m 04s)
* 16:25 bblack: cr[12]-eqiad: Adding static route for 208.80.154.254 (legacy lvs recdns IP) to dns1002.wikimedia.org - [[phab:T239993|T239993]]
* 16:23 bblack: cr[12]-codfw: Adding static route for 208.80.153.254 (legacy lvs recdns IP) to dns2002.wikimedia.org - [[phab:T239993|T239993]]
* 16:11 moritzm: installing gettext updates from stretch 9.11 point release
* 16:04 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 16:04 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 16:01 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 16:00 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 15:56 moritzm: installing icu updates from stretch 9.11 point release
* 15:54 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:54 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:53 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:45 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:44 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:44 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:34 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:34 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:24 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:24 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:14 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 15:04 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 15:03 reedy@deploy1001: Synchronized wmf-config/wikitech.php: Load OSM and LdapAuth via extension.json [[phab:T140852|T140852]] (duration: 00m 55s)
* 15:01 moritzm: installing systemd updates from stretch 9.11 point release
* 14:59 reedy@deploy1001: Synchronized wmf-config/extension-list: Load OSM and LdapAuth via extension.json for messages (duration: 00m 55s)
* 14:55 reedy@deploy1001: Synchronized wmf-config/wikitech.php: [[phab:T161553|T161553]] Bye OSM config! (duration: 00m 55s)
* 14:52 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 14:43 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 14:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 14:32 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 14:30 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 14:30 akosiaris@cumin1001: START - Cookbook sre.ganeti.makevm
* 14:08 jbond42: rolling restart of varnishkafka-webrequest and varnishkafka-eventlogging
* 13:54 godog: remove stale puppetmaster2001:/var/run/confd-template/.*.err
* 13:21 marostegui: Compress table wikishared.cx_corpora on db2115 - [[phab:T240325|T240325]]
* 12:36 Urbanecm: EU SWAT done
* 12:36 urbanecm@deploy1001: Synchronized wmf-config/abusefilter.php: SWAT: {{Gerrit|80fac66}}: Enable abusefilter blocking cap at testwiki (duration: 00m 55s)
* 12:32 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|80fac66}}: Enable abusefilter blocking cap at testwiki (duration: 00m 55s)
* 12:19 nikerabbit@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:555921{{!}}Add wiki-for-human-rights CX campaign (T239977)]] (duration: 00m 56s)
* 11:46 _joe_: restarting etcd on conf1005, also etcdmirrormaker
* 11:34 jbond42: rolling restart of ats servers
* 11:28 mbsantos@deploy1001: Finished deploy [kartotherian/deploy@452b144] (stretch): Update kartotherian-package to {{Gerrit|f9fb029}} ([[phab:T240227|T240227]]) (duration: 00m 20s)
* 11:27 mbsantos@deploy1001: Started deploy [kartotherian/deploy@452b144] (stretch): Update kartotherian-package to {{Gerrit|f9fb029}} ([[phab:T240227|T240227]])
* 11:04 _joe_: restarting pybal on lvs1015, then 1013 and 1014 to pick up the etcd restart
* 10:59 _joe_: restarting pybal on lvs1016, then the other eqiad pybals, to catch up on etcd restart
* 10:55 _joe_: restarting etcd on conf1004 [[phab:T237362|T237362]]
* 10:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1127 after table defragmentation (duration: 00m 55s)
* 10:36 marostegui: Optimize wikishared.cx_corpora on db2115 (non compressed table) - [[phab:T183485|T183485]]
* 10:35 marostegui: Upgrade db1127
* 10:13 moritzm: stopping slapd on dubnium/pollux following application of the spare role [[phab:T224557|T224557]]
* 10:06 onimisionipe: add new disk to RAID array on cloudelastic1002 - [[phab:T239957|T239957]]
* 09:51 marostegui: Optimize wikishared.cx_corpora on db1127 - [[phab:T183485|T183485]]
* 09:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1127 for table defragmentation (duration: 00m 59s)
* 09:20 marostegui: Restart mysql on dbstore1003, 1004 and 1005 for upgrade
* 09:11 marostegui: Restart MySQL on labsdb1012
* 06:39 marostegui: Remove db1062 from tendril and zarcillo [[phab:T239188|T239188]]
* 06:08 marostegui: Deploy schema change on s4 codfw master (this will generate lag on s4 codfw) [[phab:T233135|T233135]]
* 06:06 marostegui: Remove triggers from db2095:3314 for ar_comment - [[phab:T234704|T234704]]
* 02:11 cstone: civicrm revision changed from {{Gerrit|09149e0427}} to {{Gerrit|b519d4fb73}}
== 2019-12-09 ==
* 23:04 XenoRyet: reenabled Ingenico Connect recurring charge job
* 23:00 brennen@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Cite: Sync [[gerrit:556066{{!}}Hotfix: Defensive array accesses (T240248)]] (duration: 00m 57s)
* 22:55 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.8
* 22:54 XenoRyet: updated civicrm from {{Gerrit|7eab025ec0}} to {{Gerrit|09149e0427}}
* 22:54 shdubsh: restart prometheus on prometheus2004 -- [[phab:T238807|T238807]]
* 22:40 sbassett: Deployed security patch for [[phab:T192134|T192134]] to wmf.8
* 22:37 sbassett: Deployed security patch for [[phab:T192134|T192134]] to wmf.5
* 21:55 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@aa65057]: Update mobileapps to {{Gerrit|f9771ab}} (duration: 10m 39s)
* 21:48 shdubsh: restart prometheus on prometheus2003 -- [[phab:T238807|T238807]]
* 21:45 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@aa65057]: Update mobileapps to {{Gerrit|f9771ab}}
* 21:32 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@08cfd70]: Set location of ivy cache for spark (duration: 00m 24s)
* 21:32 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@08cfd70]: Set location of ivy cache for spark
* 19:18 Urbanecm: Morning SWAT done
* 19:18 Urbanecm: Purge several logo files ([[phab:T150618|T150618]])
* 19:18 Urbanecm: Run namespaceDupes.php for eswikisource ([[phab:T240050|T240050]])
* 19:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|32da89f}}: Upload HD logos for en, fi and nl arbcom wikis (2/2, [[phab:T150618|T150618]]) (duration: 01m 00s)
* 19:14 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: {{Gerrit|32da89f}}: Upload HD logos for en, fi and nl arbcom wikis (1/2, [[phab:T150618|T150618]]) (duration: 01m 01s)
* 19:07 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|f984d18}}: Add aliases for Help and Project on eswikisource ([[phab:T240050|T240050]]) (duration: 01m 00s)
* 19:01 onimisionipe: continue osm-import on maps1004 - [[phab:T239728|T239728]]
* 18:37 herron: enabling lvs for kibana-next elk7 upgrade environment; if any alerts fire relating to this, please disregard them
* 18:19 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@9f9190e]: New WDQS Build (duration: 09m 33s)
* 18:09 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@9f9190e]: New WDQS Build
* 18:09 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@9f9190e]: New WDQS Build (duration: 03m 02s)
* 18:06 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@9f9190e]: New WDQS Build
* 18:01 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert "group2 wikis to 1.35.0-wmf.5"
* 17:52 brennen@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.8
* 17:09 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.8 (duration: 01m 00s)
* 17:08 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.8
* 17:03 brennen: attempting to roll 1.35.0-wmf.8 forward to group1
* 15:57 moritzm: installing openslp security updates
* 15:40 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:37 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:09 elukey: upload prometheus-memcached-exporter 0.6.0+git20191209.bac8a8c-1 to buster-wikimedia
* 15:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:58 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:34 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: [[phab:T229407|T229407]], clean up debugging info (duration: 00m 59s)
* 14:26 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:555941{{!}}Disable sanity check cirrus jobs for Wikidata (T239931 T229407)]] (duration: 00m 57s)
* 14:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:13 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:55 effie: reimage mw2270.codfw.wmnet mw2269.codfw.wmnet mw2268.codfw.wmnet
* 11:36 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:555907{{!}} Bumping portals to master (T128546)]] (duration: 01m 00s)
* 11:35 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:555907{{!}} Bumping portals to master (T128546)]] (duration: 01m 23s)
* 10:46 addshore: [[phab:T239470|T239470]] addshore@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki wikidatawiki --from-id=10000007 --to-id=10000007
* 10:06 rlazarus: rolling restart php-fpm in mw-eqiad due to APCu fragmentation
* 08:58 oblivian@cumin1001: conftool action : set/weight=10; selector: service=parsoid-php
* 08:49 kart_: Updated cxserver to 2019-12-05-090549-production ([[phab:T217585|T217585]], [[phab:T230195|T230195]])
* 08:46 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 08:41 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 08:39 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 08:33 elukey: powercycle mw1280, mgmt console stuck, dimm errors in getsel
* 08:24 ema: cp3064: ats-tls-restart to clear "tls process restarted" alert [[phab:T240183|T240183]]
* 07:44 onimisionipe: resetting cron on wdqs1010 to fix cronspam
* 04:05 andrew@deploy1001: Finished deploy [horizon/deploy@9847a28]: (no justification provided) (duration: 03m 37s)
* 04:01 andrew@deploy1001: Started deploy [horizon/deploy@9847a28]: (no justification provided)
* 03:54 andrew@deploy1001: Finished deploy [horizon/deploy@d1cba62]: (no justification provided) (duration: 01m 51s)
* 03:52 andrew@deploy1001: Started deploy [horizon/deploy@d1cba62]: (no justification provided)
== 2019-12-08 ==
* 20:40 ejegg: disabled Ingenico Connect recurring charge job
* 02:58 andrew@deploy1001: Finished deploy [horizon/deploy@ff0a0e7]: (no justification provided) (duration: 01m 53s)
* 02:56 andrew@deploy1001: Started deploy [horizon/deploy@ff0a0e7]: (no justification provided)
* 02:19 andrew@deploy1001: Finished deploy [horizon/deploy@ed2243c]: (no justification provided) (duration: 01m 50s)
* 02:17 andrew@deploy1001: Started deploy [horizon/deploy@ed2243c]: (no justification provided)
* 01:49 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 55s)
* 01:47 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 01:44 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 47s)
* 01:43 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 01:40 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 49s)
* 01:38 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 01:37 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 00m 07s)
* 01:36 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 01:16 andrew@deploy1001: Finished deploy [horizon/deploy@accbbd1]: (no justification provided) (duration: 01m 53s)
* 01:14 andrew@deploy1001: Started deploy [horizon/deploy@accbbd1]: (no justification provided)
* 01:11 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 01m 48s)
* 01:09 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
== 2019-12-07 ==
* 13:44 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 00m 08s)
* 13:44 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 13:29 elukey: restart php-fpm on mw1293 (jobrunner) as test
* 13:26 elukey: restart php-fpm on mw1299 (jobrunner) as test
* 09:51 apergos: reboot dumpsdata1002, checking that rpc.statd starts on boot properly
* 04:10 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 01m 55s)
* 04:08 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 03:27 andrew@deploy1001: Finished deploy [horizon/deploy@841693b]: (no justification provided) (duration: 01m 45s)
* 03:25 andrew@deploy1001: Started deploy [horizon/deploy@841693b]: (no justification provided)
* 02:59 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 01m 40s)
* 02:58 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 02:55 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 00m 07s)
* 02:55 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 01:05 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 02m 55s)
* 01:02 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 01:01 andrew@deploy1001: Finished deploy [horizon/deploy@0f70602]: (no justification provided) (duration: 02m 04s)
* 00:59 andrew@deploy1001: Started deploy [horizon/deploy@0f70602]: (no justification provided)
* 00:58 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 106m 13s)
== 2019-12-06 ==
* 23:35 ejegg: updated internal fundraising dashboard from {{Gerrit|d9d74429ba}} to {{Gerrit|3917f7d9dc}}
* 23:22 ejegg: updated payments-wiki from {{Gerrit|00632a397c}} to {{Gerrit|b3f983d5d1}}
* 23:12 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 23:12 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 00m 07s)
* 23:12 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 23:10 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 00m 07s)
* 23:10 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 23:00 ppchelko@deploy1001: Finished deploy [restbase/deploy@c2bab5d]: Parsoid: Disable mirroring all traffic in split mode (duration: 13m 43s)
* 22:46 ppchelko@deploy1001: Started deploy [restbase/deploy@c2bab5d]: Parsoid: Disable mirroring all traffic in split mode
* 22:08 bblack: mc1033: ethernet tweaks as well (expect a short link blip)
* 21:54 bblack: mc1026: add tc-fq qdisc to eth0 for tx
* 21:41 bblack: mc1026: adjusting rx ring to 2047 and disabling ethernet pause (will be a minor blip of eth link state!)
* 21:25 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:23 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 21:16 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1231.eqiad.wmnet
* 21:15 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1227.eqiad.wmnet
* 21:15 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1222.eqiad.wmnet
* 21:15 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1233.eqiad.wmnet
* 21:14 cdanis@cumin2001: conftool action : set/weight=25; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12[789].*
* 21:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1233
* 21:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1222
* 21:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw1227
* 21:04 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:01 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 20:37 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:34 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 18:57 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:56 cdanis@cumin2001: conftool action : set/weight=20; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12.*
* 18:55 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 18:36 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:34 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 18:15 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:13 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 18:12 cdanis@cumin2001: conftool action : set/weight=15; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12.*
* 17:54 bblack: install2002 - restart squid3 service
* 17:43 jforrester@deploy1001: Synchronized php-1.35.0-wmf.8/includes/libs/rdbms/database/Database.php: [[phab:T239877|T239877]] Have Database::makeWhereFrom2d assume  is string-based (duration: 01m 11s)
* 17:28 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:26 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 17:19 bblack: editing /e/n/i carefully with sed across the fleet via cumin, to correct legacy "dns-nameservers" line in older installs
* 17:08 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:06 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 16:50 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:48 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 16:47 _joe_: apcu flush finished
* 16:41 _joe_: flush apcu across the api cluster in eqiad
* 16:32 _joe_: flushing apcu on mw1339
* 16:21 ejegg: updated fundraising CiviCRM from {{Gerrit|30cdc5fa59}} to {{Gerrit|7eab025ec0}}
* 14:40 ema: text@esams: rolling ats-backend-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/553132/ [[phab:T238494|T238494]]
* 14:12 ema: cp3050: ats-backend-restart to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/553132/ [[phab:T238494|T238494]]
* 13:41 ema: cp2004: adding do_global_ doesn't seem to work with reload, restart ats-be [[phab:T238494|T238494]]
* 13:31 gehel: starting transfer of blazegraph journal from wdqs1007 to stat1004 - [[phab:T239898|T239898]]
* 08:46 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 00m 08s)
* 08:46 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 08:43 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 01m 59s)
* 08:41 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 08:38 andrew@deploy1001: Finished deploy [horizon/deploy@1911591]: (no justification provided) (duration: 01m 55s)
* 08:36 andrew@deploy1001: Started deploy [horizon/deploy@1911591]: (no justification provided)
* 08:25 moritzm: installing libgd2 security updates on stretch
* 08:04 andrew@deploy1001: Finished deploy [horizon/deploy@a8c759e]: (no justification provided) (duration: 00m 07s)
* 08:04 andrew@deploy1001: Started deploy [horizon/deploy@a8c759e]: (no justification provided)
* 08:03 andrew@deploy1001: Finished deploy [horizon/deploy@a8c759e]: (no justification provided) (duration: 01m 28s)
* 08:01 andrew@deploy1001: Started deploy [horizon/deploy@a8c759e]: (no justification provided)
* 08:01 andrew@deploy1001: Finished deploy [horizon/deploy@a8c759e]: (no justification provided) (duration: 02m 03s)
* 07:59 andrew@deploy1001: Started deploy [horizon/deploy@a8c759e]: (no justification provided)
* 07:55 moritzm: installing libonig security updates
* 07:46 andrew@deploy1001: Finished deploy [horizon/deploy@a8c759e]: (no justification provided) (duration: 03m 11s)
* 07:43 andrew@deploy1001: Started deploy [horizon/deploy@a8c759e]: (no justification provided)
* 07:42 andrew@deploy1001: Finished deploy [horizon/deploy@1ac26da]: (no justification provided) (duration: 00m 08s)
* 07:41 andrew@deploy1001: Started deploy [horizon/deploy@1ac26da]: (no justification provided)
* 07:41 andrew@deploy1001: Finished deploy [horizon/deploy@1ac26da]: (no justification provided) (duration: 03m 23s)
* 07:38 andrew@deploy1001: Started deploy [horizon/deploy@1ac26da]: (no justification provided)
* 07:38 moritzm: installing libav security updates
* 07:37 andrew@deploy1001: Finished deploy [horizon/deploy@1ac26da]: (no justification provided) (duration: 00m 07s)
* 07:37 andrew@deploy1001: Started deploy [horizon/deploy@1ac26da]: (no justification provided)
* 03:58 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 03:55 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 03:53 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 03:53 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 02:12 reedy@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/SecurePoll/cli/dump.php: [[phab:T239968|T239968]] (duration: 01m 04s)
* 01:34 reedy@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/SecurePoll/cli/dump.php: [[phab:T239968|T239968]] (duration: 01m 00s)
* 01:25 reedy@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/SecurePoll/cli/dump.php: [[phab:T239968|T239968]] (duration: 01m 01s)
* 01:09 ejegg: updated fundraising internal dashboard from {{Gerrit|3a93d2aba4}} to {{Gerrit|d9d74429baa}}
* 01:08 ejegg: updated payments-wiki from {{Gerrit|81921bd04a}} to {{Gerrit|00632a397c}}
* 01:04 catrope@deploy1001: Synchronized private/PrivateSettings.php: HMAC value for Kask config ([[phab:T222099|T222099]]) (duration: 00m 59s)
* 01:02 reedy@deploy1001: Synchronized private/PrivateSettings.php: wmgSessionStoreHMACKey [[phab:T222099|T222099]] (duration: 01m 07s)
* 00:47 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: Use PHP serialization with HMAC for Kask session serialization ([[phab:T222099|T222099]]) (duration: 01m 01s)
* 00:08 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add *.archives.go.jp to $wgCopyUploadsDomains ([[phab:T238476|T238476]]) (duration: 01m 00s)
== 2019-12-05 ==
* 23:44 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T235263|T235263]] Turn off redirect on exact search match for Commons (duration: 01m 00s)
* 23:04 ebernhardson: [cloudelastic-chi] reduce indices.recovery.max_bytes_per_sec from 512mb->128mb
* 22:30 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:28 herron@cumin1001: START - Cookbook sre.hosts.downtime
* 22:07 krinkle@deploy1001: Synchronized wmf-config/: {{Gerrit|I64e5ebe5fcd6b}} - removes arclamp.php (duration: 01m 01s)
* 22:03 mutante: phabricator - git-ssh.wikimedia.org has been fixed and is up again ([[phab:T238956|T238956]])
* 22:01 mutante: phab1001 - restarting ssh-phab to listen on additional LVS IP
* 22:00 krinkle@deploy1001: Synchronized php-1.35.0-wmf.8/includes/libs/rdbms/database/: [[phab:T233342|T233342]] (duration: 01m 02s)
* 21:55 twentyafterfour: stopping phd on phab1003 and starting on phab1001
* 21:50 mutante: phab1003 - remove IPv6 service IP for git-ssh from lo:LVS
* 21:34 mutante: puppetmaster2001: deleting /var/run/confd-template/.git-ssh*.err to fix confd template compilation alerts
* 21:33 mutante: puppetmaster1001: deleting /var/run/confd-template/.git-ssh*.err to fix confd template compilation alerts
* 21:19 mutante: phab1001 - systemctl restart ssh-phab (to make it listen on IPv6, race between puppet adding the IP and starting the service)
* 21:09 bblack: ns0.wikimedia.org: restore routing to authdns1001
* 21:03 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=phab1001-vcs.eqiad.wmnet
* 21:00 mutante: phab1001 - reload apache2, removed /ws/ rewrite for wstunnel for aphlict
* 21:00 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:58 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 20:56 bblack: cr[12]-eqiad: delete leftover static route of ns2->authdns1001 from esams work, which was blinding icinga to the real ns2 :P
* 20:49 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 20:48 twentyafterfour: successfully migrated to phab1001 with no apparent user impact!
* 20:47 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 20:46 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 20:43 bblack: ns0.wikimedia.org: re-routing auth traffic from authdns1001 (reimaging) to dns1001
* 20:41 mutante: running puppet on all cp* for phab change
* 20:36 volker-e@deploy1001: Finished deploy [design/style-guide@437023f]: Deploy design/style-guide:  (duration: 00m 08s)
* 20:36 volker-e@deploy1001: Started deploy [design/style-guide@437023f]: Deploy design/style-guide:
* 20:29 twentyafterfour: migrating back to phab1001, minimal downtime expected
* 20:12 mutante: phab1001 - rebooting to hopefully clear "microcode vuln" icinga alert
* 20:11 onimisionipe: ban cloudelastic1002 from shard allocation - [[phab:T230088|T230088]]
* 20:10 bblack: ns1.wikimedia.org: restoring normal routing to the newly-reimaged authdns2001
* 19:56 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:53 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 19:47 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Linter/extension.json: SWAT: {{Gerrit|afcfdce}}: Revert "Revert "Implement ParserLogLinterData hook"" (3/3, [[phab:T238456|T238456]]) (duration: 01m 00s)
* 19:46 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Linter/includes/ApiRecordLint.php: SWAT: {{Gerrit|afcfdce}}: Revert "Revert "Implement ParserLogLinterData hook"" (2/3, [[phab:T238456|T238456]]) (duration: 01m 09s)
* 19:44 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Linter/includes/Hooks.php: SWAT: {{Gerrit|afcfdce}}: Revert "Revert "Implement ParserLogLinterData hook"" (1/3, [[phab:T238456|T238456]]) (duration: 01m 11s)
* 19:41 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Linter/includes/ApiRecordLint.php: SWAT: {{Gerrit|7b7f326}}: Implement ParserLogLinterData hook (3/3, [[phab:T238456|T238456]]) (duration: 01m 04s)
* 19:39 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Linter/extension.json: SWAT: {{Gerrit|7b7f326}}: Implement ParserLogLinterData hook (2/3, [[phab:T238456|T238456]]) (duration: 01m 05s)
* 19:37 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Linter/includes/Hooks.php: SWAT: {{Gerrit|7b7f326}}: Implement ParserLogLinterData hook (1/3, [[phab:T238456|T238456]]) (duration: 01m 09s)
* 19:35 mutante: Icinga: delete all downtimes for mw2259. Scheduling Icinga downtimes is tricky business. If you add some for hardware failure and they are too short, you cause Icinga spam; if they are too long and the dcops operator is amazingly fast like Papaul, your server is back in production but not monitored, and you have to click a million times in the web UI to remove them.
* 19:34 bblack: ns1.wikimedia.org: re-route authdns traffic from authdns2001 (to be reimaged) -> dns2001 temporarily - [[phab:T239667|T239667]]
* 19:28 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Linter: SWAT: {{Gerrit|e0a2059}}: Revert "Implement ParserLogLinterData hook" (duration: 01m 01s)
* 19:19 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Linter/: SWAT: {{Gerrit|b376528}}: Revert "Implement ParserLogLinterData hook" (duration: 01m 01s)
* 19:15 urbanecm@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 19:14 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Linter: SWAT: {{Gerrit|839c383}}: Implement ParserLogLinterData hook ([[phab:T238456|T238456]]) (duration: 01m 02s)
* 18:40 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2259.codfw.wmnet
* 18:25 kevinbazira@deploy1001: Finished deploy [ores/deploy@6dd1fef]: [[phab:T238839|T238839]] (duration: 17m 20s)
* 18:08 kevinbazira@deploy1001: Started deploy [ores/deploy@6dd1fef]: [[phab:T238839|T238839]]
* 17:38 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:36 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 17:31 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@c29a758]: deploy repo to search-airflow dsh group (duration: 00m 13s)
* 17:30 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@c29a758]: deploy repo to search-airflow dsh group
* 17:23 cdanis: ✔️ cdanis@install1002.wikimedia.org ~ 🕧☕ sudo -E reprepro -C main include stretch-wikimedia prometheus-atlas-exporter_1.0+git20191204.ffafab7-1_amd64.changes
* 17:18 effie: reimage mw2260, yes again
* 16:47 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@87b25f2]: initial airflow dags/plugins (duration: 00m 06s)
* 16:47 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@87b25f2]: initial airflow dags/plugins
* 16:40 brion: running `requeueTranscodes.php --error --throttle` on mwmaint1002 to clean up [[phab:T239831|T239831]]-related broken video transcodes. will raise usage on video scalers for a while.
* 16:33 elukey: execute clear bfd session address fe80::5e5e:ab00:d3d:85ce on cr3-knams
* 16:32 elukey: execute clear bfd session address fe80::7a4f:9b00:d4e:8004 on cr1-eqiad
* 16:20 elukey: execute clear bfd session address 208.80.154.208 on cr2-eqord
* 15:50 anomie@deploy1001: Finished scap: Backporting fix for [[phab:T239428|T239428]] (duration: 33m 20s)
* 15:49 ejegg: re-enabled creating CiviMail activities when sending Thank You emails
* 15:44 jynus: restart backup1001, overloaded [[phab:T234900|T234900]]
* 15:43 akosiaris@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 15:43 moritzm: upgrading the reimaged video scalers back to the row-mt enabled ffmpeg [[phab:T239831|T239831]]
* 15:41 ejegg: updated Fundraising CiviCRM from {{Gerrit|4a72ad4e63}} to {{Gerrit|30cdc5fa59}}
* 15:17 anomie@deploy1001: Started scap: Backporting fix for [[phab:T239428|T239428]]
* 15:16 onimisionipe: run osm-import on maps1004 - [[phab:T239728|T239728]]
* 14:52 cdanis@deploy1001: Synchronized src/Noc/WmfClusters.php: {{Gerrit|c0fe7c410}} clarify loads output (earlier push was {{Gerrit|7963fdcd2}} sort clusters naturally) (duration: 00m 59s)
* 14:52 onimisionipe: disable puppet on maps100[1-3].eqiad.wmnet - [[phab:T239728|T239728]]
* 14:51 onimisionipe: disable tilerator on maps100[1-3].eqiad.wmnet - [[phab:T239728|T239728]]
* 14:50 cdanis@deploy1001: Synchronized docroot/noc/db.php: {{Gerrit|c0fe7c410}} noc/db.php: clarify loads output (duration: 01m 01s)
* 14:39 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:37 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:25 Lucas_WMDE: 14:20:08 <effie> reimage mw2260
* 13:09 godog: bounce mtail on mw1240
* 13:01 _joe_: restarted mtail on mw1239
* 12:41 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:39 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:21 effie: Reimage mw2261.codfw.wmnet
* 12:16 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|6c9d168}}: Fix namespace name - napwikisource ([[phab:T239547|T239547]]) (duration: 01m 02s)
* 10:48 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:44 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:38 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 10:35 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 10:26 effie: reimage mw2260.codfw.wmnet
* 10:13 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 09:54 ema: text@esams: disable ats-be origin server request coalescing [[phab:T238494|T238494]]
* 09:07 marostegui: Upgrade db2094 and db2095
* 08:38 marostegui: Upgrade db2078
* 08:09 marostegui: Upgrade pc2007, pc2008, pc2009, pc2010
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1062 from etcd [[phab:T239188|T239188]]', diff saved to https://phabricator.wikimedia.org/P9821 and previous config saved to /var/cache/conftool/dbconfig/20191205-080909-marostegui.json
* 08:03 elukey: remove logstash_cleanup_indices_apifeatureusage-search.svc.codfw.wmnet and logstash_cleanup_indices_apifeatureusage-search.svc.eqiad.wmnet from logstash1025,logstash1024,logstash1023,logstash2024,logstash2025 to reduce cronspam - [[phab:T234854|T234854]]
* 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9820 and previous config saved to /var/cache/conftool/dbconfig/20191205-074200-marostegui.json
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9819 and previous config saved to /var/cache/conftool/dbconfig/20191205-073209-marostegui.json
* 07:29 _joe_: ran apt-get install manually on kubestagetcd1001 to fix broken packages
* 07:25 _joe_: manually running package_builder_Clean_up_build_directory.service on boron
* 07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9818 and previous config saved to /var/cache/conftool/dbconfig/20191205-072314-marostegui.json
* 07:22 _joe_: umounting /proc,/sys,/dev from /var/cache/pbuilder/build/cow.6815 on boron to allow reaping it away
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1099:3311, db1099:3318', diff saved to https://phabricator.wikimedia.org/P9817 and previous config saved to /var/cache/conftool/dbconfig/20191205-071445-marostegui.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311, db1099:3318 for upgrade', diff saved to https://phabricator.wikimedia.org/P9816 and previous config saved to /var/cache/conftool/dbconfig/20191205-070631-marostegui.json
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9815 and previous config saved to /var/cache/conftool/dbconfig/20191205-065536-marostegui.json
* 06:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:51 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9814 and previous config saved to /var/cache/conftool/dbconfig/20191205-064845-marostegui.json
* 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9813 and previous config saved to /var/cache/conftool/dbconfig/20191205-063103-marostegui.json
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3317, db1101:3318', diff saved to https://phabricator.wikimedia.org/P9812 and previous config saved to /var/cache/conftool/dbconfig/20191205-061453-marostegui.json
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3317 for upgrade', diff saved to https://phabricator.wikimedia.org/P9811 and previous config saved to /var/cache/conftool/dbconfig/20191205-055756-marostegui.json
* 03:37 twentyafterfour: leaving phabricator on phab1003 for tonight while phab1001 raid syncs, will pick it up tomorrow to decide where to go from here
* 03:32 twentyafterfour@deploy1001: Finished deploy [phabricator/deployment@UNKNOWN]: deploy release/2019-08-22/1 to phab1001 (duration: 01m 36s)
* 03:30 twentyafterfour@deploy1001: Started deploy [phabricator/deployment@UNKNOWN]: deploy release/2019-08-22/1 to phab1001
* 03:29 twentyafterfour@deploy1001: Finished deploy [phabricator/deployment@UNKNOWN]: deploy release/2019-08-22/1 to phab1001 (duration: 00m 22s)
* 03:29 twentyafterfour@deploy1001: Started deploy [phabricator/deployment@UNKNOWN]: deploy release/2019-08-22/1 to phab1001
* 03:07 mutante: phab1001 - now using AHCI mode after reinstall, performance much better. rsyncing /srv/repos from phab1003 again
* 02:32 mutante: phab1001 - signed new puppet cert - initial puppet run in progress
* 02:27 mutante: phab1001 - fixed boot order in BIOS to boot only from HDD, back at login
* 02:12 ejegg: updated payments-wiki from {{Gerrit|f61c9f0692}} to {{Gerrit|81921bd04a}}
* 01:21 mutante: phab1001 - rebooting to BIOS once more - "The settings were saved successfully."
* 01:19 twentyafterfour: phab1001 back, still in legacy ide mode
* 01:12 mutante: phab1001 - enabling Write Cache in BIOS
* 01:07 mutante: phab1001 - System BIOS Settings > SATA Settings > Embedded SATA: switch from ATA to AHCI mode ([[phab:T238956|T238956]])
* 01:05 mutante: phab1001 - powercycling
* 01:04 mutante: telling phab1001 to boot into BIOS next time it boots via mgmt console (https://wikitech.wikimedia.org/wiki/Platform-specific_documentation/Dell_PowerEdge_RN30#Reboot_and_boot_into_BIOS_then_console)
* 01:03 twentyafterfour: phabricator switched back to phab1003 - reimaging phab1001 now
* 00:49 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=phab1001-vcs.eqiad.wmnet
* 00:49 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=phab1001-vcs.eqiad.wmnet
* 00:43 cwhite@cumin1001: dbctl commit (dc=all): 'Depool db1062 [[phab:T239874|T239874]]', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20191205-004256-cwhite.json
== 2019-12-04 ==
* 23:38 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.35.0-wmf.5"
* 23:35 brennen@deploy1001: Scap failed!: 9/11 canaries failed their endpoint checks(http://en.wikipedia.org)
* 23:30 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.8 (duration: 01m 01s)
* 23:29 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.8
* 23:22 jforrester@deploy1001: Synchronized wmf-config/logging.php: Keep  test consistent w/ operations/puppet, for logging (duration: 01m 02s)
* 23:21 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Keep  test consistent w/ operations/puppet, for CS (duration: 01m 03s)
* 22:49 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable VisualEditor on Wikitech (and Labs Wikitech) (duration: 01m 02s)
* 22:45 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@e6afe36]: Update mobileapps to {{Gerrit|9e9b042}} (duration: 05m 48s)
* 22:39 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@e6afe36]: Update mobileapps to {{Gerrit|9e9b042}}
* 22:39 bstorm_: powered off cloudstore1008, disabled sync from cloudstore1009, and downtimed both cloudstore1008 and cloudstore1009 for memory module replacement [[phab:T239569|T239569]]
* 22:37 bstorm_: poweroff cloudstore1008 for memory module replacement
* 22:24 RoanKattouw: [[phab:T208369|T208369]] ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php kowiki --cutoff 350
* 22:21 RoanKattouw: [[phab:T208369|T208369]] ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php cswiki --cutoff 350
* 21:48 eileen: civicrm revision changed from {{Gerrit|6812488f3a}} to {{Gerrit|4a72ad4e63}}, config revision is {{Gerrit|9f4db1edad}} (CiviCRM security patches)
* 21:38 rzl@cumin1001: conftool action : set/pooled=yes; selector: name=mw226[3456]\.codfw\.wmnet
* 21:11 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:09 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 20:47 milimetric@deploy1001: Finished deploy [analytics/refinery@fc710ec] (thin): Weekly train deploy to labs/notebooks (duration: 00m 07s)
* 20:47 milimetric@deploy1001: Started deploy [analytics/refinery@fc710ec] (thin): Weekly train deploy to labs/notebooks
* 20:33 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:31 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 20:28 milimetric@deploy1001: Finished deploy [analytics/refinery@fc710ec]: Weekly train deploy (duration: 07m 09s)
* 20:28 brennen@deploy1001: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.35.0-wmf.5"
* 20:21 milimetric@deploy1001: Started deploy [analytics/refinery@fc710ec]: Weekly train deploy
* 20:15 milimetric@deploy1001: Finished deploy [analytics/refinery@fc710ec]: Weekly train deploy (duration: 06m 37s)
* 20:14 brennen@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.8 (duration: 01m 29s)
* 20:12 brennen@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.8
* 20:09 milimetric@deploy1001: Started deploy [analytics/refinery@fc710ec]: Weekly train deploy
* 19:56 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:54 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 19:43 twentyafterfour@deploy1001: Finished deploy [phabricator/deployment@e4e2b22]: deploy phabricator to phab2001.codfw.wmnet (duration: 00m 31s)
* 19:43 twentyafterfour@deploy1001: Started deploy [phabricator/deployment@e4e2b22]: deploy phabricator to phab2001.codfw.wmnet
* 19:38 milimetric@deploy1001: deploy aborted: Weekly train deploy (duration: 00m 21s)
* 19:38 milimetric@deploy1001: Started deploy [analytics/refinery@c8de2ab]: Weekly train deploy
* 19:21 Amir1: morning SWAT is done
* 19:19 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:17 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: SWAT: [[gerrit:554330{{!}}Remove no-op 'jquery.ui.core.styles' from FullEntityParserOutputGenerator]] ([[phab:T219604|T219604]] [[phab:T239594|T239594]]) (duration: 01m 06s)
* 19:16 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 18:55 bblack: dns1001: back to normal again
* 18:54 bblack: dns1001: stop bird.service again, briefly
* 18:52 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:50 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:49 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:46 bblack: dns1001: restart bird.service
* 18:45 arlolra: Updated Parsoid to {{Gerrit|b81bbf4}} ([[phab:T239643|T239643]], [[phab:T239830|T239830]], [[phab:T238456|T238456]], [[phab:T239841|T239841]])
* 18:41 bblack: dns1001: stopping just bird
* 18:32 arlolra@deploy1001: Finished deploy [parsoid/deploy@0910e18]: Updating Parsoid to {{Gerrit|b81bbf4}} (duration: 08m 11s)
* 18:24 arlolra@deploy1001: Started deploy [parsoid/deploy@0910e18]: Updating Parsoid to {{Gerrit|b81bbf4}}
* 18:08 bblack: dns1002: back to normal state
* 18:05 bblack: dns1002: stopping recursive dns to test failure theory (same method as pre-reimaging earlier, intended to not cause impact)
* 17:54 bblack: dns1001: back to normal state
* 17:51 bblack: dns1001: stopping recursive dns to test failure theory (same method as pre-reimaging earlier, intended to not cause impact)
* 17:50 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: [[phab:T229407|T229407]], part III (duration: 01m 01s)
* 17:25 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=dns[12]001.wikimedia.org
* 17:25 _joe_: repooling mw1348
* 17:21 _joe_: depooling mw1348 for debugging
* 17:15 jynus: killing dump threads on db1118 [[phab:T143870|T143870]]
* 17:13 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:11 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 17:09 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:07 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 16:49 bblack@cumin1001: conftool action : set/pooled=no; selector: name=dns[12]001.wikimedia.org
* 16:48 bblack: dns[12]001 - reimaging to buster
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=nginx,name=mw2267.codfw.wmnet,cluster=jobrunner
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=apache2,name=mw2267.codfw.wmnet,cluster=jobrunner
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=nginx,name=mw2267.codfw.wmnet,cluster=videoscaler
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,service=apache2,name=mw2267.codfw.wmnet,cluster=videoscaler
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: name=mw2272.codfw.wmnet,dc=codfw,service=nginx,cluster=appserver
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: name=mw2272.codfw.wmnet,dc=codfw,service=apache2,cluster=appserver
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: name=mw2273.codfw.wmnet,dc=codfw,cluster=appserver,service=nginx
* 16:48 rzl@cumin1001: conftool action : set/pooled=yes; selector: name=mw2273.codfw.wmnet,dc=codfw,cluster=appserver,service=apache2
* 16:33 ejegg: updated fundraising CiviCRM from {{Gerrit|970b7b214b}} to {{Gerrit|6812488f3a}}
* 16:32 effie: enable puppet on mwdebug1001
* 16:32 effie: enable puppet on mw1348
* 16:30 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:28 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 16:25 effie: disable puppet on mw1348
* 15:57 papaul: rebooting ms-fe2007 for HW maintenance
* 15:49 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:47 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 15:29 moritzm: installing mariadb 10.3 updates from Buster 10.2 point release (client libs/tools only)
* 15:28 mobrovac@deploy1001: Finished deploy [restbase/deploy@f4b752e]: Parsoid: Set title when sending html2html reqs; Mirror 6% of html2html reqs to Parsoid/PHP - [[phab:T239768|T239768]] [[phab:T239643|T239643]] (duration: 16m 02s)
* 15:26 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:24 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 15:19 ejegg: updated fundraising CiviCRM from {{Gerrit|0f51030071}} to {{Gerrit|970b7b214b}}
* 15:15 ejegg: disabled debug logging for Ingenico on payments-wiki
* 15:12 mobrovac@deploy1001: Started deploy [restbase/deploy@f4b752e]: Parsoid: Set title when sending html2html reqs; Mirror 6% of html2html reqs to Parsoid/PHP - [[phab:T239768|T239768]] [[phab:T239643|T239643]]
* 15:09 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: [[phab:T229407|T229407]], part II (duration: 01m 02s)
* 15:07 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:07 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/includes/ParserOutput/FullEntityParserOutputGenerator.php: [[phab:T229407|T229407]] (duration: 01m 00s)
* 15:05 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2135 as master for s10 in codfw', diff saved to https://phabricator.wikimedia.org/P9806 and previous config saved to /var/cache/conftool/dbconfig/20191204-145349-marostegui.json
* 14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2135 in m5 codfw', diff saved to https://phabricator.wikimedia.org/P9805 and previous config saved to /var/cache/conftool/dbconfig/20191204-145145-marostegui.json
* 14:40 rzl@cumin1001: conftool action : set/pooled=yes; selector: service=apache2,cluster=appserver,dc=codfw,name=mw2274.codfw.wmnet
* 14:40 rzl@cumin1001: conftool action : set/pooled=yes; selector: service=nginx,cluster=appserver,dc=codfw,name=mw2274.codfw.wmnet
* 14:31 moritzm: test ldap-corp2001 as LDAP server on mx2001
* 14:24 bblack: ns2 authdns: re-route from ganeti3003 to dns3001 - [[phab:T236479|T236479]]
* 14:10 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=dns5001.wikimedia.org
* 14:04 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=dns[34]001.wikimedia.org
* 13:59 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:57 rzl@cumin1001: START - Cookbook sre.hosts.downtime
* 13:54 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:52 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 13:50 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 13:45 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:43 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 13:24 bblack@cumin1001: conftool action : set/pooled=no; selector: name=dns[345]001.wikimedia.org
* 13:24 onimisionipe: downtimed maps1004 - [[phab:T239728|T239728]]
* 13:23 bblack: dns[345]001 - starting downtimes/etc for reimage to buster...
* 12:31 filippo@cumin1001: conftool action : set/pooled=no; selector: name=ms-fe2007.codfw.wmnet
* 12:29 Urbanecm: EU SWAT done
* 12:28 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/WikimediaMessages/: SWAT: {{Gerrit|bbf2a33}}: Change Schema Revision of WMDEBannerEvents ([[phab:T239430|T239430]]) (duration: 01m 02s)
* 12:26 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/WikimediaMessages/: SWAT: {{Gerrit|b3ef5cd}}: Change Schema Revision of WMDEBannerEvents ([[phab:T239430|T239430]]) (duration: 01m 04s)
* 11:38 jbond42: puppet enabled across the fleet and new CA certificate installed
* 11:31 akosiaris: drain kubernetes1002 for test of nf_conntrack changes
* 11:23 jbond42: enable puppet in eqiad and deploy updated CA
* 11:13 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
* 10:54 jbond42: enable puppet in codfw and deploy updated CA
* 10:46 jbond42: enable puppet in esams and deploy updated CA
* 10:42 jbond42: enable puppet in ulsfo and deploy updated CA
* 10:31 gehel@cumin1001: START - Cookbook sre.wdqs.restart
* 10:31 gehel: rolling restart of wdqs for config change (event logging) - [[phab:T101013|T101013]]
* 10:31 jbond42: enable puppet in eqsin and deploy updated CA
* 10:24 marostegui: stop replication and mysql on db2107 (s2 codfw master) to test puppet CA changes
* 10:21 marostegui: stop replication and mysql on db2071 to test puppet CA changes
* 10:02 jbond42: disabling puppet across the fleet to start CA update change 548241
* 09:29 godog: roll-restart logstash7 in codfw/eqiad after https://gerrit.wikimedia.org/r/c/operations/puppet/+/554472
* 09:15 marostegui: Reload labsdb1010 after reimporting wikidatawiki.page - [[phab:T238399|T238399]]
* 09:06 moritzm: updated jenkins on apt.wikimedia.org to 2.190.3 ([[phab:T239586|T239586]])
* 08:05 effie: Restart php7-fpm on mw1348
* 07:09 marostegui: Depool labsdb1010 to reimport wikidatawiki.page - [[phab:T238399|T238399]]
* 07:02 marostegui: Repool labsdb1011
* 06:36 mutante: removed LVS IP for git-ssh from interface on phab1003
* 06:25 dzahn@cumin1001: conftool action : set/weight=10; selector: name=phab1001-vcs.eqiad.wmnet
* 06:13 mutante: phab1001 - running rsync of /srv/repos with --delete because it's larger than the source by about 5GB - deleting objects to match phab1003, former prod server. now both 50G ([[phab:T238956|T238956]])
* 06:04 marostegui: Depool labsdb1011
* 06:01 mutante: rsyncing /srv/repos data once again. pulling from phab1003 to phab1001 ([[phab:T238956|T238956]])
* 05:51 marostegui: Deploy schema change on s3 primary master (db1123)
* 04:59 mutante: removed downtime for phabricator.wikimedia.org meta service (paging)
* 04:58 mutante: phabricator maintenance ended for today - now running on phab1001 (buster)
* 04:58 mutante: install1002 - restarted isc-dhcpd
* 04:39 mutante: phab1001 - rebooting for BIOS config change
* 02:06 mutante: re-enabling puppet on phab1003 and phab1001.. switching active_server for puppet
* 01:55 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=phab1001-vcs.eqiad.wmnet
* 01:47 mutante: switching phab-vcs in conftool-data from phab1003 to phab1001, running puppet on conf*
* 01:45 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=phab1003-vcs.eqiad.wmnet
* 01:40 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=phab1003-vcs.eqiad.wmnet
* 01:37 twentyafterfour: re-enable phabricator writes (disable cluster.read-only)
* 01:33 twentyafterfour: phab1001.eqiad.wmnet : sudo chown root.www-data /srv/phab/phabricator/conf/local/www.json
* 01:29 mutante: phabricator currently under maintenance - db connection error is known
* 01:20 mutante: running puppet on cp-eqiad
* 00:49 ejegg: changed donations queue consumer and thank you mailer to use 3 minute cycles
* 00:41 twentyafterfour: switching phabricator to read-only mode
* 00:40 reedy@deploy1001: Synchronized php-1.35.0-wmf.8/skins/Vector/includes/templates/SearchComponent.mustache: {{Gerrit|I9776a3c355081dc5fec7753edf256f55dfe6045b}} (duration: 01m 01s)
== 2019-12-03 ==
* 23:47 volans: re-enabled meta-monitoring crontabs on wikitech-static after cleanup, reboot and fix wikitech-static's import errors
* 22:59 volans: apt-get dist-upgrade and reboot of wikitech-static host
* 22:36 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove  settings for closed wikis [[phab:T231178|T231178]] (duration: 01m 01s)
* 22:34 volans: disabled temporarily icinga meta-monitoring (disk full on the wikitech-static host)
* 22:34 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable the Wikisource extension on frwikisource [[phab:T239731|T239731]] (duration: 01m 00s)
* 22:22 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Read wmgDoNotRedirectOnSearchMatch to decide to enable auto-redirect search result change [[phab:T235263|T235263]] (duration: 01m 00s)
* 22:21 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wmgDoNotRedirectOnSearchMatch, default off, on for Test Commons [[phab:T235263|T235263]] (duration: 01m 01s)
* 22:03 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wgXmlDumpSchemaVersion to 0.1.0 everywhere [[phab:T238921|T238921]] [[phab:T174031|T174031]] (duration: 01m 03s)
* 21:40 eileen: civicrm revision changed from {{Gerrit|26b788378e}} to {{Gerrit|0f51030071}}, config revision is {{Gerrit|17b6730a72}} - includes 3 possible performance improvements - logging reduction, cache a query result & cache file existence
* 21:38 volker-e@deploy1001: Finished deploy [design/style-guide@02a92f7]: Deploy design/style-guide:  (duration: 00m 07s)
* 21:38 volker-e@deploy1001: Started deploy [design/style-guide@02a92f7]: Deploy design/style-guide:
* 21:09 sbassett: Deployed security patch for [[phab:T238768|T238768]] to wmf.8
* 21:03 sbassett: Deployed security patch for [[phab:T238768|T238768]] to wmf.5
* 20:43 mutante: mw2259 - did not come back from reboot after reimage, also mgmt not reachable ([[phab:T239054|T239054]])
* 20:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2257.codfw.wmnet
* 20:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet
* 20:38 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2258.codfw.wmnet
* 20:17 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=dns[12]002.wikimedia.org
* 20:00 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@c21a1ca]: Bump preq version for better logging around MW API timeouts (duration: 05m 46s)
* 19:54 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@c21a1ca]: Bump preq version for better logging around MW API timeouts
* 19:53 ejegg: shifted 20 more sec / cycle from donations QC to thank you mailer
* 19:41 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:38 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:36 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:35 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:34 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:31 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:30 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:28 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 19:24 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:22 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 19:16 Urbanecm: Morning SWAT done
* 19:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|5c83491}}: Create translation namespace on nap.wikisource ([[phab:T239547|T239547]]) (duration: 01m 03s)
* 19:12 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|45edf5a}}: Add partial blocks for scowiki ([[phab:T239493|T239493]]) (duration: 01m 00s)
* 19:09 bblack@cumin1001: conftool action : set/pooled=no; selector: name=dns1002.wikimedia.org
* 19:09 bblack@cumin1001: conftool action : set/pooled=no; selector: name=dns2002.wikimedia.org
* 19:08 bblack: reimaging dns1002 + dns2002 - [[phab:T239667|T239667]]
* 19:07 thcipriani@deploy1001: Synchronized scap/plugins: [[gerrit:526509{{!}}scap: prep and clean git ops for /srv/patches]] [[phab:T222240|T222240]] (no-op sync) (duration: 01m 01s)
* 17:52 ejegg: disabled PayPal orphan rectifier debug logging
* 17:48 ejegg: adjusted timing of thank you mailer and donations QC to give 5 more sec / cycle to TY mails
* 17:43 ejegg: updated fundraising CiviCRM from {{Gerrit|4f3341455f}} to {{Gerrit|26b788378e}}
* 17:22 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:19 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 17:18 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:14 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 17:13 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@498c3d1]: repair bulk daemon swift listings (duration: 05m 49s)
* 17:07 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@498c3d1]: repair bulk daemon swift listings
* 16:52 bblack: reimaging dns3002 + dns5002
* 16:30 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/MachineVision: Remove slow result randomization from the suggestions query (duration: 01m 03s)
* 16:02 ejegg: reduced donations queue consumer 10 sec per cycle and increased TY mail sender 10 sec per cycle
* 15:54 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 15:44 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 15:38 ejegg: updated fundraising CiviCRM from {{Gerrit|5cf2d2713f}} to {{Gerrit|4f3341455f}}
* 15:34 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
* 15:20 elukey: executing sudo cumin -b6 -s 20 -p 95 'A:mw-api-eqiad' 'restart-php7.2-fpm' on cumin1001
* 14:52 godog: swift eqiad-prod: final weight to ms-be105[7-9] - [[phab:T237438|T237438]]
* 14:24 ema: all cp-esams hosts switched to digicert-2019a certs [[phab:T238494|T238494]]
* 14:19 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:17 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 14:13 ema: cp-esams: re-enable puppet, switch to digicert-2019a certs https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/554291/ [[phab:T238494|T238494]]
* 14:06 ema: repool cp3050 with digicert-2019a [[phab:T238494|T238494]]
* 14:00 ema: cp3050: depool and switch to digicert-2019a [[phab:T238494|T238494]]
* 13:56 ema: cp-esams: disable puppet in preparation of digicert-2019a cert switch https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/554291/ [[phab:T238494|T238494]]
* 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1112 after schema change', diff saved to https://phabricator.wikimedia.org/P9802 and previous config saved to /var/cache/conftool/dbconfig/20191203-133231-marostegui.json
* 13:22 mobrovac@deploy1001: Finished deploy [restbase/deploy@92acf1e]: Revert mirroring html2html traffic to PHP - [[phab:T239643|T239643]] (duration: 10m 43s)
* 13:11 mobrovac@deploy1001: Started deploy [restbase/deploy@92acf1e]: Revert mirroring html2html traffic to PHP - [[phab:T239643|T239643]]
* 12:42 mobrovac@deploy1001: Finished deploy [restbase/deploy@41bb230]: Log all html2html errors coming from Parsoid/PHP - [[phab:T239643|T239643]] (duration: 14m 41s)
* 12:28 mobrovac@deploy1001: Started deploy [restbase/deploy@41bb230]: Log all html2html errors coming from Parsoid/PHP - [[phab:T239643|T239643]]
* 12:23 mobrovac@deploy1001: Finished deploy [restbase/deploy@b346ebf]: Mirror html2html traffic to Parsoid/PHP, take #2 (duration: 11m 17s)
* 12:12 mobrovac@deploy1001: Started deploy [restbase/deploy@b346ebf]: Mirror html2html traffic to Parsoid/PHP, take #2
* 12:12 mobrovac@deploy1001: Finished deploy [restbase/deploy@b346ebf]: Mirror html2html traffic to Parsoid/PHP - [[phab:T229015|T229015]] [[phab:T239643|T239643]] (duration: 13m 29s)
* 12:09 Amir1: EU SWAT is done
* 12:09 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:554164{{!}}Set read new for term store for items for client wikis up to Q1000 (T225057)]] (duration: 01m 00s)
* 11:58 mobrovac@deploy1001: Started deploy [restbase/deploy@b346ebf]: Mirror html2html traffic to Parsoid/PHP - [[phab:T229015|T229015]] [[phab:T239643|T239643]]
* 11:58 mobrovac@deploy1001: deploy aborted: Mirror html2html traffic to Parsoid/PHP - [[phab:T229015|T229015]] [[phab:T239643|T239643]] (duration: 00m 00s)
* 11:58 mobrovac@deploy1001: Started deploy [restbase/deploy@b346ebf]: Mirror html2html traffic to Parsoid/PHP - [[phab:T229015|T229015]] [[phab:T239643|T239643]]
* 11:36 hashar: Updated operations-puppet-tests-stretch-docker to fix pip cache directory
* 11:31 godog: refresh kibana fields for logstash-*
* 11:00 hashar: Updated operations-puppet-tests-stretch-docker CI job to use tox 3.10.0 and support various python 3 versions
* 10:37 ema: pool cp1083 with ATS backend [[phab:T227432|T227432]]
* 10:20 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:18 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 10:01 ema: depool cp1083 and reimage as text_ats [[phab:T227432|T227432]]
* 09:22 effie: Roll restart php-fpm mw[1240-1258,1261-1275,1319-1333].eqiad.wmnet
* 09:05 godog: downtime new logstash hosts in codfw/eqiad until thurs
* 09:02 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 09:02 akosiaris@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 09:00 akosiaris@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 08:48 elukey@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)
* 08:45 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 08:45 effie: Restart php-fpm on mw[1330-1333].eqiad.wmnet
* 08:45 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99)
* 08:45 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 08:35 ema: cp3050: set cache.max_open_read_retries=-1 and proxy.config.http.cache.max_open_write_retries=1 (default values) [[phab:T238494|T238494]]
* 08:30 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db1062 from config [[phab:T239188|T239188]] (duration: 01m 02s)
* 08:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db1062 from config [[phab:T239188|T239188]] (duration: 01m 08s)
* 08:20 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 08:19 akosiaris: apply calico rules for eventgate-logging-external. [[phab:T236386|T236386]]
* 08:18 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 08:14 volker-e@deploy1001: Finished deploy [design/style-guide@7978f0d]: Deploy design/style-guide:  (duration: 00m 06s)
* 08:14 volker-e@deploy1001: Started deploy [design/style-guide@7978f0d]: Deploy design/style-guide:
* 07:39 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 06:29 marostegui: Deploy schema change on db1112 with replication (this will generate lag on s3 on labs)
* 06:19 volker-e@deploy1001: Finished deploy [design/style-guide@8e08740]: Deploy design/style-guide:  (duration: 00m 08s)
* 06:19 volker-e@deploy1001: Started deploy [design/style-guide@8e08740]: Deploy design/style-guide:
* 06:07 marostegui: Stop MySQL on db1062 for decommissioning [[phab:T239188|T239188]]
* 06:00 marostegui: Remove db2065 from tendril and zarcillo [[phab:T239046|T239046]]
* 05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 05:53 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 05:50 ema: cp3050: ats-be restart with proxy.config.http.server_session_sharing.pool=thread [[phab:T238494|T238494]]
* 05:47 marostegui: Remove ar_comment triggers from s3 db1124:3313 - [[phab:T234704|T234704]]
* 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112 for schema change', diff saved to https://phabricator.wikimedia.org/P9798 and previous config saved to /var/cache/conftool/dbconfig/20191203-054528-marostegui.json
* 04:19 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/Rest/EntryPoint.php: disable IE6 safety checks for [[phab:T239666|T239666]] (duration: 01m 00s)
* 04:15 tstarling@deploy1001: Synchronized php-1.35.0-wmf.8/includes/Rest/EntryPoint.php: disable IE6 safety checks for [[phab:T239666|T239666]] (duration: 01m 01s)
* 03:53 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@d00c6ad]: Fix: Apply language headers to zhwiki mobile-html responses ([[phab:T239659|T239659]]) (duration: 05m 51s)
* 03:47 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@d00c6ad]: Fix: Apply language headers to zhwiki mobile-html responses ([[phab:T239659|T239659]])
* 02:54 mutante: mw1269 restarted nginx, php
* 02:48 mutante: mw1320, mw1321 restarted php-fpm
* 02:32 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T78711|T78711]] Display 'twice a month' or 'once a month' on cached reports (duration: 01m 19s)
* 02:25 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting testwiki => true for wmgUseCentralAuth, already implied by default (duration: 01m 24s)
* 02:23 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T237698|T237698]] Stop setting wmgUseDPL, unread (duration: 01m 11s)
* 02:21 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T237698|T237698]] Read wmgUseDynamicPageList not wmgUseDPL (duration: 01m 22s)
* 02:19 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T237698|T237698]] Set wmgUseDynamicPageList, less cryptic form of wmgUseDPL (duration: 01m 16s)
* 02:16 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Stop setting wgTorLoadNodes, not read for a while (duration: 01m 14s)
* 02:13 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop setting wgGEHelpPanelSearchEnabled, no longer used (duration: 01m 08s)
* 02:04 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T239091|T239091]] Enable Translate extension on sewikimedia, second try (duration: 01m 24s)
* 01:58 jforrester@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/VisualEditor/: [[phab:T239209|T239209]] Sanitize HTML on paste (duration: 01m 33s)
* 01:55 jforrester@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/VisualEditor/: [[phab:T239209|T239209]] Sanitize HTML on paste (duration: 01m 24s)
* 01:42 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99)
* 01:42 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 01:36 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2252.codfw.wmnet
* 01:35 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet
* 01:33 mutante: mw2250 - E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem.
* 01:33 mutante: mw2252 rebooting
* 01:26 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2254.codfw.wmnet
* 01:25 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2253.codfw.wmnet
* 01:22 mutante: mw2254 - rebooting (reimage script exited with segfault after reimage was done)
* 01:20 jforrester@deploy1001: Synchronized php-1.35.0-wmf.5/includes/diff/DifferenceEngine.php: [[phab:T236320|T236320]] Don't calculate amount of inbetween revisions for MCR undo (duration: 00m 59s)
* 01:15 jforrester@deploy1001: Synchronized dblists/wikidataclient.dblist: [[phab:T239318|T239318]] Add sewikimedia to wikidataclient (duration: 01m 03s)
* 01:05 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T239091|T239091]] Revert 'Enable Translate extension on sewikimedia' (duration: 01m 01s)
* 01:00 James_F: mwscript sql.php --wiki=sewikimedia php-1.35.0-wmf.8/extensions/Translate/sql/translate_<nowiki>{</nowiki>…<nowiki>}</nowiki>.sql [[phab:T239091|T239091]]
* 00:56 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T239091|T239091]] Enable Translate extension on sewikimedia (duration: 00m 57s)
* 00:54 James_F: mwscript sql.php --wiki=sewikimedia php-1.35.0-wmf.5/extensions/Wikibase/client/sql/entity_usage.sql
* 00:25 jforrester@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/Echo/includes/DiscussionParser.php: [[phab:T239275|T239275]] Fix type hint fatal from getUserLinks() (duration: 01m 16s)
* 00:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:09 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 00:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 00:05 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:03 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:02 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 00:01 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
== 2019-12-02 ==
* 23:37 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2251.codfw.wmnet
* 23:06 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2249.codfw.wmnet
* 23:06 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2248.codfw.wmnet
* 23:05 mutante: mw2248 - restart nginx (for some reason unit was running but not listening on 443 after reimage..now it does)
* 23:05 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:02 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 22:46 ejegg: updated payments-wiki from {{Gerrit|06a8c3cdff}} to {{Gerrit|f61c9f0692}}
* 22:44 bblack: reimaging dns4002 to buster - [[phab:T239667|T239667]]
* 22:07 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/MachineVision: Update text for no personal uploads message ([[phab:T238873|T238873]]) (duration: 01m 03s)
* 22:05 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 22:03 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:01 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:00 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:00 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:59 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:32 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2247.codfw.wmnet
* 21:31 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2246.codfw.wmnet
* 21:25 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2230.codfw.wmnet
* 21:25 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 21:23 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 21:22 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 20:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1078 after schema change', diff saved to https://phabricator.wikimedia.org/P9796 and previous config saved to /var/cache/conftool/dbconfig/20191202-205904-marostegui.json
* 20:47 ariel@cumin1001: conftool action : set/pooled=yes; selector: cluster=appserver,name=mw2232.codfw.wmnet,service=nginx,dc=codfw
* 20:47 ariel@cumin1001: conftool action : set/pooled=yes; selector: cluster=appserver,name=mw2232.codfw.wmnet,service=apache2,dc=codfw
* 20:47 ariel@cumin1001: conftool action : set/pooled=yes; selector: name=mw2233.codfw.wmnet,dc=codfw,service=nginx,cluster=appserver
* 20:46 ariel@cumin1001: conftool action : set/pooled=yes; selector: name=mw2233.codfw.wmnet,dc=codfw,service=apache2,cluster=appserver
* 20:46 ariel@cumin1001: conftool action : set/pooled=yes; selector: name=mw2234.codfw.wmnet,service=nginx,cluster=appserver,dc=codfw
* 20:46 ariel@cumin1001: conftool action : set/pooled=yes; selector: name=mw2234.codfw.wmnet,service=apache2,cluster=appserver,dc=codfw
* 20:46 ariel@cumin1001: conftool action : set/pooled=yes; selector: cluster=appserver,name=mw2231.codfw.wmnet,service=nginx,dc=codfw
* 20:46 ariel@cumin1001: conftool action : set/pooled=yes; selector: cluster=appserver,name=mw2231.codfw.wmnet,service=apache2,dc=codfw
* 20:36 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch Flow on all wikis to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 00m 59s)
* 20:35 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 20:26 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 20:21 mobrovac@deploy1001: Finished deploy [restbase/deploy@92acf1e]: Switch everything to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 14m 59s)
* 20:12 joal@deploy1001: Finished deploy [analytics/refinery@9cd234a] (thin): Analytics deploy - Fixes for today deploy (2) (duration: 00m 05s)
* 20:12 joal@deploy1001: Started deploy [analytics/refinery@9cd234a] (thin): Analytics deploy - Fixes for today deploy (2)
* 20:08 joal@deploy1001: Finished deploy [analytics/refinery@9cd234a]: Analytics deploy - Fixes for today deploy (2) (duration: 08m 08s)
* 20:06 mobrovac@deploy1001: Started deploy [restbase/deploy@92acf1e]: Switch everything to Parsoid/PHP - [[phab:T229015|T229015]]
* 20:05 reedy@deploy1001: Synchronized wmf-config/LabsServices.php: labslabslabs (duration: 01m 08s)
* 20:05 mobrovac@deploy1001: Finished deploy [restbase/deploy@92acf1e] (dev-cluster): Switch everything to Parsoid/PHP (duration: 02m 48s)
* 20:02 mobrovac@deploy1001: Started deploy [restbase/deploy@92acf1e] (dev-cluster): Switch everything to Parsoid/PHP
* 20:01 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:59 joal@deploy1001: Started deploy [analytics/refinery@9cd234a]: Analytics deploy - Fixes for today deploy (2)
* 19:59 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:56 ariel@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:55 ariel@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:55 ariel@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:51 ariel@cumin1001: START - Cookbook sre.hosts.downtime
* 19:51 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 19:50 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:50 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 19:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:50 ariel@cumin1001: START - Cookbook sre.hosts.downtime
* 19:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:50 ariel@cumin1001: START - Cookbook sre.hosts.downtime
* 19:48 ariel@cumin1001: START - Cookbook sre.hosts.downtime
* 19:37 mobrovac@deploy1001: Finished deploy [restbase/deploy@e69e2e5]: Switch everything but enwiki to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 13m 48s)
* 19:27 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:25 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:23 mobrovac@deploy1001: Started deploy [restbase/deploy@e69e2e5]: Switch everything but enwiki to Parsoid/PHP - [[phab:T229015|T229015]]
* 19:23 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 19:22 mobrovac@deploy1001: Finished deploy [restbase/deploy@e69e2e5] (dev-cluster): Switch everything but enwiki to Parsoid/PHP (duration: 06m 38s)
* 19:16 mobrovac@deploy1001: Started deploy [restbase/deploy@e69e2e5] (dev-cluster): Switch everything but enwiki to Parsoid/PHP
* 19:04 mobrovac@deploy1001: Finished deploy [restbase/deploy@6a24685]: Parsoid Proxy: Direct html2html traffic to JS; Stop honouring the variant header; Switch sr and zh wikis to PHP - [[phab:T229015|T229015]] (duration: 14m 11s)
* 18:58 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:56 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:50 mobrovac@deploy1001: Started deploy [restbase/deploy@6a24685]: Parsoid Proxy: Direct html2html traffic to JS; Stop honouring the variant header; Switch sr and zh wikis to PHP - [[phab:T229015|T229015]]
* 18:39 joal@deploy1001: Finished deploy [analytics/refinery@980298b] (thin): Analytics deploy - Fixes for today deploy (duration: 00m 06s)
* 18:39 joal@deploy1001: Started deploy [analytics/refinery@980298b] (thin): Analytics deploy - Fixes for today deploy
* 18:38 joal@deploy1001: Finished deploy [analytics/refinery@980298b]: Analytics deploy - Fixes for today deploy (duration: 08m 21s)
* 18:32 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:30 joal@deploy1001: Started deploy [analytics/refinery@980298b]: Analytics deploy - Fixes for today deploy
* 18:22 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:19 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:18 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:16 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@97d17f6]: New blazegraph and WDQS build plus GUI changes (duration: 15m 42s)
* 18:10 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:06 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:05 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 18:00 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@97d17f6]: New blazegraph and WDQS build plus GUI changes
* 17:56 mobrovac@deploy1001: Finished deploy [restbase/deploy@ff7862f]: Switch sr and zh wikipediae back to Parsoid/JS - [[phab:T229015|T229015]] (duration: 14m 06s)
* 17:51 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:49 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 17:42 mobrovac@deploy1001: Started deploy [restbase/deploy@ff7862f]: Switch sr and zh wikipediae back to Parsoid/JS - [[phab:T229015|T229015]]
* 17:29 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@deafe56]: Followup on cirrusSearchElasticWrite partitioning [[phab:T230495|T230495]] (duration: 01m 14s)
* 17:28 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@deafe56]: Followup on cirrusSearchElasticWrite partitioning [[phab:T230495|T230495]]
* 17:21 ssastry@deploy1001: Finished deploy [parsoid/deploy@743efb0]: Updating Parsoid to {{Gerrit|ca588b25}} + fix broken langconv library / deploy (duration: 07m 48s)
* 17:14 ssastry@deploy1001: Started deploy [parsoid/deploy@743efb0]: Updating Parsoid to {{Gerrit|ca588b25}} + fix broken langconv library / deploy
* 17:09 ejegg: disabled fundraising job omnimail_groupmember_load
* 16:45 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:43 ejegg: updated fundraising internal dashboard from {{Gerrit|8fc2726736}} to {{Gerrit|3a93d2aba4}}
* 16:43 effie: restart all API cluster in eqiad
* 16:43 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:42 hashar: Restarted CI Jenkins
* 16:41 mobrovac@deploy1001: Finished deploy [restbase/deploy@3516382]: Switch ru, sr and zh wikipediae to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 13m 53s)
* 16:41 ema: cp3050: ats-be restart with proxy.config.http.server_session_sharing.pool=global [[phab:T238494|T238494]]
* 16:32 ema: cp3053: repooling after firmware update [[phab:T239041|T239041]]
* 16:27 mobrovac@deploy1001: Started deploy [restbase/deploy@3516382]: Switch ru, sr and zh wikipediae to Parsoid/PHP - [[phab:T229015|T229015]]
* 16:19 effie: reimage mw1295.eqiad.wmnet mw1294.eqiad.wmnet mw1293.eqiad.wmnet
* 16:11 robh: cp3053 depooling and rebooting for firmware update [[phab:T239041|T239041]]
* 16:10 robh: cp3035 depooling and rebooting for firmware update [[phab:T239041|T239041]]
* 15:38 elukey@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)
* 15:38 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Parsoid VRS: Switch groups 0 and 1 to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 00m 59s)
* 15:35 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
* 15:30 mobrovac@deploy1001: Finished deploy [restbase/deploy@d6d5a6e]: Parsoid Proxy: Do not use the fall-back for linting transforms - [[phab:T239607|T239607]] (duration: 14m 51s)
* 15:26 effie: Rolling restart mw1345-1348
* 15:15 mobrovac@deploy1001: Started deploy [restbase/deploy@d6d5a6e]: Parsoid Proxy: Do not use the fall-back for linting transforms - [[phab:T239607|T239607]]
* 14:46 ema: cp-ats: set server_session_sharing.match=2 everywhere (puppet re-enable and run) [[phab:T238494|T238494]]
* 14:31 ema: cp-ats: merge server_session_sharing.match=2 (https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/553490/) with puppet disabled, test on cp3050 [[phab:T238494|T238494]]
* 14:18 godog: set grafana theme back to light, was dark for some reason
* 14:10 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:08 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078 for schema change', diff saved to https://phabricator.wikimedia.org/P9794 and previous config saved to /var/cache/conftool/dbconfig/20191202-135643-marostegui.json
* 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075 after schema change', diff saved to https://phabricator.wikimedia.org/P9793 and previous config saved to /var/cache/conftool/dbconfig/20191202-135543-marostegui.json
* 13:47 ema: power-cycle cp3053 [[phab:T239041|T239041]]
* 13:44 hashar: Restarted CI Jenkins
* 13:30 hashar: Restarted CI Jenkins
* 13:14 mobrovac@deploy1001: Finished deploy [restbase/deploy@eedba38]: Parsoid Proxy: Fixes - [[phab:T229015|T229015]] (duration: 14m 49s)
* 13:04 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:01 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:59 mobrovac@deploy1001: Started deploy [restbase/deploy@eedba38]: Parsoid Proxy: Fixes - [[phab:T229015|T229015]]
* 12:57 mobrovac@deploy1001: Finished deploy [restbase/deploy@eedba38] (dev-cluster): Parsoid Proxy: Fixes (duration: 02m 54s)
* 12:54 mobrovac@deploy1001: Started deploy [restbase/deploy@eedba38] (dev-cluster): Parsoid Proxy: Fixes
* 12:54 Urbanecm: EU SWAT done
* 12:50 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|d27fe78}}: Enable partial blocks on eswiki ([[phab:T239370|T239370]]) (duration: 01m 00s)
* 12:45 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|445bdc3}}: Remove `move-rootuserpages` from user on svwiki ([[phab:T238842|T238842]]) (duration: 01m 04s)
* 12:43 Urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/bawiki*.png
* 12:39 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: {{Gerrit|61a9563}}: Revert "Change bawiki logo to an anniversary one" ([[phab:T237070|T237070]]) (duration: 01m 06s)
* 12:37 effie: reimage mw1296.eqiad.wmnet
* 12:37 effie: reimage mw1298.eqiad.wmnet
* 12:31 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:554049{{!}}Set read new for term store for items of wikidata up to Q1000 (T225057)]] (duration: 01m 00s)
* 12:19 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.8/extensions/GrowthExperiments/: SWAT: [[gerrit:553402{{!}}Suggested edits: do not treat AQS lookup failure as error (T238178)]] (duration: 01m 02s)
* 11:58 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:56 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:50 filippo@cumin1001: conftool action : set/pooled=yes; selector: name=mw2229.codfw.wmnet
* 11:38 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: [[gerrit:554033{{!}} Bumping portals to master (T128546)]] (duration: 01m 00s)
* 11:37 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:554033{{!}} Bumping portals to master (T128546)]] (duration: 01m 04s)
* 11:07 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:05 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:03 moritzm: installing ruby2.1 security updates
* 10:59 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:57 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:54 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:52 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 10:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:43 moritzm: installing python-psutil security updates
* 10:42 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:42 effie: reimage mw1299.eqiad.wmnet
* 10:18 effie: reimage mw1290.eqiad.wmnet
* 10:18 effie: reimage mw1275.eqiad.wmnet
* 10:15 moritzm: installing file/libmagic regression update for jessie
* 10:08 filippo@cumin1001: conftool action : set/pooled=no; selector: name=mw2229.codfw.wmnet
* 09:52 godog: swift eqiad-prod: more weight to ms-be105[7-9] - [[phab:T237438|T237438]]
* 09:51 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:49 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:49 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:46 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:41 joal@deploy1001: Finished deploy [analytics/refinery@8991301] (thin): Regular analytics deploy - late from last week (thin) (duration: 00m 08s)
* 09:41 joal@deploy1001: Started deploy [analytics/refinery@8991301] (thin): Regular analytics deploy - late from last week (thin)
* 09:40 joal@deploy1001: Finished deploy [analytics/refinery@8991301]: Regular analytics deploy - late from last week (duration: 18m 22s)
* 09:23 effie: reimage mw1300.eqiad.wmnet
* 09:22 joal@deploy1001: Started deploy [analytics/refinery@8991301]: Regular analytics deploy - late from last week
* 09:16 moritzm: installing libvpx security updates
* 09:14 godog: extend graphite LVs on graphite1004 / graphite2003 by 200G
* 08:39 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:37 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:33 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:31 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:14 effie: reimage mw1287.eqiad.wmnet mw1288.eqiad.wmnet mw1289.eqiad.wmnet
* 08:08 effie: reimage mw1301.eqiad.wmnet
* 08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:18 andrewbogott: forcing a reboot of cloudstore1008 via mgmt console — it seems to have locked up
* 06:43 Urbanecm: Clear account creation throttle for several IPs ([[phab:T239465|T239465]])
* 06:38 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: New throttle rule for cawiki workshop ([[phab:T239465|T239465]]) (duration: 01m 03s)
* 06:00 marostegui: Compress s8 codfw master (lag might appear on codfw s8)
* 06:00 marostegui: Compress s4 codfw master (lag might appear on codfw s4)
* 05:56 marostegui: Deploy schema change on db1075
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075 for schema change', diff saved to https://phabricator.wikimedia.org/P9791 and previous config saved to /var/cache/conftool/dbconfig/20191202-055546-marostegui.json
* 05:53 marostegui: Compress db1099:3318 [[phab:T235599|T235599]]
* 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for compression', diff saved to https://phabricator.wikimedia.org/P9790 and previous config saved to /var/cache/conftool/dbconfig/20191202-055245-marostegui.json
== 2019-12-01 ==
* 23:27 ladsgroup@deploy1001: Started restart [mobileapps/deploy@70154b4]: Rolling restart of mobileapps
* 23:20 bblack: restarting AQS services in eqiad
* 23:15 eileen: process-control config revision is {{Gerrit|9750c318a0}} - jobs disabled
* 21:39 andrewbogott: restarted nova conductor and api on cloudcontrol1003 and 1004 to free up db connections ([[phab:T239168|T239168]])
== 2019-11-30 ==
* 15:47 Urbanecm: Reset email of SUL user Hayk.arabaget ([[phab:T239462|T239462]])
* 07:40 vgutierrez: repooling cp3057 - [[phab:T239502|T239502]]
* 07:30 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3057.esams.wmnet
* 07:30 vgutierrez: depool and powercycle cp3057 - [[phab:T239502|T239502]]
== 2019-11-29 ==
* 22:02 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:00 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 21:38 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:36 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 21:12 effie: reimage mw1302.eqiad.wmnet
* 20:50 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:47 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 20:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:13 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 19:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 19:42 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 19:19 effie: reimage mw1284.eqiad.wmnet
* 19:19 effie: reimage mw1303.eqiad.wmnet mw1283.eqiad.wmnet
* 17:26 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:24 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:51 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:49 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:42 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:40 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:22 filippo@cumin1001: conftool action : set/pooled=yes; selector: name=mw2228.codfw.wmnet
* 16:17 effie: reimage mw1274.eqiad.wmnet
* 16:14 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:12 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:40 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:38 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:25 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:23 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 15:11 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:09 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:59 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:45 effie: reimage mw1282.eqiad.wmnet
* 14:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:36 effie: reimage mw1323.eqiad.wmnet mw1297.eqiad.wmnet mw1273.eqiad.wmnet
* 14:35 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:27 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:25 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:14 filippo@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2228.codfw.wmnet
* 14:13 godog: reimage mw2228 for partman tests
* 14:02 effie: reimage mw1271.eqiad.wmnet mw1272.eqiad.wmnet mw1304.eqiad.wmnet
* 13:48 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:46 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:33 jynus: reenable puppet on dbprov2001, backup1001
* 13:29 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:27 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:59 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:57 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:48 jynus: disabling puppet also on backup1001 to test recoveries
* 12:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:34 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:33 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:24 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:22 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:58 effie: reimage mw1305.eqiad.wmnet mw1265.eqiad.wmnet mw1270.eqiad.wmnet
* 11:51 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:49 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:39 jynus: disabling puppet on dbprov2001 to test recoveries
* 11:34 effie: reimage mw1268.eqiad.wmnet mw1280.eqiad.wmnet mw1281.eqiad.wmnet
* 11:24 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:22 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:20 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:20 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:03 Lucas_WMDE: <effie> 10:58:17 log reimage mw1268.eqiad.wmnet mw1280.eqiad.wmnet mw1281.eqiad.wmne
* 11:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:58 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:48 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:47 elukey@deploy1001: Finished deploy [analytics/refinery@97015e4] (thin): Deploy thin Analytics Refinery (no jars/git-fat-obj) to notebook and labstore hosts (duration: 00m 08s)
* 10:47 elukey@deploy1001: Started deploy [analytics/refinery@97015e4] (thin): Deploy thin Analytics Refinery (no jars/git-fat-obj) to notebook and labstore hosts
* 10:46 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:22 effie: reimage mw1306.eqiad.wmnet mw1264.eqiad.wmnet mw1279.eqiad.wmnet
* 09:56 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:53 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:53 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:51 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:49 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:46 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:33 marostegui: Remove triggers from db2094:3313 - [[phab:T234704|T234704]]
* 09:33 marostegui: Stop replication on db2105 (s3 codfw) for schema change
* 09:23 effie: reimage mw1263.eqiad.wmnet mw1307.eqiad.wmnet
* 09:03 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:01 volans: temporarily disabling puppet on 'R:keyholder::agent' to merge gerrit:operations/puppet/+/553460 - [[phab:T239386|T239386]]
* 09:00 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:59 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:45 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:43 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:18 effie: reimage mw2223.codfw.wmnet mw2222.codfw.wmnet mw2221.codfw.wmnet mw2220.codfw.wmnet
* 07:50 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 07:48 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 07:48 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 07:48 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 07:25 effie: reimage mw1312.eqiad.wmnet mw1308.eqiad.wmnet mw1261.eqiad.wmnet
* 06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:15 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1134 after schema change', diff saved to https://phabricator.wikimedia.org/P9781 and previous config saved to /var/cache/conftool/dbconfig/20191129-055845-marostegui.json
* 05:08 krinkle@deploy1001: Synchronized php-1.35.0-wmf.5/includes/exception/MWExceptionHandler.php: {{Gerrit|532f4aba96d85}} (duration: 01m 03s)
== 2019-11-28 ==
* 23:46 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:43 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 23:21 effie: reimage mw1329.eqiad.wmnet
* 23:01 effie: restart cp1087
* 22:58 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:56 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 22:49 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:47 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 22:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:35 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 21:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:42 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 21:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:33 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 21:26 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:23 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 21:19 effie: reimage mw1309.eqiad.wmnet
* 21:19 effie: reimage mw1323.eqiad.wmnet
* 21:11 effie: reimage mw1316.eqiad.wmnet mw1315.eqiad.wmnet
* 20:28 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:26 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 20:13 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:10 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 20:04 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:03 effie: reimage mw1313.eqiad.wmnet
* 20:02 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 19:48 effie: reimage mw1331.eqiad.wmnet mw1330.eqiad.wmnet mw1310.eqiad.wmnet
* 18:59 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:57 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:54 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:52 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:50 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 18:48 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 18:41 marostegui: Deploy schema change on db1134
* 18:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134 for schema change', diff saved to https://phabricator.wikimedia.org/P9780 and previous config saved to /var/cache/conftool/dbconfig/20191128-183918-marostegui.json
* 18:29 effie: reimage mw1319.eqiad.wmnet mw1318.eqiad.wmnet
* 18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 after schema change', diff saved to https://phabricator.wikimedia.org/P9779 and previous config saved to /var/cache/conftool/dbconfig/20191128-180517-marostegui.json
* 17:54 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:52 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 17:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:42 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 17:19 effie: reimage mw1340.eqiad.wmnet mw1339.eqiad.wmnet
* 17:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:12 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:34 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:32 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:23 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:21 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 16:18 phamhi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:15 phamhi@cumin1001: START - Cookbook sre.hosts.downtime
* 16:07 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:05 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:58 effie: reimage mw1311.eqiad.wmnet
* 15:30 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:28 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:04 effie: reimage mw1333.eqiad.wmnet mw1332.eqiad.wmnet mw1331.eqiad.wmnet
* 14:53 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:51 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:29 effie: reimage mw1343.eqiad.wmnet mw1342.eqiad.wmnet mw1341.eqiad.wmnet
* 14:22 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:20 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:20 marostegui: Deploy schema change on s3 codfw on the master, lag will appear on s3 codfw ([[phab:T234066|T234066]])
* 13:57 Amir1: start of mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size 5 ([[phab:T237984|T237984]])
* 13:57 marostegui: Deploy schema change on s4 codfw master with replication - [[phab:T234066|T234066]]
* 13:52 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:50 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:37 marostegui: Deploy schema change on db1106 with replication (lag will appear on s1 on labs) - [[phab:T234066|T234066]] [[phab:T233135|T233135]]
* 13:37 marostegui: Recreate views for enwiki_p.protected_titles for all labsdb hosts - [[phab:T233135|T233135]]
* 13:35 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:33 phamhi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:33 phamhi@cumin1001: START - Cookbook sre.hosts.downtime
* 13:31 marostegui: Remove ar_comment triggers from db1124:3311 for enwiki.archive - [[phab:T234704|T234704]]
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 for schema change, temporarily pool db1080 as vslow,dump', diff saved to https://phabricator.wikimedia.org/P9778 and previous config saved to /var/cache/conftool/dbconfig/20191128-133013-marostegui.json
* 13:28 volans: cleanup root's crontab entries on netmon hosts from netbox/postgres stuff - [[phab:T238919|T238919]]
* 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1118 after schema change', diff saved to https://phabricator.wikimedia.org/P9777 and previous config saved to /var/cache/conftool/dbconfig/20191128-132647-marostegui.json
* 13:21 volans: cumin 'netmon*' 'rm -v /var/spool/cron/crontabs/postgres' [[phab:T238919|T238919]]
* 13:18 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:15 effie: enable puppet on thumbor*
* 13:15 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:11 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:08 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:51 effie: disable puppet on thumbor*
* 12:29 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:27 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:23 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:21 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:02 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:00 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:59 effie: reimage mw1267.eqiad.wmnet mw1277.eqiad.wmnet
* 11:52 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:50 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:36 effie: reimage mw1344.eqiad.wmnet mw1334.eqiad.wmnet mw1324.eqiad.wmnet
* 11:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:13 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:55 effie: reimage mw2279 mw2278 mw2277 mw2276 mw2275
* 10:53 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:51 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:45 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:43 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:39 marostegui: Compress labsdb1009
* 09:56 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:54 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:51 godog: swift eqiad-prod: more weight to ms-be105[7-9] - [[phab:T237438|T237438]]
* 09:42 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:40 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:39 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:37 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:19 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:17 effie: reimage mw1266, mw1276
* 09:17 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:56 marostegui: Compress labsdb1011
* 08:42 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:40 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:24 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:22 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 08:19 marostegui: Remove m4 from tendril and zarcillo - [[phab:T159170|T159170]]
* 08:15 effie: reimage mw2280, mw2281, mw2282
* 08:06 marostegui: Compress labsdb1012
* 07:56 effie: reimage mw1345, mw1335, mw1325
* 06:56 elukey: remove log files on an-tool1007 to free root partition space
* 06:14 marostegui: Remove db1061 from tendril and zarcillo - [[phab:T238624|T238624]]
* 06:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:13 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:02 marostegui: Remove db2067 from tendril and zarcillo [[phab:T233185|T233185]]
* 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 for schema change', diff saved to https://phabricator.wikimedia.org/P9776 and previous config saved to /var/cache/conftool/dbconfig/20191128-055212-marostegui.json
* 05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1119 after schema change', diff saved to https://phabricator.wikimedia.org/P9775 and previous config saved to /var/cache/conftool/dbconfig/20191128-055025-marostegui.json
* 03:03 vgutierrez: restarting keyholder on acmechief[12]001
* 01:41 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 01:38 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 01:37 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 01:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 01:29 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 01:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 01:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 01:19 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 00:59 mutante: mw2244 restart php-fpm and apache which somehow are returning 5xx after reimage
* 00:26 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:24 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 00:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 00:22 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
== 2019-11-27 ==
* 23:39 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:37 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:29 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:27 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 23:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:09 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:33 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:31 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:18 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:56 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:54 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 21:35 mutante: mw2215 scap pull
* 21:30 mutante: mw2215 rebooting
* 21:10 bblack: restarting acme-chief service on acmechief1001 (daemon appears to be stuck on a lock and nonfunctional for days...)
* 20:43 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:40 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:32 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:29 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 20:14 cstone: payments-wiki revision changed from {{Gerrit|2eb54fd6ef}} to {{Gerrit|06a8c3cdff}}
* 19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 for schema change', diff saved to https://phabricator.wikimedia.org/P9773 and previous config saved to /var/cache/conftool/dbconfig/20191127-193528-marostegui.json
* 19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1080 after schema change', diff saved to https://phabricator.wikimedia.org/P9772 and previous config saved to /var/cache/conftool/dbconfig/20191127-193227-marostegui.json
* 19:32 ebernhardson@deploy1001: Finished deploy [search/airflow@45b7790]: Allow airflow virtualenv to import system site packages to facilitate libmysqlclient (duration: 00m 45s)
* 19:31 ebernhardson@deploy1001: Started deploy [search/airflow@45b7790]: Allow airflow virtualenv to import system site packages to facilitate libmysqlclient
* 19:27 mutante: an-airflow1001 - apt-get install python3-mysqldb - start airflow-webserver
* 19:24 ebernhardson@deploy1001: Finished deploy [search/airflow@f3bad9d]: revert adding mysqlclient python package (duration: 00m 42s)
* 19:23 ebernhardson@deploy1001: Started deploy [search/airflow@f3bad9d]: revert adding mysqlclient python package
* 19:08 ebernhardson@deploy1001: Finished deploy [search/airflow@57f4caa]: Install mysqlclient to airflow instance (duration: 00m 40s)
* 19:08 ebernhardson@deploy1001: Started deploy [search/airflow@57f4caa]: Install mysqlclient to airflow instance
* 19:00 mutante: an-airflow1001: cd /etc/ ; chown airflow airflow; systemctl start airflow-webserver to let airflow write unittests.cfg (it tries to write this on first start and did not have permissions to do so) [[phab:T236180|T236180]]
* 18:58 mutante: an-airflow1001: cd /etc/ ; chown airflow airflow; systemctl start airflow-webserver to let airflow write unittests.cfg
* 18:57 eileen: process-control config revision is {{Gerrit|b95355c0c0}} - repair omnirecipient job off
* 16:57 andrewbogott: disabling puppet on cloudvirt* and cloudcontrol* while merging https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/552894/
* 16:50 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=eventgate-logging-external
* 16:32 cdanis@deploy1001: Synchronized wmf-config/PoolCounterSettings.php: {{Gerrit|dd4c76d3d}} SpecialContributions: max concurrency 3 (instead of 10) [[phab:T234450|T234450]] (duration: 01m 17s)
* 16:22 ejegg: shifted daily silverpop export start time one hour earlier
* 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080 for schema change', diff saved to https://phabricator.wikimedia.org/P9768 and previous config saved to /var/cache/conftool/dbconfig/20191127-161525-marostegui.json
* 16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1089 after schema change', diff saved to https://phabricator.wikimedia.org/P9767 and previous config saved to /var/cache/conftool/dbconfig/20191127-161450-marostegui.json
* 16:06 ema: cp3050: set proxy.config.http.server_session_sharing.match to "ip" [[phab:T238494|T238494]]
* 15:57 _joe_: restarting pybal on lvs1015
* 15:56 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:56 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:55 _joe_: restarting pybal on lvs1016
* 15:52 jynus: disabling puppet on dbprov1001 to test bacula restore [[phab:T238048|T238048]]
* 15:47 papaul: testing power redundancy on scs-a1-codfw
* 15:47 _joe_: restarting pybal on lvs2003
* 15:44 _joe_: restarting pybal again on lvs2006
* 15:42 jynus: migrate db entries of archive Media to backup1001 [[phab:T238048|T238048]]
* 15:37 marostegui: Logging retroactively for the record: drop user 'nova'@'%' from m5 - [[phab:T239170|T239170]]
* 15:30 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:30 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:29 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:29 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:29 marostegui: Add grants for dump (10.192.0.114,10.192.16.96) for nova_cell0_eqiad database on db1117:3325 and db2078:3325 - [[phab:T239170|T239170]]
* 15:27 marostegui: Add grants for dump (10.64.0.95,10.64.16.31) for nova_cell0_eqiad database on db1117:3325 and db2078:3325 - [[phab:T239170|T239170]]
* 15:25 _joe_: restarting lvs2006 for addition of eventgate-logging-external,blubberoid-https
* 15:24 moritzm: installing freetype bugfix updates from Buster 10.2 point release
* 15:21 oblivian@cumin1001: conftool action : set/weight=10:pooled=yes; selector: service=eventgate-logging-external
* 15:14 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:14 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:11 moritzm: downgrading trapperkeeper-webserver-jetty9-clojure packages on puppetdb hosts to the version shipped in Buster 10.2
* 15:06 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 15:06 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 15:04 ema: cp-ats: rolling ats-<nowiki>{</nowiki>tls,backend<nowiki>}</nowiki> restart to enable lua reload [[phab:T233274|T233274]]
* 15:02 moritzm: remove trapperkeeper-webserver-jetty9-clojure debs from apt.wikimedia.org/buster-wikimedia (these were needed to unbreak TLS on Puppetdb in Buster, but an update landed in Buster 10.2, which replaces our custom hotfix)
* 14:56 marostegui: Add new grants for nova_cell0 database on m5 - [[phab:T239170|T239170]]
* 14:50 marostegui: Create nova_cell0 database on m5 master - [[phab:T239170|T239170]]
* 14:43 effie: reimage mw1346, mw1336, mw1326
* 14:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:33 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 14:15 effie: reimage mw2285, mw2284, mw2283
* 14:14 effie: reimage mw2285, mw2286, mw2283
* 14:01 moritzm: temporarily stop cas on idp1001 for some failover tests
* 14:00 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:00 jmm@cumin1001: START - Cookbook sre.hosts.downtime
* 13:57 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set all of testwikidatawiki to read from the new term store for items ([[phab:T225057|T225057]]) (duration: 00m 56s)
* 13:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:44 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:42 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:42 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:42 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:42 ema: cp1075: repool with tslua reloads enabled [[phab:T233274|T233274]]
* 13:42 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:42 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:41 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:40 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:39 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:39 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:38 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 13:28 ema: cp1075: ats-<nowiki>{</nowiki>tls,backend<nowiki>}</nowiki> restarted to apply tslua reload changes [[phab:T233274|T233274]]
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for schema change', diff saved to https://phabricator.wikimedia.org/P9766 and previous config saved to /var/cache/conftool/dbconfig/20191127-132359-marostegui.json
* 13:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P9765 and previous config saved to /var/cache/conftool/dbconfig/20191127-132220-marostegui.json
* 13:21 effie: reimage mw2288, mw2287, mw2286
* 13:13 effie: reimage mw1348, mw1338, mw1328
* 12:51 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,service=nginx,cluster=api_appserver,name=mw2289.codfw.wmnet
* 12:51 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,service=nginx,cluster=api_appserver,name=mw2289.codfw.wmnet
* 12:50 jiji@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=nginx,cluster=api_appserver,name=mw2289.codfw.wmnet
* 12:50 jiji@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=apache2,cluster=api_appserver,name=mw2289.codfw.wmnet
* 12:26 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:26 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1327.eqiad.wmnet,service=nginx
* 12:26 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1347.eqiad.wmnet,service=nginx
* 12:26 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1337.eqiad.wmnet,service=nginx
* 12:26 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1327.eqiad.wmnet,service=apache2
* 12:26 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1347.eqiad.wmnet,service=apache2
* 12:26 jiji@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1337.eqiad.wmnet,service=apache2
* 12:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:23 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 12:22 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 12:18 apergos: reimaged dumpsdata1001 to buster and forgot to use the dang script but it is all ok anyhow :-P
* 11:47 Amir1: deployed security patch for [[phab:T237667|T237667]]
* 11:28 jiji@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1327.eqiad.wmnet,service=nginx
* 11:28 jiji@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1347.eqiad.wmnet,service=nginx
* 11:28 jiji@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1327.eqiad.wmnet,service=apache2
* 11:28 jiji@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1347.eqiad.wmnet,service=apache2
* 11:28 jiji@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1337.eqiad.wmnet,service=nginx
* 11:27 jiji@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1337.eqiad.wmnet,service=apache2
* 11:21 effie: reimage mw2289.codfw.wmnet
* 11:12 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:10 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:09 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:07 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:06 ema: cp1075: depool to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/552955/ and test tslua reloads [[phab:T233274|T233274]]
* 11:06 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 11:04 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 10:43 effie: reimage mw1347,mw1337,mw1327 - [[phab:T239054|T239054]]
* 10:32 ariel@deploy1001: Finished deploy [dumps/dumps@e0b0e76]: skip comment lines in dblist files (duration: 00m 03s)
* 10:32 ariel@deploy1001: Started deploy [dumps/dumps@e0b0e76]: skip comment lines in dblist files
* 09:41 moritzm: installing symfony security updates
* 09:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:33 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:33 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:30 jiji@cumin1001: START - Cookbook sre.hosts.downtime
* 09:29 moritzm: installing php-imagick security updates
* 09:25 ema: cp3050: re-enable request coalescing after performance experiment [[phab:T238494|T238494]]
* 09:02 effie: reimage mw1317.eqiad.wmnet - [[phab:T239054|T239054]]
* 09:01 marostegui: Stop replication on db1124:3318 to reimport wikidatawiki.page table on labsdb1010 - [[phab:T238399|T238399]]
* 08:24 godog: silence codfw varnish traffic drop until dec 9th - [[phab:T239039|T239039]]
* 08:09 godog: swift eqiad-prod: more weight to ms-be105[7-9] - [[phab:T237438|T237438]]
* 07:58 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 07:53 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 07:51 oblivian@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 07:49 elukey: roll restart of eventstreams on scb2* - [[phab:T239220|T239220]]
* 07:41 oblivian@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .
* 07:15 vgutierrez: repooling cp3063 - [[phab:T239310|T239310]]
* 07:04 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3063.esams.wmnet
* 07:04 vgutierrez: depool & powercycle cp3063 - [[phab:T239310|T239310]]
* 07:03 marostegui: Compress tables on db1102:3314
* 06:52 marostegui: Remove db2062 from tendril and zarcillo - [[phab:T238726|T238726]]
* 06:50 marostegui: Stop MySQL on db2062 - [[phab:T238726|T238726]]
* 06:25 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 06:05 marostegui: Promote db2135 to codfw m5 master [[phab:T238183|T238183]]
* 06:02 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Add db2135 to the config [[phab:T238183|T238183]] (duration: 00m 59s)
* 06:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Add db2135 to the config [[phab:T238183|T238183]] (duration: 01m 11s)
* 05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2125 [[phab:T239042|T239042]]', diff saved to https://phabricator.wikimedia.org/P9759 and previous config saved to /var/cache/conftool/dbconfig/20191127-054809-marostegui.json
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311 for schema change', diff saved to https://phabricator.wikimedia.org/P9758 and previous config saved to /var/cache/conftool/dbconfig/20191127-054056-marostegui.json
* 01:58 krinkle@deploy1001: Synchronized vendor: {{Gerrit|4108ff4e2}} (3/3) (duration: 01m 00s)
* 01:56 krinkle@deploy1001: Synchronized wmf-config/: {{Gerrit|4108ff4e2}} (2/3) (duration: 00m 59s)
* 01:55 krinkle@deploy1001: Synchronized lib/: {{Gerrit|4108ff4e2}} (1/3) (duration: 01m 01s)
* 01:28 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 03s)
* 00:05 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Show UploadWizard CTA on testcommonswiki ([[phab:T234960|T234960]]) (duration: 01m 00s)
== 2019-11-26 ==
* 23:55 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WelcomeSurvey for 100% of new users on arwiki (duration: 01m 02s)
* 23:25 eileen: process-control config revision is {{Gerrit|ad80b0136c}}
* 20:33 jforrester@deploy1001: Synchronized dblists/: Update dblists, now autogenerated (no-op, just comment changes) [[phab:T223602|T223602]] (duration: 01m 01s)
* 20:25 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@c282e86]: Followup on [[phab:T230495|T230495]] (duration: 00m 59s)
* 20:24 ebernhardson@deploy1001: Finished deploy [search/airflow@c235ab5]: Rebuild environment for python 3.7.3 (duration: 00m 42s)
* 20:24 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@c282e86]: Followup on [[phab:T230495|T230495]]
* 20:24 ebernhardson@deploy1001: Started deploy [search/airflow@c235ab5]: Rebuild environment for python 3.7.3
* 20:06 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@2b713d6]: Partition CirrusSearchElasticaWrite jobs [[phab:T230495|T230495]] (duration: 01m 23s)
* 20:05 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@2b713d6]: Partition CirrusSearchElasticaWrite jobs [[phab:T230495|T230495]]
* 19:59 Pchelolo: create partitioned topics for cirrusSearchElasticaWrite on kafka-main [[phab:T239135|T239135]]
* 19:57 Urbanecm: Reset email of TheklanBot ([[phab:T239233|T239233]])
* 19:46 brennen@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.8
* 19:39 brennen@deploy1001: Finished scap: testwiki to php-1.35.0-wmf.8 and rebuild l10n cache (duration: 32m 52s)
* 19:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P9753 and previous config saved to /var/cache/conftool/dbconfig/20191126-192724-marostegui.json
* 19:22 shdubsh: restore codfw logstash to baseline - [[phab:T215904|T215904]]
* 19:09 shdubsh: stop logstash codfw, generate some consumer lag, and set batch size to 2000 - [[phab:T215904|T215904]]
* 19:07 ebernhardson@deploy1001: Finished deploy [search/airflow@6ab2cd1]: Align deploy groups in scap.cfg and checks.yaml (duration: 00m 29s)
* 19:07 ebernhardson@deploy1001: Started deploy [search/airflow@6ab2cd1]: Align deploy groups in scap.cfg and checks.yaml
* 19:06 brennen@deploy1001: Started scap: testwiki to php-1.35.0-wmf.8 and rebuild l10n cache
* 19:04 brennen@deploy1001: Pruned MediaWiki: 1.35.0-wmf.2 (duration: 07m 08s)
* 19:03 ebernhardson@deploy1001: Finished deploy [search/airflow@d9779a9]: redeploy current version (duration: 00m 05s)
* 19:03 ebernhardson@deploy1001: Started deploy [search/airflow@d9779a9]: redeploy current version
* 19:03 ebernhardson@deploy1001: Finished deploy [search/airflow@d9779a9]: redeploy current version (duration: 00m 02s)
* 19:03 ebernhardson@deploy1001: Started deploy [search/airflow@d9779a9]: redeploy current version
* 18:55 shdubsh: stop logstash codfw, generate some consumer lag - [[phab:T215904|T215904]]
* 18:44 shdubsh: temporarily update pipeline.batch.size to 1000 on logstash2004 - [[phab:T215904|T215904]]
* 18:33 shdubsh: stop logstash on logstash200[5-6] for metrics collection - [[phab:T215904|T215904]]
* 18:09 brennen: issues with branch.py branch cut; deleted stub wmf/1.35.0-wmf.8 branch and proceeding with standard process
* 17:56 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision: Show UploadWizard CTA in beta ([[phab:T234960|T234960]]) (duration: 00m 52s)
* 17:31 brennen: cutting branch for 1.35.0-wmf.8
* 17:26 paravoid: moving fiberring from cr3-esams:xe-0/0/2 to cr2-esams:xe-0/1/8
* 17:25 ppchelko@deploy1001: Finished deploy [restbase/deploy@0b74625]: Switch group 0 and 1 to Parsoid-PHP [[phab:T229015|T229015]] (duration: 15m 38s)
* 17:10 ppchelko@deploy1001: Started deploy [restbase/deploy@0b74625]: Switch group 0 and 1 to Parsoid-PHP [[phab:T229015|T229015]]
* 17:03 paravoid: above was for cr3-esams
* 17:03 paravoid: cr2-esams: disable interface xe-0/0/2 (transit)
* 16:36 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Drop Scribunto special-case for HHVM, never reached [[phab:T235142|T235142]] (duration: 00m 52s)
* 16:32 jforrester@deploy1001: Synchronized docroot/noc/createTxtFileSymlinks.sh: Drop HHVMRequestInit symlink creation (duration: 00m 52s)
* 16:31 James_F: No sane way to delete HHVMRequestInit.php with a simple sync-dir, so waiting for the full scap.
* 16:30 jforrester@deploy1001: Synchronized docroot/noc/conf/: Drop HHVMRequestInit symlink (duration: 00m 52s)
* 16:27 ssastry@deploy1001: Finished deploy [parsoid/deploy@ee63341]: Update Parsoid to {{Gerrit|7b9b424a}} (duration: 08m 37s)
* 16:19 ssastry@deploy1001: Started deploy [parsoid/deploy@ee63341]: Update Parsoid to {{Gerrit|7b9b424a}}
* 16:10 ssastry@deploy1001: Finished deploy [parsoid/deploy@ee63341]: Testing rollback fixes ([[phab:T238685|T238685]]) (duration: 01m 07s)
* 16:09 ssastry@deploy1001: Started deploy [parsoid/deploy@ee63341]: Testing rollback fixes ([[phab:T238685|T238685]])
* 16:01 ema: cp3050: temporarily disable request coalescing to assess performance impact [[phab:T238494|T238494]]
* 15:15 ema: cp3050: repool after failed test of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/552862/ (reverted) [[phab:T238494|T238494]]
* 14:55 bblack: ignore previous message, restarts not necessary
* 14:53 bblack: rolling through authdns daemon restarts (necessary to reconfigure ANY-address listener) on authdns1001, authdns2001, ganeti3003
* 14:44 oblivian@deploy1001: Synchronized wmf-config/CommonSettings.php: Raise memory limit on parsoid servers 2/2 (duration: 00m 52s)
* 14:42 oblivian@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Raise memory limit on parsoid servers 1/2 (duration: 00m 51s)
* 14:30 oblivian@deploy1001: Scap failed!: Call to mwscript eval.php stderr: not empty
* 14:05 ema: cp3050: depool to merge and test https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/552862/ [[phab:T238494|T238494]]
* 13:11 effie: enable puppet on mediawiki servers
* 13:03 effie: Remove tmpreaper package from all mediawiki servers - [[phab:T229792|T229792]]
* 12:38 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: [[gerrit:552498{{!}}Wikibase (beta-only): Update wmgWikibaseClientDataBridgeHrefRegExp (T238918)]] (duration: 00m 53s)
* 12:07 XioNoX: power down mr1-esams for replacement - [[phab:T238174|T238174]]
* 11:36 elukey: reboot stat1007
* 11:35 marostegui: Deploy schema change on db1139:3311
* 11:35 effie: enable puppet on mw canary servers, and restart apaches
* 10:50 hashar: Updated jenkins job operations-puppet-tests-stretch-docker to use latest Docker container
* 10:30 godog: swift eqiad-prod: add ms-be105[7-9] - [[phab:T237438|T237438]]
* 10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311 for schema change', diff saved to https://phabricator.wikimedia.org/P9749 and previous config saved to /var/cache/conftool/dbconfig/20191126-102442-marostegui.json
* 10:08 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:08 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 10:08 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 10:08 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 10:08 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 10:08 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 10:07 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 10:07 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 09:45 effie: Disable puppet on all mediawiki servers to test 489982
* 09:26 marostegui: Deploy schema change on s8 primary master (db1109) - [[phab:T234066|T234066]] [[phab:T233135|T233135]] [[phab:T237120|T237120]]
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1087 into s8 vslow,dump', diff saved to https://phabricator.wikimedia.org/P9748 and previous config saved to /var/cache/conftool/dbconfig/20191126-092409-marostegui.json
* 09:18 marostegui: Run maintain-views for wikidatawiki.protected_title view on labsdb hosts [[phab:T233135|T233135]]
* 07:53 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Parsoid: Switch Flow to Parsoid/PHP on mw.org -- [[phab:T229015|T229015]] (duration: 00m 52s)
* 07:43 mobrovac@deploy1001: Finished deploy [restbase/deploy@378f504]: Do not use duplicate filter definitions [[phab:T234266|T234266]] (duration: 14m 24s)
* 07:29 mobrovac@deploy1001: Started deploy [restbase/deploy@378f504]: Do not use duplicate filter definitions [[phab:T234266|T234266]]
* 07:28 mobrovac@deploy1001: Finished deploy [restbase/deploy@378f504] (dev-cluster): Do not use duplicate filter definitions (duration: 07m 36s)
* 07:21 mobrovac@deploy1001: Started deploy [restbase/deploy@378f504] (dev-cluster): Do not use duplicate filter definitions
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1061 from config - [[phab:T238624|T238624]]', diff saved to https://phabricator.wikimedia.org/P9745 and previous config saved to /var/cache/conftool/dbconfig/20191126-071746-marostegui.json
* 07:09 marostegui: Stop MySQL on db1061 - [[phab:T238624|T238624]]
* 07:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db1061 from config [[phab:T238624|T238624]] (duration: 00m 52s)
* 07:03 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db1061 from config [[phab:T238624|T238624]] (duration: 00m 54s)
* 06:51 marostegui: Run compare.py for db2125 - [[phab:T239042|T239042]]
* 06:44 marostegui: Remove triggers for ar_comment on db1124:3318 [[phab:T234704|T234704]]
* 06:43 marostegui: Deploy schema change on db1087 with replication, lag will be generated on s8 for labsdb hosts
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1087 from vslow, and pool db1092 temporarily as vslow,dump for s8, for a schema change on db1087', diff saved to https://phabricator.wikimedia.org/P9744 and previous config saved to /var/cache/conftool/dbconfig/20191126-064200-marostegui.json
* 06:34 XioNoX: Rename cr2-knams to cr3-knams - [[phab:T237030|T237030]]
* 06:01 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1086 on s7 master and remove read-only from s7 [[phab:T238044|T238044]]', diff saved to https://phabricator.wikimedia.org/P9743 and previous config saved to /var/cache/conftool/dbconfig/20191126-060108-marostegui.json
* 06:00 marostegui@cumin2001: dbctl commit (dc=all): 'Set s7 as read-only for maintenance [[phab:T238044|T238044]]', diff saved to https://phabricator.wikimedia.org/P9742 and previous config saved to /var/cache/conftool/dbconfig/20191126-060023-marostegui.json
* 06:00 marostegui: Starting s7 failover from db1062 to db1086 - [[phab:T238044|T238044]]
* 05:49 marostegui: Deploy schema change on dbstore1003:3311
* 05:10 marostegui@cumin1001: dbctl commit (dc=all): 'Set weight 0 to db1086 as it will be the new s7 master - [[phab:T238044|T238044]]', diff saved to https://phabricator.wikimedia.org/P9741 and previous config saved to /var/cache/conftool/dbconfig/20191126-051034-marostegui.json
* 05:08 marostegui: Start pre-steps for s7 failover - [[phab:T238044|T238044]]
== 2019-11-25 ==
* 23:39 cstone: payments-wiki revision changed from {{Gerrit|e4d51fe247}} to {{Gerrit|2eb54fd6ef}}
* 23:14 Urbanecm: Evening SWAT done
* 23:12 urbanecm@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 14s)
* 23:10 urbanecm@deploy1001: update-interwiki-cache aborted: Update interwiki cache (duration: 00m 01s)
* 23:09 urbanecm@deploy1001: Synchronized dblists/: SWAT: {{Gerrit|aed2369}}: Add gewikimedia to special.dblist ([[phab:T239173|T239173]]) (duration: 00m 52s)
* 23:07 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: {{Gerrit|d71b0ab}}: kask-echoseen: Do not report dupes ([[phab:T237143|T237143]]) (duration: 00m 53s)
* 22:13 Jeff_Green: authdns update to deploy {{Gerrit|I21ddc1a3e}}
* 22:04 eileen: civicrm revision changed from {{Gerrit|852c4a36bd}} to {{Gerrit|5cf2d2713f}}, config revision is {{Gerrit|c4ad2f5990}}
* 20:37 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw1298.eqiad.wmnet
* 20:31 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1298.eqiad.wmnet
* 20:31 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw1298.eqiad.wmnet
* 20:22 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1298.eqiad.wmnet
* 20:07 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 20:05 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 20:04 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 19:48 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw1298.eqiad.wmnet
* 19:35 mutante: mw1298 - scap pull
* 19:32 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1298.eqiad.wmnet
* 19:30 ema@cumin1001: conftool action : set/pooled=yes; selector: name=cp4032.ulsfo.wmnet,service=nginx
* 19:14 bblack: cp[245]*: wipe daemon.log and syslog and restart syslog, again
* 19:13 cdanis: restarted grafana-server on grafana1002 [[phab:T220838|T220838]]
* 19:11 cdanis: copied snapshot of database from grafana1001 to grafana1002 [[phab:T220838|T220838]]
* 19:07 cdanis: stopping grafana-next.wikimedia.org (on grafana1002)
* 19:06 cdanis: making grafana.wikimedia.org read-only (on grafana1001): sudo chmod -w /var/lib/grafana/grafana.db
* 18:56 Lucas_WMDE: Morning SWAT done
* 18:55 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/TemplateData/: SWAT: [[gerrit:552871{{!}}Implement ParsoidFetchTemplateData hook for Parsoid/PHP (T238954)]] (duration: 00m 53s)
* 18:54 bblack: cp[245]*: wipe daemon.log and syslog and restart syslog, again
* 18:54 ema: cumin -b1 'A:cp-ats and A:esams' 'run-puppet-agent; ats-backend-restart & ats-tls-restart'
* 18:53 ema: cumin -b1 'A:cp-ats and A:eqsin' 'run-puppet-agent; ats-backend-restart & ats-tls-restart'
* 18:53 ema: cumin -b1 'A:cp-ats and A:ulsfo' 'run-puppet-agent; ats-backend-restart & ats-tls-restart'
* 18:52 ema: cumin -b1 'A:cp-ats and A:codfw' 'run-puppet-agent; ats-backend-restart & ats-tls-restart'
* 18:51 ema: cumin -b1 'A:cp-ats and A:eqiad' 'run-puppet-agent; ats-backend-restart & ats-tls-restart'
* 18:50 bblack: cp[245]*: wipe daemon.log and restart syslog, again
* 18:48 mutante: mw1298 - pooling
* 18:26 bblack: cp[245]*: disk space exhausted, rm /var/log/daemon.log + restart rsyslog
* 18:17 bblack: cp4028: disk space exhausted, rm /var/log/daemon.log + restart rsyslog
* 18:16 effie: Restart php-fpm on mw* and wtp* servers in eqiad and codfw - [[phab:T236963|T236963]]
* 18:07 effie: Upgrade php-wikidiff2 to 1.10.0 to all servers - [[phab:T236963|T236963]]
* 17:55 gehel: restart wdqs-updater on all wdqs servers
* 17:55 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@4c5f503]: Revert New Blazegraph Build and WDQS Updates (duration: 10m 24s)
* 17:50 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Parsoid: Switch private wiki clients (Flow, VE) to Parsoid/PHP -- [[phab:T229015|T229015]] (duration: 00m 53s)
* 17:45 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@4c5f503]: Revert New Blazegraph Build and WDQS Updates
* 17:36 marostegui: Upgrade kernel on db2125 [[phab:T239042|T239042]]
* 17:25 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@4c5f503]: New Blazegraph Build and WDQS Updates (duration: 12m 23s)
* 17:19 XioNoX: power down cr2-knams - [[phab:T237030|T237030]]
* 17:14 arlolra@deploy1001: Finished deploy [parsoid/deploy@e7faa19]: Updating Parsoid to {{Gerrit|a6bfdfa}} (duration: 08m 58s)
* 17:12 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@4c5f503]: New Blazegraph Build and WDQS Updates
* 17:05 arlolra@deploy1001: Started deploy [parsoid/deploy@e7faa19]: Updating Parsoid to {{Gerrit|a6bfdfa}}
* 16:48 jynus: upgrading and restarting dbprov* hosts
* 15:49 ema: pool cp3064 with varnish-be [[phab:T227432|T227432]]
* 15:36 ema: cp3064 create filesystem on /dev/nvme0n1p1 (see https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/552547/) and reboot [[phab:T238494|T238494]]
* 15:22 ema: cp3064 manual reboot after wmf-auto-reimage error: 'Unable to run wmf-auto-reimage-host: Failed to reboot_host' [[phab:T238494|T238494]]
* 15:20 ema: cp-ats: rolling ats-<nowiki>{</nowiki>tls,backend<nowiki>}</nowiki> restart to enable lua reload [[phab:T233274|T233274]]
* 15:18 gehel@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:14 gehel@cumin1001: START - Cookbook sre.hosts.downtime
* 15:13 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 15:11 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 15:11 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:11 ema: cp1075: ats-tls-restart to enable lua reload [[phab:T233274|T233274]]
* 15:10 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 15:09 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 15:03 ema: cp1075: ats-backend-restart to enable lua reload [[phab:T233274|T233274]]
* 15:02 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=cp3056.esams.wmnet
* 15:00 bblack@cumin1001: conftool action : set/weight=100; selector: name=cp3056.esams.wmnet,service=ats-be
* 14:50 elukey@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0)
* 14:50 XioNoX: enable cr3-esams:et-1/0/0 - [[phab:T236767|T236767]]
* 14:45 ema: depool cp3064 and reimage with varnish-be [[phab:T227432|T227432]]
* 14:44 elukey@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper
* 14:38 marostegui: Remove triggers from archive table on s1 codfw sanitarium [[phab:T234704|T234704]]
* 14:37 marostegui: Deploy schema change on s1 codfw (this will generate lag on codfw) - [[phab:T234066|T234066]] [[phab:T233135|T233135]]
* 14:23 moritzm: upgrading OpenJDK 11 on an-conf*
* 14:04 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 13:27 elukey: set global read_only=1 on db1108's log database - [[phab:T159170|T159170]]
* 13:16 XioNoX: cleanup config on cr3-esams - [[phab:T237031|T237031]]
* 13:15 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 13:11 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 13:06 XioNoX: cleanup config on cr2-esams - [[phab:T237031|T237031]]
* 13:02 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 12:59 oblivian@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .
* 12:48 XioNoX: bundle esams-knams links on knams side - [[phab:T237031|T237031]]
* 12:42 XioNoX: bundle esams-knams links on esams side - [[phab:T237031|T237031]]
* 12:27 XioNoX: disable BGP to knams transits - [[phab:T237031|T237031]]
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'Increase main traffic weight for db1126', diff saved to https://phabricator.wikimedia.org/P9735 and previous config saved to /var/cache/conftool/dbconfig/20191125-114821-marostegui.json
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1126 after schema change', diff saved to https://phabricator.wikimedia.org/P9734 and previous config saved to /var/cache/conftool/dbconfig/20191125-114733-marostegui.json
* 11:40 effie: cumin -b 2 -s 10 restart php on API servers
* 11:31 effie: restart php-fpm on mw1314
* 11:16 Urbanecm: EU SWAT done
* 11:16 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/AbuseFilter/extension.json: SWAT: {{Gerrit|29a16bd}}: Restrict viewing Special:Log/AbuseFilter, and remove from recent changes ([[phab:T34959|T34959]]) (duration: 01m 04s)
* 11:11 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: SWAT: {{Gerrit|4670d1d}}: Add throttle rule for WMCL Editathon 2019-12-07 ([[phab:T238986|T238986]]) (duration: 00m 53s)
* 11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|9394f1f}}: Allow enwikiversity interface admins to remove their own interface administratorship ([[phab:T238967|T238967]]) (duration: 00m 57s)
* 09:45 moritzm: installing cron updates from buster point release
* 09:32 moritzm: installing systemd security/bugfix updates on buster
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 - schema change', diff saved to https://phabricator.wikimedia.org/P9732 and previous config saved to /var/cache/conftool/dbconfig/20191125-093157-marostegui.json
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1104 after schema change', diff saved to https://phabricator.wikimedia.org/P9731 and previous config saved to /var/cache/conftool/dbconfig/20191125-093038-marostegui.json
* 09:30 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@db43901]: [[phab:T238822|T238822]] (duration: 13m 08s)
* 09:28 _joe_: building and publishing updated images for envoy
* 09:17 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@db43901]: [[phab:T238822|T238822]]
* 09:13 moritzm: installing python2.7 updates on buster
* 08:53 _joe_: rebuilding base docker images docker-registry.wikimedia.org/wikimedia-<nowiki>{</nowiki>jessie,stretch,buster<nowiki>}</nowiki>
* 08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 07:22 marostegui: Compress db2090
* 07:04 marostegui: Upgrade db2134
* 06:24 marostegui: Compress db2080
* 06:23 marostegui: Compress db2082
* 06:22 marostegui: Compress db2094:3318
* 06:18 marostegui: racadm serveraction hardreset on db2125 [[phab:T239042|T239042]]
* 06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104 - schema change', diff saved to https://phabricator.wikimedia.org/P9730 and previous config saved to /var/cache/conftool/dbconfig/20191125-061629-marostegui.json
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P9729 and previous config saved to /var/cache/conftool/dbconfig/20191125-061542-marostegui.json
* 06:07 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P9728 and previous config saved to /var/cache/conftool/dbconfig/20191125-060728-marostegui.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P9727 and previous config saved to /var/cache/conftool/dbconfig/20191125-060011-marostegui.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2125 - crashed [[phab:T239042|T239042]]', diff saved to https://phabricator.wikimedia.org/P9726 and previous config saved to /var/cache/conftool/dbconfig/20191125-055813-marostegui.json
* 05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1101:3318', diff saved to https://phabricator.wikimedia.org/P9725 and previous config saved to /var/cache/conftool/dbconfig/20191125-055305-marostegui.json
* 03:13 vgutierrez: repooling cp3053 - [[phab:T239041|T239041]]
* 03:00 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3053.esams.wmnet
* 02:59 vgutierrez: depooling & power-cycling cp3053 - [[phab:T239041|T239041]]
* 00:10 eileen: also speed the repair; process-control config revision is {{Gerrit|c4ad2f5990}}
== 2019-11-24 ==
* 20:54 eileen: process-control config revision is {{Gerrit|371782a667}}
* 15:41 ariel@deploy1001: Finished deploy [dumps/dumps@bfdea34]: can skip locks for misc dumps (duration: 00m 03s)
* 15:41 ariel@deploy1001: Started deploy [dumps/dumps@bfdea34]: can skip locks for misc dumps
* 15:01 apergos: rebooting dumpsdata1002 to clear up the other half of the nfs issues
* 14:24 apergos: rebooting snapshot1008 to clear up some nfs + kernel issues
== 2019-11-23 ==
* 18:19 gehel: repool wdqs1007, caught up on lag - [[phab:T238229|T238229]]
* 14:23 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 55s)
* 11:56 _joe_: oblivian@cumin1001:~$ sudo cumin -b2 -s60 A:mw-eqiad 'restart-php7.2-fpm'
* 11:47 _joe_: restarting php7.2-fpm on mw1329
* 09:49 XioNoX: downtime all ripe-atlas checks until Monday (most likely an upstream issue/maintenance)
== 2019-11-22 ==
* 21:55 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T238955|T238955]] (duration: 00m 53s)
* 18:02 shdubsh: restore prometheus services default settings - [[phab:T238807|T238807]]
* 17:52 _joe_: repooling restbase2018
* 17:36 bblack@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 17:34 bblack@cumin1001: START - Cookbook sre.hosts.downtime
* 17:30 shdubsh: clean tombstones on prometheus1004 - [[phab:T238807|T238807]]
* 17:09 shdubsh: restart prometheus on prometheus1004 - [[phab:T238807|T238807]]
* 16:22 shdubsh: clean tombstones on prometheus1003 - [[phab:T238807|T238807]]
* 15:40 XioNoX: renumber AS17639 sessions in eqsin
* 15:16 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/repo/: Stop outputting anything in case of 304 responses in Special:EntityData ([[phab:T238901|T238901]]) (duration: 00m 57s)
* 14:49 _joe_: disabling puppet on restbase2018, testing envoy upgrade [[phab:T238050|T238050]]
* 14:48 _joe_: uploaded envoyproxy 1.12.1 to <nowiki>{</nowiki>buster,stretch<nowiki>}</nowiki> [[phab:T237235|T237235]]
* 13:11 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https ([[phab:T238119|T238119]] [[phab:T238524|T238524]] [[phab:T237375|T237375]] [[phab:T238120|T238120]])
* 13:06 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase/lib/includes/Store/Sql/SqlEntityInfoBuilder.php: [[phab:T238473|T238473]] (duration: 00m 52s)
* 12:34 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 60 RESYNC (duration: 00m 51s)
* 12:32 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 60 (duration: 00m 53s)
* 11:59 effie: reload php7 on canaries
* 11:34 effie: Roll out wikidiff2 1.10.0-1 to canaries - [[phab:T236963|T236963]]
* 11:29 effie: upload wikidiff2 1.10.0-1 - [[phab:T236963|T236963]]
* 09:59 ladsgroup@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 10s)
* 09:56 ladsgroup@deploy1001: Synchronized langlist: [[phab:T238105|T238105]] (duration: 00m 51s)
* 09:47 ladsgroup@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 20s)
* 09:44 ladsgroup@deploy1001: Synchronized langlist: [[phab:T238104|T238104]] (duration: 00m 52s)
* 09:28 ema: pool cp1081 with ATS backend [[phab:T227432|T227432]]
* 09:27 gehel: depool wdqs1007 to allow to catch up on lag - [[phab:T238229|T238229]]
* 09:23 reedy@deploy1001: Synchronized php-1.35.0-wmf.5/includes/specials/pagers/ContribsPager.php: Remove live hack of limit for [[phab:T234450|T234450]] (duration: 00m 54s)
* 09:19 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: [[phab:T234450|T234450]] (duration: 00m 55s)
* 09:07 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:05 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 09:04 gehel: remove blazegraph 2.1.5-wmf.11 from archiva, broken upload
* 08:54 gehel: restarting blazegraph and updater on wdqs1007
* 08:49 ema: depool cp1081 and reimage as text_ats [[phab:T227432|T227432]]
* 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Rebalance weights on s7 in preparation for s7 failover on Tuesday [[phab:T238044|T238044]]', diff saved to https://phabricator.wikimedia.org/P9722 and previous config saved to /var/cache/conftool/dbconfig/20191122-063145-marostegui.json
* 03:49 shdubsh: restart prometheus@ops on prometheus1003 [[phab:T238807|T238807]]
* 00:46 mutante: xhgui1001/xhgui2001 - rsyncing /srv/mongod from tungsten to /srv/tungsten/mongod/ on both new machines ([[phab:T158837|T158837]])
* 00:37 mutante: tungsten - starting ferm service
* 00:20 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Move newcomer tasks JSON config from mw.org to local wikis ([[phab:T237301|T237301]]) (duration: 00m 52s)
* 00:18 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/GrowthExperiments/: Make non-remote titles work in RemotePageConfigurationLoader ([[phab:T237301|T237301]]) (duration: 00m 54s)
== 2019-11-21 ==
* 23:09 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove unused CirrusSearch config variable (duration: 00m 52s)
* 22:11 Urbanecm: mwscript importImages.php --wiki=commonswiki --overwrite --user=Bürgerentscheid . ([[phab:T238764|T238764]])
* 21:42 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/UploadWizard: Revert "Add Machine Vision CTA to final step ([[phab:T234960|T234960]])", take 2 (duration: 00m 41s)
* 21:36 mholloway-shell@deploy1001: Scap failed!: 5/11 canaries failed their endpoint checks(http://en.wikipedia.org)
* 21:34 mholloway-shell@deploy1001: Scap failed!: 4/11 canaries failed their endpoint checks(http://en.wikipedia.org)
* 21:29 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/UploadWizard: Add Machine Vision CTA to final step ([[phab:T234960|T234960]]) (duration: 00m 59s)
* 21:16 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@70154b4]: Update mobileapps to {{Gerrit|c140e88}} (duration: 06m 29s)
* 21:09 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@70154b4]: Update mobileapps to {{Gerrit|c140e88}}
* 20:51 mutante: puppetmaster1001 - revoking puppet certs for xhgui1001/xhgui2001
* 20:49 mutante: ganeti1003 - switching boot order of xhgui1001 to network and reinstalling with stretch ([[phab:T238098|T238098]])
* 20:16 mforns@deploy1001: Finished deploy [analytics/refinery@97015e4]: add new projects to webrequest whitelist (duration: 08m 29s)
* 20:14 mutante: icinga1001 - systemctl reset-failed
* 20:08 mforns@deploy1001: Started deploy [analytics/refinery@97015e4]: add new projects to webrequest whitelist
* 19:01 andrewbogott: upgrading designate to 'ocata' on cloudservices1003 and 1004
* 18:49 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:45 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:42 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:13 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch private wikis back to Parsoid/JS - [[phab:T229015|T229015]] (duration: 00m 52s)
* 18:03 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:02 mobrovac@deploy1001: Synchronized wmf-config/ProductionServices.php: Use HTTPS for contacting Parsoid/PHP - [[phab:T229015|T229015]] (duration: 00m 53s)
* 17:52 mobrovac@deploy1001: Synchronized wmf-config/CommonSettings.php: Switch private wikis to Parsoid/PHP; file 4/4 -- [[phab:T229015|T229015]] (duration: 00m 53s)
* 17:51 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch private wikis to Parsoid/PHP; file 3/4 -- [[phab:T229015|T229015]] (duration: 00m 51s)
* 17:50 mobrovac@deploy1001: Synchronized wmf-config/ProductionServices.php: Switch private wikis to Parsoid/PHP; file 2/4 -- [[phab:T229015|T229015]] (duration: 00m 53s)
* 17:48 mobrovac@deploy1001: Synchronized wmf-config/LabsServices.php: Switch private wikis to Parsoid/PHP; file 1/4 -- [[phab:T229015|T229015]] (duration: 00m 53s)
* 17:27 mobrovac@deploy1001: Finished deploy [restbase/deploy@b987068]: Switch mw.org to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 16m 43s)
* 17:10 mobrovac@deploy1001: Started deploy [restbase/deploy@b987068]: Switch mw.org to Parsoid/PHP - [[phab:T229015|T229015]]
* 17:09 mobrovac@deploy1001: Finished deploy [restbase/deploy@b987068] (dev-cluster): Switch mw.org to Parsoid/PHP (duration: 02m 38s)
* 17:06 mobrovac@deploy1001: Started deploy [restbase/deploy@b987068] (dev-cluster): Switch mw.org to Parsoid/PHP
* 16:54 otto@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 16:48 sbassett@deploy1001: Finished scap: Deploying [[phab:T238451|T238451]] (ext:AbuseFilter), running scap sync for i18n issues. (duration: 16m 42s)
* 16:31 sbassett@deploy1001: Started scap: Deploying [[phab:T238451|T238451]] (ext:AbuseFilter), running scap sync for i18n issues.
* 15:54 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 15:42 mforns@deploy1001: Finished deploy [analytics/refinery@7f32472]: deploying analytics refinery (after refinery-source v0.0.107) (duration: 10m 50s)
* 15:31 mforns@deploy1001: Started deploy [analytics/refinery@7f32472]: deploying analytics refinery (after refinery-source v0.0.107)
* 15:30 ema: pool cp1079 with ATS backend [[phab:T227432|T227432]]
* 15:22 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 15:19 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 15:13 akosiaris: purge https://releases.wikimedia.org/charts/eventgate-0.0.13.tgz, https://releases.wikimedia.org/charts/ and https://releases.wikimedia.org/charts/index.yaml
* 15:09 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:07 bblack: DONE testing deployment software changes on authdns cluster, back to normal
* 15:07 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 14:49 ema: depool cp1079 and reimage as text_ats [[phab:T227432|T227432]]
* 14:47 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@db43901]: Agent filter changes (duration: 18m 33s)
* 14:43 bblack: testing deployment software changes on authdns cluster, please hold dns changes for a few!
* 14:41 thcipriani: restarting Jenkins for update
* 14:28 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@db43901]: Agent filter changes
* 13:59 ema: pool cp1077 with ATS backend [[phab:T227432|T227432]]
* 13:41 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:39 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 13:20 ema: depool cp1077 and reimage as text_ats [[phab:T227432|T227432]]
* 11:53 reedy@deploy1001: Finished scap: [[phab:T234450|T234450]] (duration: 19m 20s)
* 11:42 effie: enable puppet on all mw hosts
* 11:33 reedy@deploy1001: Started scap: [[phab:T234450|T234450]]
* 11:09 Urbanecm: EU SWAT done
* 11:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|e4861ec}}: Set correct language for shywiktionary ([[phab:T238105|T238105]]) (duration: 00m 52s)
* 11:06 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|68d2003}}: Restrict editing CNBanner namespace to autoconfirmed on metawiki ([[phab:T238723|T238723]]) (duration: 00m 54s)
* 11:05 effie: disable puppet on mw[1-2]*
* 10:49 volans: restarting tcpircbot-logmsgbot on icinga1001, has failed to log some messages, no useful log on the host
* 10:22 ema: pool cp2023 with Varnish backend [[phab:T238817|T238817]] [[phab:T227432|T227432]]
* 10:18 arturo: update buster-wikimedia thirdparty/kubeadm-k8s packages (newer version will be used to handle [[phab:T238654|T238654]])
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1090:331<nowiki>{</nowiki>2,7<nowiki>}</nowiki> after upgrade', diff saved to https://phabricator.wikimedia.org/P9714 and previous config saved to /var/cache/conftool/dbconfig/20191121-095401-marostegui.json
* 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1090:331<nowiki>{</nowiki>2,7<nowiki>}</nowiki> after upgrade', diff saved to https://phabricator.wikimedia.org/P9713 and previous config saved to /var/cache/conftool/dbconfig/20191121-093958-marostegui.json
* 09:39 ema: depool cp2023 and reimage back as varnish-be [[phab:T238817|T238817]] [[phab:T227432|T227432]]
* 09:38 marostegui: Stop MySQL on db1067 - [[phab:T238297|T238297]]
* 09:27 marostegui: Upgrade db1090:3312, db1090:3317
* 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1090:3312, db1090:3317 for upgrade', diff saved to https://phabricator.wikimedia.org/P9712 and previous config saved to /var/cache/conftool/dbconfig/20191121-092554-marostegui.json
* 09:08 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9711 and previous config saved to /var/cache/conftool/dbconfig/20191121-090623-marostegui.json
* 09:03 akosiaris@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 08:58 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 08:56 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9710 and previous config saved to /var/cache/conftool/dbconfig/20191121-085644-marostegui.json
* 08:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9709 and previous config saved to /var/cache/conftool/dbconfig/20191121-084500-marostegui.json
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079 after upgrade', diff saved to https://phabricator.wikimedia.org/P9708 and previous config saved to /var/cache/conftool/dbconfig/20191121-083322-marostegui.json
* 08:21 marostegui: Upgrade db1079
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079 for upgrade', diff saved to https://phabricator.wikimedia.org/P9707 and previous config saved to /var/cache/conftool/dbconfig/20191121-082108-marostegui.json
* 07:57 akosiaris: upgrade OTRS to 5.0.39 [[phab:T225925|T225925]]
* 07:56 marostegui: Promote db2133 to codfw m2 master - [[phab:T238183|T238183]]
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9706 and previous config saved to /var/cache/conftool/dbconfig/20191121-072543-marostegui.json
* 07:18 marostegui: Upgrade db1125 (sanitarium)
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9705 and previous config saved to /var/cache/conftool/dbconfig/20191121-071758-marostegui.json
* 06:56 marostegui: Repool labsdb1009
* 06:32 marostegui: Sanitize shywiktionary gcrwiki szywiki minwiktionary gewikimedia on db1124:3313 [[phab:T238115|T238115]] [[phab:T238114|T238114]] [[phab:T237373|T237373]] [[phab:T238522|T238522]] [[phab:T236404|T236404]]
* 06:30 marostegui: Sanitize shywiktionary gcrwiki szywiki minwiktionary gewikimedia on db2094:3313 [[phab:T238115|T238115]] [[phab:T238114|T238114]] [[phab:T237373|T237373]] [[phab:T238522|T238522]] [[phab:T236404|T236404]]
* 06:24 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9704 and previous config saved to /var/cache/conftool/dbconfig/20191121-062412-marostegui.json
* 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1086 after upgrade', diff saved to https://phabricator.wikimedia.org/P9703 and previous config saved to /var/cache/conftool/dbconfig/20191121-061711-marostegui.json
* 06:16 marostegui: Compress db2081
* 06:13 marostegui: Stop MySQL on db1107 [[phab:T238113|T238113]]
* 06:06 marostegui: Compress db2083
* 05:57 marostegui: Depool labsdb1009 for upgrade
* 05:56 marostegui: Upgrade db1086
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1086 for upgrade', diff saved to https://phabricator.wikimedia.org/P9702 and previous config saved to /var/cache/conftool/dbconfig/20191121-055557-marostegui.json
* 05:53 marostegui: Compress db2073
* 00:41 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config does not seem to be applying on half the app servers, resyncing (duration: 00m 52s)
* 00:24 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: GrowthExperiments: Enable suggested edits without opt-in ([[phab:T227728|T227728]]) (duration: 00m 52s)
* 00:18 catrope@deploy1001: Finished scap: GrowthExperiments and MobileFrontend changes SWAT (includes i18n) (duration: 15m 57s)
* 00:02 catrope@deploy1001: Started scap: GrowthExperiments and MobileFrontend changes SWAT (includes i18n)
== 2019-11-20 ==
* 23:14 Amir1: finished creating five wikis, total duration 134 minutes
* 23:14 ladsgroup@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 24s)
* 23:11 ladsgroup@deploy1001: Synchronized langlist: [[phab:T238105|T238105]] (duration: 00m 50s)
* 23:10 ladsgroup@deploy1001: Synchronized static/images/project-logos/: [[phab:T238105|T238105]] (duration: 00m 52s)
* 23:09 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T238105|T238105]] (duration: 00m 51s)
* 23:08 ladsgroup@deploy1001: Synchronized multiversion/MWMultiVersion.php: [[phab:T238105|T238105]] (duration: 00m 51s)
* 23:05 ladsgroup@deploy1001: rebuilt and synchronized wikiversions files: [[phab:T238105|T238105]]
* 22:59 ladsgroup@deploy1001: Synchronized dblists: [[phab:T238105|T238105]] (duration: 00m 53s)
* 22:49 ladsgroup@deploy1001: Synchronized langlist: [[phab:T238104|T238104]] (duration: 00m 51s)
* 22:48 ladsgroup@deploy1001: Synchronized static/images/project-logos/: [[phab:T238104|T238104]] (duration: 00m 52s)
* 22:47 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T238104|T238104]] (duration: 00m 52s)
* 22:43 ladsgroup@deploy1001: Synchronized multiversion/MWMultiVersion.php: [[phab:T238104|T238104]] (duration: 00m 51s)
* 22:41 ladsgroup@deploy1001: rebuilt and synchronized wikiversions files: [[phab:T238104|T238104]]
* 22:36 ladsgroup@deploy1001: Synchronized dblists: [[phab:T238104|T238104]] (duration: 00m 52s)
* 22:22 ladsgroup@deploy1001: Synchronized langlist: [[phab:T237369|T237369]] (duration: 00m 53s)
* 22:21 ladsgroup@deploy1001: Synchronized static/images/project-logos/: [[phab:T237369|T237369]] (duration: 00m 52s)
* 22:19 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T237369|T237369]] (duration: 00m 51s)
* 22:17 ladsgroup@deploy1001: Synchronized multiversion/MWMultiVersion.php: [[phab:T237369|T237369]] (duration: 00m 51s)
* 22:15 ladsgroup@deploy1001: rebuilt and synchronized wikiversions files: [[phab:T237369|T237369]]
* 22:11 ladsgroup@deploy1001: Synchronized dblists: [[phab:T237369|T237369]] (duration: 00m 52s)
* 22:00 Urbanecm: Wiki creation continues
* 21:56 ladsgroup@deploy1001: Synchronized langlist: [[phab:T236861|T236861]] (duration: 00m 52s)
* 21:55 ladsgroup@deploy1001: Synchronized static/images/project-logos/: [[phab:T236861|T236861]] (duration: 00m 51s)
* 21:54 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T236861|T236861]] (duration: 00m 52s)
* 21:52 ladsgroup@deploy1001: Synchronized multiversion/MWMultiVersion.php: [[phab:T236861|T236861]] (duration: 00m 51s)
* 21:49 ladsgroup@deploy1001: rebuilt and synchronized wikiversions files: [[phab:T236861|T236861]]
* 21:44 ladsgroup@deploy1001: Synchronized dblists: [[phab:T236861|T236861]] (duration: 00m 52s)
* 21:38 Urbanecm: mwscript createAndPromote.php --wiki=gewikimedia --sysop --bureaucrat Mehman97 <password redacted> ([[phab:T236389|T236389]])
* 21:35 gehel: repool wdqs1004 - [[phab:T238229|T238229]]
* 21:30 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: new wiki gewikimedia ([[phab:T236389|T236389]]) (duration: 00m 52s)
* 21:29 urbanecm@deploy1001: Synchronized static/images/project-logos/: new wiki gewikimedia ([[phab:T236389|T236389]]) (duration: 00m 53s)
* 21:28 urbanecm@deploy1001: Synchronized multiversion/MWMultiVersion.php: new wiki gewikimedia ([[phab:T236389|T236389]]) (duration: 00m 52s)
* 21:27 ejegg: Fundraising CiviCRM updated from {{Gerrit|2802bdd649}} to {{Gerrit|852c4a36bd}}
* 21:23 mutante: notebook1003 - systemctl start nagios-nrpe-server (second time today already, [[phab:T212824|T212824]])
* 21:20 urbanecm@deploy1001: rebuilt and synchronized wikiversions files: new wiki gewikimedia ([[phab:T236389|T236389]])
* 21:16 urbanecm@deploy1001: Synchronized dblists: new wiki gewikimedia ([[phab:T236389|T236389]]) (duration: 00m 52s)
* 21:01 ssastry@deploy1001: Finished deploy [parsoid/deploy@7665624]: Dummy Parsoid deploy to test [[phab:T238748|T238748]] fix (duration: 07m 20s)
* 20:53 ssastry@deploy1001: Started deploy [parsoid/deploy@7665624]: Dummy Parsoid deploy to test [[phab:T238748|T238748]] fix
* 20:37 ssastry@deploy1001: Finished deploy [parsoid/deploy@d5646b7]: Updating Parsoid to {{Gerrit|2e79460d}} (duration: 09m 14s)
* 20:27 ssastry@deploy1001: Started deploy [parsoid/deploy@d5646b7]: Updating Parsoid to {{Gerrit|2e79460d}}
* 20:27 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 20:23 mutante: notebook1003 - sudo systemctl nagios-nrpe-server (as usual ....)
* 20:19 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 19:31 ejegg: updated fundraising internal dashboard from {{Gerrit|69fdbec60d}} to {{Gerrit|8fc2726736}}
* 19:04 mutante: xhgui1001 - initial puppet run, signed puppet cert on puppetmaster1001
* 18:56 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: RESYNC [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 120 (duration: 00m 50s)
* 18:51 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 120 (duration: 00m 54s)
* 18:42 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 170 (duration: 00m 53s)
* 18:31 mutante: ganeti - introducing and installing buster on new VMs xhgui1001/xhgui2001 - for replacing tungsten (jessie) [[phab:T238098|T238098]]
* 18:17 mobrovac: morning SWAT done
* 18:17 mobrovac@deploy1001: Synchronized php-1.35.0-wmf.5/includes/libs/virtualrest/ParsoidVirtualRESTService.php: Parsoid VRS: Add the Host header - [[phab:T229015|T229015]] [[phab:T229078|T229078]] [[phab:T229074|T229074]] (duration: 00m 52s)
* 18:13 shdubsh: restart mtail on fermium
* 17:40 ema: pool cp2023 with ATS backend [[phab:T227432|T227432]]
* 17:24 mobrovac@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'citoid' for release 'production' .
* 17:21 mobrovac@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'citoid' for release 'production' .
* 17:19 mobrovac@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'citoid' for release 'staging' .
* 17:18 andrewbogott: upgrading pdns to version 4 on cloudservices1003
* 17:06 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:04 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 17:03 andrewbogott: upgrading pdns to version 4 on cloudvirt1004 [[phab:T210715|T210715]]
* 16:58 andrewbogott: disabling puppet on cloudvirt1003 and 1004 for [[phab:T210715|T210715]]
* 16:55 moritzm: installing rpcbind bugfix updates from buster 10.2 point release
* 16:43 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:40 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 16:23 ema: depool cp2023 and reimage as text_ats [[phab:T227432|T227432]]
* 16:14 ema: pool cp2019 with ATS backend [[phab:T227432|T227432]]
* 16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3314 after compression', diff saved to https://phabricator.wikimedia.org/P9695 and previous config saved to /var/cache/conftool/dbconfig/20191120-160813-marostegui.json
* 16:03 gehel: depool wdqs1004 to allow catching up on lag - [[phab:T238229|T238229]]
* 15:42 mobrovac@deploy1001: Synchronized wmf-config/LabsServices.php: [BETA-ONLY] Switch Flow to use Parsoid/PHP - [[phab:T229078|T229078]] (duration: 00m 52s)
* 15:40 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:38 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 15:36 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: RESYNC [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 180 [[gerrit:552069]] (duration: 00m 52s)
* 15:19 ema: depool cp2019 and reimage as text_ats [[phab:T227432|T227432]]
* 15:08 gehel: reset LVS weight for wdqs public eqiad to 10
* 15:05 effie: Enable puppet on mw*
* 14:52 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[phab:T221774|T221774]] - wgWikidataOrgQueryServiceMaxLagFactor 180 [[gerrit:552069]] (duration: 00m 52s)
* 14:50 addshore@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikidata.org: [[phab:T221774|T221774]] - Wikidata.org extension (use altered lag, not raw lag) [[gerrit:552072]] (duration: 00m 53s)
* 14:49 ema: pool cp2016 with ATS backend [[phab:T227432|T227432]]
* 14:47 effie: disable puppet on all mw* servers
* 14:27 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:24 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 14:06 ema: depool cp2016 and reimage as text_ats [[phab:T227432|T227432]]
* 13:32 godog: updated puppet compiler facts on compiler100* hosts
* 12:43 ema: pool cp2013 with ATS backend [[phab:T227432|T227432]]
* 12:27 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 12:25 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 12:08 ema: depool cp2013 and reimage as text_ats [[phab:T227432|T227432]]
* 11:59 ema: pool cp2012 with ATS backend [[phab:T227432|T227432]]
* 11:55 Urbanecm: EU SWAT done
* 11:55 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|2b13fbe}}: [rowiki] Enable deleterevision for patrollers ([[phab:T234051|T234051]]) (duration: 00m 52s)
* 11:46 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|51ecd71}}: Partial cleanup of InitializeSettings ([[phab:T231178|T231178]]) (duration: 00m 52s)
* 11:42 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:40 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 11:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|f847380}}: Set namespace alias for Index: (NS 102/103) for elwikisource ([[phab:T237253|T237253]]) (duration: 00m 54s)
* 11:36 urbanecm@deploy1001: Finished scap: SWAT: {{Gerrit|44ec4e4}}: {{Gerrit|e1baf0e}}:  {{Gerrit|3c02aa7}}: Namespace changes (duration: 06m 15s)
* 11:30 urbanecm@deploy1001: Started scap: SWAT: {{Gerrit|44ec4e4}}: {{Gerrit|e1baf0e}}:  {{Gerrit|3c02aa7}}: Namespace changes
* 11:27 ema: cp2010: ats-backend-restart to clear backend restart alert
* 11:21 ema: depool cp2012 and reimage as text_ats [[phab:T227432|T227432]]
* 11:15 ema: pool cp2010 with ATS backend [[phab:T227432|T227432]]
* 10:54 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:52 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 10:36 mobrovac@deploy1001: Finished deploy [restbase/deploy@daa7808]: Revert switching test2.wp to Parsoid/JS - [[phab:T238716|T238716]] (duration: 13m 56s)
* 10:34 ema: depool cp2010 and reimage as text_ats [[phab:T227432|T227432]]
* 10:30 marostegui: Upgrade db1116
* 10:22 mobrovac@deploy1001: Started deploy [restbase/deploy@daa7808]: Revert switching test2.wp to Parsoid/JS - [[phab:T238716|T238716]]
* 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1094', diff saved to https://phabricator.wikimedia.org/P9694 and previous config saved to /var/cache/conftool/dbconfig/20191120-101727-marostegui.json
* 10:14 marostegui: Compress db2095:3314
* 10:07 mobrovac@deploy1001: Finished deploy [restbase/deploy@c677063]: Switch test2.wp back to Parsoid/JS temporarily - [[phab:T238716|T238716]] (duration: 14m 54s)
* 09:56 marostegui: Compress db2106
* 09:52 mobrovac@deploy1001: Started deploy [restbase/deploy@c677063]: Switch test2.wp back to Parsoid/JS temporarily - [[phab:T238716|T238716]]
* 09:48 marostegui: Compress dbstore1005:3318
* 09:47 marostegui: Compress dbstore1004:3314
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1094 after upgrade', diff saved to https://phabricator.wikimedia.org/P9693 and previous config saved to /var/cache/conftool/dbconfig/20191120-093308-marostegui.json
* 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1094 after upgrade', diff saved to https://phabricator.wikimedia.org/P9692 and previous config saved to /var/cache/conftool/dbconfig/20191120-092337-marostegui.json
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1094 after upgrade', diff saved to https://phabricator.wikimedia.org/P9691 and previous config saved to /var/cache/conftool/dbconfig/20191120-090739-marostegui.json
* 08:55 marostegui: Upgrade db1094
* 08:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1094 for upgrade', diff saved to https://phabricator.wikimedia.org/P9690 and previous config saved to /var/cache/conftool/dbconfig/20191120-085448-marostegui.json
* 08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 08:01 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 07:43 marostegui: Promote db2132 as m1-codfw master - [[phab:T238183|T238183]]
* 07:19 marostegui: Upgrade db2062
* 07:19 marostegui: Upgrade db2078
* 07:14 marostegui: Deploy schema change on s3 (testwikidatawiki) directly on s3 primary master [[phab:T237120|T237120]]
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1136', diff saved to https://phabricator.wikimedia.org/P9688 and previous config saved to /var/cache/conftool/dbconfig/20191120-070511-marostegui.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Give more traffic to db1136', diff saved to https://phabricator.wikimedia.org/P9687 and previous config saved to /var/cache/conftool/dbconfig/20191120-065718-marostegui.json
* 06:44 marostegui: Upgrade db2118 (s7 codfw master)
* 06:41 marostegui: Repool labsdb1011
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1136 into s7 api', diff saved to https://phabricator.wikimedia.org/P9686 and previous config saved to /var/cache/conftool/dbconfig/20191120-064022-marostegui.json
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1136 after upgrade', diff saved to https://phabricator.wikimedia.org/P9685 and previous config saved to /var/cache/conftool/dbconfig/20191120-063628-marostegui.json
* 06:28 marostegui: Upgrade db1136
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1136 for upgrade', diff saved to https://phabricator.wikimedia.org/P9684 and previous config saved to /var/cache/conftool/dbconfig/20191120-062749-marostegui.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3317 after upgrade', diff saved to https://phabricator.wikimedia.org/P9683 and previous config saved to /var/cache/conftool/dbconfig/20191120-062029-marostegui.json
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3314 for compression', diff saved to https://phabricator.wikimedia.org/P9682 and previous config saved to /var/cache/conftool/dbconfig/20191120-061938-marostegui.json
* 05:58 marostegui: Stop MySQL on db1101:3317, db1101:3318 for upgrade and schema change
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3317 and db1101:3318 for upgrade and schema change', diff saved to https://phabricator.wikimedia.org/P9681 and previous config saved to /var/cache/conftool/dbconfig/20191120-055732-marostegui.json
* 05:55 marostegui: Depool labsdb1011 for upgrade
* 05:54 marostegui@cumin2001: dbctl commit (dc=all): 'Repool db1105:3311 db1097:3314 db1098:3316 db1098:3317 after compression', diff saved to https://phabricator.wikimedia.org/P9680 and previous config saved to /var/cache/conftool/dbconfig/20191120-055426-marostegui.json
* 05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1092 after schema change', diff saved to https://phabricator.wikimedia.org/P9679 and previous config saved to /var/cache/conftool/dbconfig/20191120-054840-marostegui.json
* 03:16 tgr: [[phab:T208369|T208369]] ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php kowiki --cutoff 350
* 02:57 vgutierrez: restarting pybal on lvs2002
* 02:54 vgutierrez: restarting pybal on lvs2005
* 02:32 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=phab2001-vcs.codfw.wmnet
* 02:18 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=phab2001-vcs.codfw.wmnet
* 00:10 mutante: phab2001 - restart ssh-phab service after repooling it after buster reinstall, it wasn't listening on the IPv6 IP, causing LVS/pybal alerts
* 00:06 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/GrowthExperiments/: Pass token as editing_session_id for suggested edits ([[phab:T238249|T238249]]) (duration: 00m 53s)
* 00:02 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/VisualEditor/: EditAttemptStep: Allow overriding session ID ([[phab:T238249|T238249]]) (duration: 00m 52s)
* 00:00 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/WikiEditor/: EditAttemptStep: Allow overriding session ID ([[phab:T238249|T238249]]) (duration: 00m 54s)
== 2019-11-19 ==
* 23:58 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MobileFrontend/: EditAttemptStep: Allow overriding session ID ([[phab:T238249|T238249]]) (duration: 00m 53s)
* 23:55 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/WikimediaEvents/: EditAttemptStep: Allow other extensions to trigger oversampling ([[phab:T238249|T238249]]) (duration: 00m 53s)
* 23:42 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=phab2001-vcs.codfw.wmnet
* 21:45 XioNoX: rebooting pfw3-codfw:node1 for upgrade - [[phab:T235150|T235150]]
* 21:14 XioNoX: rebooting pfw3-codfw for upgrade - [[phab:T235150|T235150]]
* 20:59 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 20:17 gehel: completed reloading data from wdqs1007 to wdqs1004 - after failed test of merging updater - [[phab:T212826|T212826]]
* 20:14 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 20:10 XioNoX: homer push on mgmt routers
* 20:09 mutante: phab1003 after merging gerrit:551910 puppet now also stopped the actual aphlict service and removed the systemd unit file. had to manually run 'systemctl reset-failed' though to clean systemd status and avoid icinga alert ([[phab:T238593|T238593]])
* 20:07 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
* 19:20 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:18 dzahn@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97)
* 19:18 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
* 19:08 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@6e6bd42]: Prevent expensive content transforms from blocking the event loop ([[phab:T229286|T229286]]) (duration: 06m 49s)
* 19:01 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@6e6bd42]: Prevent expensive content transforms from blocking the event loop ([[phab:T229286|T229286]])
* 19:00 elukey: regenerate TLS cert for yarn.wikimedia.org (containing SANs for all analytics UIs) to add datasets.w.o SAN (site was failing due to ATS not being able to contact thorium)
* 18:59 rlazarus: restarted php7.2-fpm on wtp2001, wtp2002
* 18:56 rlazarus: restarted php7.2-fpm on wtp1025, wtp1026
* 18:35 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/VisualEditor/: Unbreak instrumentation of init events (duration: 00m 53s)
* 18:34 ssastry@deploy1001: Finished deploy [parsoid/deploy@6e7cffd]: Updating Parsoid to {{Gerrit|1a1105a7}} (duration: 02m 04s)
* 18:32 ssastry@deploy1001: Started deploy [parsoid/deploy@6e7cffd]: Updating Parsoid to {{Gerrit|1a1105a7}}
* 18:30 mutante: icinga config - manually added team-dcops, started icinga
* 18:20 addshore@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikidata.org: [[phab:T221774|T221774]] - Wikidata.org extension (queryservice maxlag, hook) [[gerrit:551858]] (duration: 00m 53s)
* 18:12 RoanKattouw: That was eowiktionary, not eowikisource
* 18:09 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure default search namespaces for eowikisource ([[phab:T237792|T237792]]) (duration: 00m 52s)
* 17:43 addshore@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikidata.org: [[phab:T221774|T221774]] - Wikidata.org extension (queryservice maxlag, maint script) [[gerrit:551857]] (duration: 00m 52s)
* 17:39 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 17:11 addshore@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikidata.org: [[phab:T221774|T221774]] - Wikidata.org extension (queryservice maxlag) [[gerrit:551855]] [[gerrit:551856]] (duration: 00m 54s)
* 17:02 volker-e@deploy1001: Finished deploy [design/style-guide@d73818a]: Deploy design/style-guide:  (duration: 00m 07s)
* 17:02 volker-e@deploy1001: Started deploy [design/style-guide@d73818a]: Deploy design/style-guide:
* 16:58 ema: pool cp2007 with ATS backend [[phab:T227432|T227432]]
* 16:30 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:28 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 16:25 moritzm: installing glib2.0 security updates
* 16:21 mutante: phab1003 - puppet restarts aphlict service even with "phabricator_aphlict_enabled: false" in Hiera. But it does properly remove the proxy config lines from apache. so service is running but not used. ([[phab:T238593|T238593]])
* 16:17 mutante: phab1003 - systemctl stop aphlict (proxy config in apache is disabled as well as disabled in ATS) ([[phab:T238593|T238593]])
* 16:15 gehel: reloading data from wdqs1007 to wdqs1004 - after failed test of merging updater - [[phab:T212826|T212826]]
* 16:14 gehel@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:10 ema: depool cp2007 and reimage as text_ats [[phab:T227432|T227432]]
* 16:09 ema: pool cp2006 with ATS backend [[phab:T227432|T227432]]
* 15:59 mobrovac@deploy1001: Finished deploy [restbase/deploy@564b2c6]: New Parsoid/PHP config structure (duration: 02m 11s)
* 15:57 mobrovac@deploy1001: Started deploy [restbase/deploy@564b2c6]: New Parsoid/PHP config structure
* 15:37 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:34 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 15:27 mobrovac@deploy1001: Finished deploy [restbase/deploy@5e7f759]: Switch test.wp and test2.wp to Parsoid/PHP - [[phab:T229015|T229015]] (duration: 14m 22s)
* 15:15 ema: depool cp2006 and reimage as text_ats [[phab:T227432|T227432]]
* 15:13 mobrovac@deploy1001: Started deploy [restbase/deploy@5e7f759]: Switch test.wp and test2.wp to Parsoid/PHP - [[phab:T229015|T229015]]
* 15:09 mobrovac@deploy1001: Finished deploy [restbase/deploy@5e7f759] (dev-cluster): Switch test.wp and test2.wp to Parsoid/PHP (duration: 02m 58s)
* 15:07 ema: pool cp2004 with ATS backend [[phab:T227432|T227432]]
* 15:06 mobrovac@deploy1001: Started deploy [restbase/deploy@5e7f759] (dev-cluster): Switch test.wp and test2.wp to Parsoid/PHP
* 14:38 ema@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:36 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 14:34 gehel: restarting blazegraph with additional logging on wdqs1004 - [[phab:T231411|T231411]]
* 14:18 ema: depool cp2004 and reimage as text_ats [[phab:T227432|T227432]]
* 14:13 ema: pool cp2001 with ATS backend [[phab:T227432|T227432]]
* 13:57 marostegui: Deploy schema change on metawiki directly on s7 master [[phab:T238370|T238370]]
* 13:57 marostegui: Deploy schema change on mediawikiwiki directly on s7 master [[phab:T238370|T238370]]
* 13:55 marostegui: Deploy schema change on mediawikiwiki directly on s3 master [[phab:T238370|T238370]]
* 13:50 marostegui: Deploy schema change on foundationwiki directly on s3 master - [[phab:T238370|T238370]]
* 13:46 marostegui: Deploy schema change on labswiki (wikitech) - [[phab:T238370|T238370]]
* 13:39 marostegui: Deploy schema change on db1092
* 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1092 for schema change', diff saved to https://phabricator.wikimedia.org/P9673 and previous config saved to /var/cache/conftool/dbconfig/20191119-133850-marostegui.json
* 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1101:3318 after schema change', diff saved to https://phabricator.wikimedia.org/P9672 and previous config saved to /var/cache/conftool/dbconfig/20191119-133704-marostegui.json
* 13:34 ema@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 13:33 ema@cumin2001: START - Cookbook sre.hosts.downtime
* 13:14 ema: depool cp2001 and reimage as text_ats [[phab:T227432|T227432]]
* 12:42 jbond42: add libapache2-mod-auth-cas 1.2-1 to stretch-wikimedia repo
* 12:28 effie: enable puppet on P:mediawiki::php and *.eqiad.wmnet
* 12:22 effie: enable puppet on P:mediawiki::php and *.codfw.wmnet
* 12:12 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db1067 from config [[phab:T238297|T238297]] (duration: 00m 52s)
* 12:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db1067 from config [[phab:T238297|T238297]] (duration: 00m 52s)
* 11:41 gehel: depooling wdqs1004 - [[phab:T231411|T231411]]
* 11:37 gehel: restarting wdqs blazegraph on wdqs1004 - [[phab:T231411|T231411]]
* 11:29 marostegui: Upgrade dbstore1003 (3311,3315,3317)
* 11:16 gehel: restarting wdqs updater on wdqs1004 - [[phab:T231411|T231411]]
* 10:36 marostegui: Compress and upgrade db1098:3316
* 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for upgrade and compression', diff saved to https://phabricator.wikimedia.org/P9671 and previous config saved to /var/cache/conftool/dbconfig/20191119-103540-marostegui.json
* 10:34 marostegui: Compress and upgrade db1098:3317
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3317 for upgrade and compression', diff saved to https://phabricator.wikimedia.org/P9670 and previous config saved to /var/cache/conftool/dbconfig/20191119-103426-marostegui.json
* 10:29 marostegui: Upgrade db2077
* 10:24 marostegui: Upgrade db2120 db2121 db2122
* 10:10 marostegui: Upgrade MySQL on db2086 db2087 db2100
* 10:06 godog: repool centrallog2001
* 09:40 effie: disable puppet on P:mediawiki::php - [[phab:T229792|T229792]]
* 09:21 moritzm: installing ncurses security updates
* 09:20 moritzm: rolling restart of nginx on acmechief/puppetdb to pick up libxslt security updates
* 09:08 moritzm: installing libxslt security updates
* 09:08 marostegui: Deploy schema change on db1101:3318
* 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P9669 and previous config saved to /var/cache/conftool/dbconfig/20191119-090823-marostegui.json
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1099:3318 after schema change', diff saved to https://phabricator.wikimedia.org/P9668 and previous config saved to /var/cache/conftool/dbconfig/20191119-090745-marostegui.json
* 09:05 marostegui: Repool labsdb1010
* 07:33 mobrovac@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: Enable math links in Beta - [[phab:T208758|T208758]] (duration: 00m 53s)
* 06:45 marostegui: Stop MySQL on db2061 [[phab:T238526|T238526]]
* 06:44 marostegui: Remove db2061 from tendril and zarcillo [[phab:T238526|T238526]]
* 06:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db2061 from config [[phab:T238526|T238526]] (duration: 00m 52s)
* 06:38 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db2061 from config [[phab:T238526|T238526]] (duration: 00m 53s)
* 06:26 vgutierrez: Move cp1089 from nginx to ats-tls - [[phab:T231627|T231627]]
* 06:20 marostegui: Depool labsdb1010 for upgrade
* 06:02 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1131 to s6 master and remove read-only from s6 [[phab:T235469|T235469]]', diff saved to https://phabricator.wikimedia.org/P9667 and previous config saved to /var/cache/conftool/dbconfig/20191119-060203-marostegui.json
* 06:01 marostegui@cumin2001: dbctl commit (dc=all): 'Set s6 as read-only for maintenance [[phab:T235469|T235469]]', diff saved to https://phabricator.wikimedia.org/P9666 and previous config saved to /var/cache/conftool/dbconfig/20191119-060122-marostegui.json
* 06:01 marostegui: Starting s6 failover from db1061 to db1131 - [[phab:T235469|T235469]]
* 05:37 eileen: process control - I reverted the above to check some stuff first
* 05:36 vgutierrez: Move cp1087 from nginx to ats-tls - [[phab:T231627|T231627]]
* 05:26 marostegui: Deploy schema change on db1099:3318
* 05:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3318 for schema change', diff saved to https://phabricator.wikimedia.org/P9665 and previous config saved to /var/cache/conftool/dbconfig/20191119-052632-marostegui.json
* 05:25 marostegui: Compress db1097:3314
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3314 for compression', diff saved to https://phabricator.wikimedia.org/P9664 and previous config saved to /var/cache/conftool/dbconfig/20191119-052412-marostegui.json
* 05:17 vgutierrez: Move cp1085 from nginx to ats-tls - [[phab:T231627|T231627]]
* 05:14 marostegui: Compress tables on db1105:3311
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3311 for compression', diff saved to https://phabricator.wikimedia.org/P9663 and previous config saved to /var/cache/conftool/dbconfig/20191119-051344-marostegui.json
* 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after compression', diff saved to https://phabricator.wikimedia.org/P9662 and previous config saved to /var/cache/conftool/dbconfig/20191119-051259-marostegui.json
* 05:12 eileen: process-control config revision is {{Gerrit|9fbfc79988}} - change gap on repair job to 16 hours to reflect the with-daylight-savings ones
* 05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1131 with weight 0 [[phab:T235469|T235469]] ', diff saved to https://phabricator.wikimedia.org/P9661 and previous config saved to /var/cache/conftool/dbconfig/20191119-050748-marostegui.json
* 05:02 marostegui: Start pre-switchover steps [[phab:T235469|T235469]]
* 04:47 vgutierrez: Move cp2023 from nginx to ats-tls - [[phab:T231627|T231627]]
* 04:17 vgutierrez: Move cp2019 from nginx to ats-tls - [[phab:T231627|T231627]]
* 03:53 vgutierrez: Move cp2016 from nginx to ats-tls - [[phab:T231627|T231627]]
* 03:51 tgr: [[phab:T208369|T208369]] ran mwscript extensions/GrowthExperiments/maintenance/deleteOldSurveys.php cswiki --cutoff 350
* 03:37 vgutierrez: Move cp2013 from nginx to ats-tls - [[phab:T231627|T231627]]
* 01:12 ejegg: re-enabled fundraising CiviCRM contact de-duplication jobs
* 01:05 ejegg: disabled fundraising CiviCRM contact de-duplication jobs
* 00:54 ejegg: updated civicrm from {{Gerrit|1f454aa69a}} to {{Gerrit|2802bdd649}}
* 00:39 mutante: phab2001 - rsyncing /srv/repos data from phab1003 ([[phab:T190568|T190568]])
* 00:30 mutante: rebooting phab2001
== 2019-11-18 ==
* 23:52 catrope@deploy1001: Finished scap: Update GrowthExperiments to master in wmf.5 (includes i18n) (duration: 19m 57s)
* 23:37 mutante: phab2001 - restart ssh-phab service after reimaging (some race condition binding to the IP before getting it on the interface after fresh install), reschedule pybal checks ([[phab:T190568|T190568]])
* 23:32 catrope@deploy1001: Started scap: Update GrowthExperiments to master in wmf.5 (includes i18n)
* 22:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:56 dzahn@cumin1001: START - Cookbook sre.hosts.downtime
* 22:44 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=phab2001-vcs.codfw.wmnet
* 22:39 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=phab2001.codfw.wmnet
* 22:39 eileen: civicrm revision changed from {{Gerrit|c05c302e54}} to {{Gerrit|1f454aa69a}}, config revision is {{Gerrit|67685c12f5}}
* 22:31 mutante: phab2001 - reinstalling with buster ([[phab:T190568|T190568]])
* 21:59 mholloway-shell@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 21:57 mholloway-shell@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'wikifeeds' for release 'production' .
* 21:57 arlolra: Upgraded Parsoid to {{Gerrit|2245b8f}} ([[phab:T237886|T237886]], [[phab:T237103|T237103]], [[phab:T236864|T236864]], [[phab:T237569|T237569]], [[phab:T236930|T236930]], [[phab:T237463|T237463]], [[phab:T236867|T236867]], [[phab:T234266|T234266]])
* 21:56 mholloway-shell@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'wikifeeds' for release 'staging' .
* 21:47 arlolra@deploy1001: Finished deploy [parsoid/deploy@c6a457f]: Updating Parsoid to {{Gerrit|2245b8f}} (duration: 08m 22s)
* 21:39 arlolra@deploy1001: Started deploy [parsoid/deploy@c6a457f]: Updating Parsoid to {{Gerrit|2245b8f}}
* 20:59 mutante: phab1003 - re-enabling puppet after merging gerrit:551271 - making sure aphlict stays disabled incl. the apache config ProxyPass lines using mod_proxy_wstunnel ([[phab:T238593|T238593]])
* 20:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316 after some compression', diff saved to https://phabricator.wikimedia.org/P9659 and previous config saved to /var/cache/conftool/dbconfig/20191118-202259-marostegui.json
* 19:03 ejegg: updated payments-wiki from {{Gerrit|30579d34d8}} to {{Gerrit|3f99ebecc7}}
* 18:21 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@582d394]: New WDQS build with merging updater (duration: 13m 27s)
* 18:07 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@582d394]: New WDQS build with merging updater
* 17:44 cdanis: rebooting grafana1002 (currently test host not used in prod)
* 17:08 marostegui: Deploy schema change on db1116:3318
* 16:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316 for compression', diff saved to https://phabricator.wikimedia.org/P9658 and previous config saved to /var/cache/conftool/dbconfig/20191118-165410-marostegui.json
* 16:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3316 after compression', diff saved to https://phabricator.wikimedia.org/P9656 and previous config saved to /var/cache/conftool/dbconfig/20191118-164923-marostegui.json
* 16:40 cdanis: install1002.wikimedia.org: sudo -E reprepro --restrict grafana update buster-wikimedia
* 16:08 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 16:06 filippo@cumin1001: START - Cookbook sre.hosts.downtime
* 15:13 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set MCR migration stage to NEW on remaining wikis for [[phab:T198312|T198312]] (duration: 00m 53s)
* 14:48 mobrovac@deploy1001: Finished deploy [restbase/deploy@b3b288c]: Parsoid: mirror traffic in split mode; add minwiktionary - [[phab:T229015|T229015]] [[phab:T238523|T238523]] (duration: 13m 58s)
* 14:34 mobrovac@deploy1001: Started deploy [restbase/deploy@b3b288c]: Parsoid: mirror traffic in split mode; add minwiktionary - [[phab:T229015|T229015]] [[phab:T238523|T238523]]
* 14:34 mobrovac@deploy1001: Finished deploy [restbase/deploy@b3b288c] (dev-cluster): Parsoid: mirror traffic in split mode; add minwiktionary - [[phab:T229015|T229015]] [[phab:T238523|T238523]] (duration: 02m 30s)
* 14:31 mobrovac@deploy1001: Started deploy [restbase/deploy@b3b288c] (dev-cluster): Parsoid: mirror traffic in split mode; add minwiktionary - [[phab:T229015|T229015]] [[phab:T238523|T238523]]
* 14:30 mobrovac@deploy1001: Finished deploy [restbase/deploy@b3b288c] (dev-cluster): Parsoid: mirror traffic in split mode; add minwiktionary (duration: 02m 45s)
* 14:28 mobrovac@deploy1001: Started deploy [restbase/deploy@b3b288c] (dev-cluster): Parsoid: mirror traffic in split mode; add minwiktionary
* 14:27 arturo: imported openstack ocata deb packages into stretch-wikimedia/thirdparty/openstack-ocata-stretch ([[phab:T238338|T238338]])
* 14:22 marostegui: Deploy schema change on dbstore1005:3318
* 13:10 ema: cp-ats: rolling ats-<nowiki>{</nowiki>tls,backend<nowiki>}</nowiki> restart to apply log_buffer_size config changes [[phab:T237608|T237608]]
* 12:51 Urbanecm: Run mwscript recountCategories.php --wiki=cswiki --mode=<nowiki>{</nowiki>subcats,pages,files<nowiki>}</nowiki> ([[phab:T228585|T228585]])
* 12:48 Urbanecm: Run mwscript recountCategories.php --wiki=dewiki --mode=files ([[phab:T238500|T238500]])
* 12:48 Urbanecm: Run mwscript recountCategories.php --wiki=dewiki --mode=pages ([[phab:T238500|T238500]])
* 12:47 Urbanecm: Run mwscript recountCategories.php --wiki=dewiki --mode=subcats ([[phab:T238500|T238500]])
* 11:32 awight: EU SWAT complete
* 11:28 awight@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Cite: SWAT: [[gerrit:551389{{!}}Track pageviews only on content page views, not edits (T214493)]] (duration: 00m 51s)
* 11:26 awight@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Popups: SWAT: [[gerrit:551397{{!}}Don't record Popups actions on non-content pages (T214493)]] (duration: 00m 51s)
* 11:04 moritzm: installing postgresql-common security updates
* 10:56 moritzm: installing python-werkzeug security updates
* 10:56 marostegui: Deploy schema change on db2078 (codfw master for wikidatawiki), this will create lag on s8 codfw - [[phab:T237120|T237120]]
* 10:53 moritzm: installing gdb updates from buster point release
* 10:49 moritzm: installing python-cryptography bugfix updates from buster point release
* 10:45 moritzm: updated buster netinst image for 10.2 [[phab:T238519|T238519]]
* 10:16 marostegui: Upgrade MySQL on labsdb1012
* 09:33 godog: remove wezen from service, pending reimage
* 09:11 marostegui: Remove ar_comment from triggers on db2094:3318 - [[phab:T234704|T234704]]
* 09:11 marostegui: Deploy schema change on s8 codfw, this will generate lag on s8 codfw - [[phab:T233135|T233135]] [[phab:T234066|T234066]]
* 09:03 marostegui: Restart MySQL on db1124 and db1125 to apply new replication filters [[phab:T238370|T238370]]
* 07:17 marostegui: Upgrade and restart mysql on sanitarium hosts on codfw to pick up new replication filters: db2094 and db2095 - [[phab:T238370|T238370]]
* 07:09 marostegui: Stop MySQL on db2070 to clone db2135 - [[phab:T238183|T238183]]
* 06:52 vgutierrez: Move cp1083 from nginx to ats-tls - [[phab:T231627|T231627]]
* 06:32 vgutierrez: Move cp1081 from nginx to ats-tls - [[phab:T231627|T231627]]
* 06:30 marostegui: Restart tendril mysql - [[phab:T231769|T231769]]
* 06:12 vgutierrez: Move cp2012 from nginx to ats-tls - [[phab:T231627|T231627]]
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for compression', diff saved to https://phabricator.wikimedia.org/P9652 and previous config saved to /var/cache/conftool/dbconfig/20191118-060508-marostegui.json
* 06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 for compression', diff saved to https://phabricator.wikimedia.org/P9651 and previous config saved to /var/cache/conftool/dbconfig/20191118-060207-marostegui.json
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2072, db2088:3311, db2087:3316, db2086:3317 after maintenances and schema changes', diff saved to https://phabricator.wikimedia.org/P9650 and previous config saved to /var/cache/conftool/dbconfig/20191118-060114-marostegui.json
* 05:53 marostegui: Deploy schema change on s5 primary master db1100 - [[phab:T233135|T233135]] [[phab:T234066|T234066]]
* 03:40 vgutierrez: Move cp2007 from nginx to ats-tls - [[phab:T231627|T231627]]
* 00:44 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/Rest/Handler/PageHistoryCountHandler.php: fix extremely slow query [[phab:T238378|T238378]] (duration: 00m 59s)
== 2019-11-16 ==
* 20:27 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:25 ariel@cumin1001: START - Cookbook sre.hosts.downtime
* 12:17 effie: restart rsyslog on mw2221
* 09:43 elukey: systemctl restart hadoop-* on analytics1077 after oom killer
== 2019-11-15 ==
* 22:14 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:12 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 21:54 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:52 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 21:31 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 21:29 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 21:21 _joe_: disabling proxying to ws on phabricator1003
* 20:04 XioNoX: push pfw policies to pfw3-eqiad - [[phab:T238368|T238368]]
* 20:02 XioNoX: push pfw policies to pfw3-codfw - [[phab:T238368|T238368]]
* 19:07 XioNoX: remove vlan 1 trunking between msw1-codfw and mr1-codfw, will cause a quick connectivity issue - [[phab:T228112|T228112]]
* 18:07 XioNoX: homer push on management switches
* 17:30 mutante: phabricator - started phd service
* 17:11 XioNoX: homer push to management routers (https://gerrit.wikimedia.org/r/550576)
* 16:43 hashar: Restored zuul-merger / CI for operations/puppet.git
* 16:29 hashar: CI slowed down due to a huge spike of internal jobs.  Being flushed as of now # [[phab:T140297|T140297]]
* 16:25 bblack: repool cp2001
* 16:08 bblack: depool cp2001 for experiments
* 16:02 moritzm: rebooting rpki1001 to rectify microcode loading
* 16:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:00 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:51 ejegg: updated Fundraising CiviCRM from {{Gerrit|ae9b3819cd}} to {{Gerrit|c05c302e54}}
* 15:36 ejegg: reduced batch size of CiviCRM contact deduplication jobs
* 15:11 ema: pool cp3064 with ATS backend [[phab:T227432|T227432]]
* 15:07 ema: reboot cp3064 after reimage
* 14:51 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:49 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 14:25 ema: depool cp3064 and reimage as text_ats [[phab:T227432|T227432]]
* 14:17 godog: SIGHUP prometheus@ops on prometheus1004
* 14:13 bblack: lvs1013 - pybal restart for new config
* 14:13 bblack: lvs2001 - pybal restart for new config
* 14:13 bblack: lvs5001 - pybal restart for new config
* 14:13 bblack: lvs4005 - pybal restart for new config
* 14:12 bblack: lvs3005 - pybal restart for new config
* 14:11 bblack: lvs5003 - pybal restart for new config
* 14:11 bblack: lvs4007 - pybal restart for new config
* 14:11 bblack: lvs3007 - pybal restart for new config
* 14:10 bblack: lvs2004 - pybal restart for new config
* 14:09 bblack: lvs1016 - pybal restart for new config
* 13:28 ariel@deploy1001: Finished deploy [dumps/dumps@61090ee]: configuration setting to produce empty abstracts (duration: 00m 03s)
* 13:28 ariel@deploy1001: Started deploy [dumps/dumps@61090ee]: configuration setting to produce empty abstracts
* 13:06 ariel@deploy1001: Finished deploy [dumps/dumps@61090ee]: configuration setting to produce empty abstracts (expecting failure) (duration: 00m 04s)
* 13:06 ariel@deploy1001: Started deploy [dumps/dumps@61090ee]: configuration setting to produce empty abstracts (expecting failure)
* 11:43 ariel@deploy1001: Finished deploy [dumps/dumps@61090ee]: configuration setting to produce empty abstracts (duration: 00m 09s)
* 11:43 ariel@deploy1001: Started deploy [dumps/dumps@61090ee]: configuration setting to produce empty abstracts
* 11:27 moritzm: reboot ganeti4001-4003 to rectify microcode application
* 11:26 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:26 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3315 into vslow,dump after schema change', diff saved to https://phabricator.wikimedia.org/P9645 and previous config saved to /var/cache/conftool/dbconfig/20191115-112520-marostegui.json
* 11:19 marostegui: Reboot dbproxy2002
* 11:15 marostegui: Reboot dbproxy2004
* 11:12 marostegui: Reboot dbproxy2001
* 10:45 marostegui: Run maintain-views for s5 on  labsdb1011 [[phab:T233135|T233135]]
* 10:38 moritzm: installing ghostscript security updates
* 10:37 mobrovac: restbase - truncated parsoidphp data tables - [[phab:T229015|T229015]]
* 10:36 ema: pool cp3062 with ATS backend [[phab:T227432|T227432]]
* 10:24 godog: roll-restart logstash to apply configuration change
* 10:19 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:15 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 09:50 ema: depool cp3062 and reimage as text_ats [[phab:T227432|T227432]]
* 09:47 vgutierrez: Use a synthetic warning for 1% of TLSv1/TLSv1.1 pageviews - [[phab:T238038|T238038]]
* 09:18 vgutierrez: Move cp1079 from nginx to ats-tls - [[phab:T231627|T231627]]
* 09:13 gehel@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 09:02 vgutierrez: Move cp1077 from nginx to ats-tls - [[phab:T231627|T231627]]
* 08:42 vgutierrez: Move cp2006 from nginx to ats-tls - [[phab:T231627|T231627]]
* 08:30 vgutierrez: Move cp2004 from nginx to ats-tls - [[phab:T231627|T231627]]
* 06:41 marostegui: Stop MySQL on db2065 to clone db2134 (this will trigger an haproxy irc alert) - [[phab:T238183|T238183]]
* 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3315 for schema change and temporary pool db1082 into vslow,dump', diff saved to https://phabricator.wikimedia.org/P9643 and previous config saved to /var/cache/conftool/dbconfig/20191115-060807-marostegui.json
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2088:3311 for compression', diff saved to https://phabricator.wikimedia.org/P9642 and previous config saved to /var/cache/conftool/dbconfig/20191115-060425-marostegui.json
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1103:3312 db1082 after schema changes', diff saved to https://phabricator.wikimedia.org/P9641 and previous config saved to /var/cache/conftool/dbconfig/20191115-060300-marostegui.json
* 05:57 marostegui: Run maintain-views for s5 on labsdb1009, labsdb1010, labsdb1012 (pending labsdb1011 as it is still running the schema change) [[phab:T233135|T233135]]
* 05:07 vgutierrez: Move cp3064 from nginx to ats-tls - [[phab:T231627|T231627]]
* 04:38 volker-e@deploy1001: Finished deploy [design/style-guide@2ad7b1a]: Deploy design/style-guide:  (duration: 00m 07s)
* 04:38 volker-e@deploy1001: Started deploy [design/style-guide@2ad7b1a]: Deploy design/style-guide:
* 04:17 vgutierrez: Move cp3062 from nginx to ats-tls - [[phab:T231627|T231627]]
* 04:00 vgutierrez: Move cp3060 from nginx to ats-tls - [[phab:T231627|T231627]]
* 01:35 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/Rest/Handler/CompareHandler.php: deploying REST compare section feature because iOS team need it for a beta release due very soon (duration: 00m 53s)
* 01:33 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/Rest/coreRoutes.json: deploying REST compare section feature because iOS team need it for a beta release due very soon (duration: 00m 52s)
* 01:32 tstarling@deploy1001: Synchronized php-1.35.0-wmf.5/includes/parser/Parser.php: deploying REST compare section feature because iOS team need it for a beta release due very soon (duration: 00m 54s)
== 2019-11-14 ==
* 23:03 mutante: restarting gerrit to increase defaultThreadPoolSize to 2
* 22:29 eileen: civicrm revision changed from {{Gerrit|a3714003ff}} to {{Gerrit|ae9b3819cd}}, config revision is {{Gerrit|6adc66a20b}}
* 21:32 ssastry@deploy1001: Finished deploy [parsoid/deploy@150f9af]: Updating Parsoid to {{Gerrit|74203415}} (duration: 08m 21s)
* 21:24 ssastry@deploy1001: Started deploy [parsoid/deploy@150f9af]: Updating Parsoid to {{Gerrit|74203415}}
* 21:14 gehel@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:06 cdanis@cumin2001: dbctl commit (dc=all): 'remove now-defunct wikitech section [[phab:T233236|T233236]]', diff saved to https://phabricator.wikimedia.org/P9639 and previous config saved to /var/cache/conftool/dbconfig/20191114-200649-cdanis.json
* 20:04 gehel: reloading data on wdqs1004 from wdqs1007 to catch up on lag faster - [[phab:T238229|T238229]]
* 19:57 gehel@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:33 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 19:31 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 19:20 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
* 18:49 catrope@deploy1001: Synchronized wmf-config/: Use s10/s11 dblists for wikitechs (for real this time) ([[phab:T233236|T233236]]) (duration: 00m 52s)
* 18:37 catrope@deploy1001: Synchronized dblists/: Use s10/s11 dblists for wikitechs ([[phab:T233236|T233236]]) (duration: 00m 51s)
* 18:35 catrope@deploy1001: Synchronized dblists/: Add s10/s11 dblists for wikitechs ([[phab:T233236|T233236]]) (duration: 00m 52s)
* 18:34 mutante: scandium - restart php7.2-fpm
* 18:31 mutante: phabricator (phab1003, prod server) - upgrade PHP version to 7.2.24 ([[phab:T237239|T237239]])
* 18:17 cdanis@cumin2001: dbctl commit (dc=all): 'alias wikitech section to new s10 section [[phab:T233236|T233236]]', diff saved to https://phabricator.wikimedia.org/P9638 and previous config saved to /var/cache/conftool/dbconfig/20191114-181732-cdanis.json
* 17:46 robh: running dell epsa tool on cp3056 per [[phab:T236497|T236497]]
* 17:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 17:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
* 17:35 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 17:22 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 17:22 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
* 17:22 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 17:22 ejegg: updated payments-wiki from {{Gerrit|bd907656fb}} to {{Gerrit|30579d34d8}}
* 17:17 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .
* 17:17 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'coredns' .
* 17:17 akosiaris@deploy1001: helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
* 16:09 mutante: phab2001 - upgrading PHP version to 7.2.24 ([[phab:T237239|T237239]])
* 16:06 mutante: scandium - upgrading PHP version to 7.2.24 (fyi, @subbu [[phab:T228069|T228069]]) ([[phab:T237239|T237239]])
* 16:04 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/Wikibase: [[gerrit:550828{{!}}Put a layer of APC cache on top of reading wb_terms in SqlEntityInfoBuilder]] ([[phab:T231011|T231011]] [[phab:T229407|T229407]] [[phab:T236681|T236681]]), Try II (duration: 00m 56s)
* 14:54 ema: pool cp3060 with ATS backend [[phab:T227432|T227432]]
* 14:53 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Fix bug when looking up entity for an unknown ID (duration: 00m 53s)
* 14:48 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set MCR migration stage to NEW on group1 for [[phab:T198312|T198312]] (duration: 00m 53s)
* 14:27 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:24 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 14:01 ema: depool cp3060 and reimage as text_ats [[phab:T227432|T227432]]
* 13:37 ladsgroup@deploy1001: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
* 13:35 gehel: depool wdqs1004 to allow catching up on lag - [[phab:T238229|T238229]]
* 13:06 bblack: removing digicert-2019 files from cache nodes - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/550829/
* 12:24 mobrovac@deploy1001: Finished deploy [restbase/deploy@58cf5ae]: Fix /metrics/mediarequests/top/ indentation (duration: 14m 52s)
* 12:09 mobrovac@deploy1001: Started deploy [restbase/deploy@58cf5ae]: Fix /metrics/mediarequests/top/ indentation
* 11:58 mobrovac@deploy1001: Finished deploy [restbase/deploy@58cf5ae] (dev-cluster): Fix /metrics/mediarequests/top/ indentation (duration: 02m 50s)
* 11:55 mobrovac@deploy1001: Started deploy [restbase/deploy@58cf5ae] (dev-cluster): Fix /metrics/mediarequests/top/ indentation
* 11:26 gehel@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 10:48 vgutierrez: Rolling restart of ats-tls/ats-backend to upgrade to 8.0.5-1wm11 - [[phab:T238307|T238307]]
* 10:44 vgutierrez: uploaded trafficserver-8.0.5-1wm11 to apt.wikimedia.org (stretch) - [[phab:T238307|T238307]]
* 10:43 ema: pool cp3058 with ATS backend [[phab:T227432|T227432]]
* 10:25 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 10:23 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 10:20 godog: netbox1001 bandaid/symlink /srv/deployment/netbox/deploy/src/netbox/project-static to 'static'
* 10:06 gehel: copying journal from wdqs1007 to wdqs1005 - [[phab:T238232|T238232]]
* 10:05 gehel@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 10:03 Urbanecm: Run deleteEqualMessages.php --delete for cswiki and viwiki
* 09:59 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:57 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 09:55 gehel: depool wdqs (public) eqiad - high lag - [[phab:T238229|T238229]]
* 09:34 ema: depool cp3058 and reimage as text_ats [[phab:T227432|T227432]]
* 09:31 marostegui: Compare wikidatawiki.pagelinks between labsdb1011 and labsdb1010 - [[phab:T233986|T233986]]
* 09:25 moritzm: installing ghostscript updates on thumbor1001
* 09:24 marostegui: Stop mysql on db2067 to clone db21133 - [[phab:T238183|T238183]]
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Full weight to db1089 on special groups for s1 [[phab:T223151|T223151]]', diff saved to https://phabricator.wikimedia.org/P9635 and previous config saved to /var/cache/conftool/dbconfig/20191114-092006-marostegui.json
* 09:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 09:05 marostegui: Compare wikidatawiki.pagelinks between db1124:3318 and labsdb1010 - [[phab:T233986|T233986]]
* 09:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:42 marostegui: Remove ar_comment from triggers on db1124:3315 - [[phab:T234704|T234704]]
* 08:41 marostegui: Deploy schema change with replication on db1082, this will generate lag on s5 labs - [[phab:T233135|T233135]] [[phab:T234066|T234066]]
* 08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082 for schema change', diff saved to https://phabricator.wikimedia.org/P9634 and previous config saved to /var/cache/conftool/dbconfig/20191114-084043-marostegui.json
* 08:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
* 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P9633 and previous config saved to /var/cache/conftool/dbconfig/20191114-083729-marostegui.json
* 08:03 eileen: process-control config revision is {{Gerrit|6adc66a20b}} re-enable backfill
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Pool a non partitioned slave db1089 on special groups for s1 [[phab:T223151|T223151]]', diff saved to https://phabricator.wikimedia.org/P9632 and previous config saved to /var/cache/conftool/dbconfig/20191114-080038-marostegui.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312 [[phab:T235599|T235599]]', diff saved to https://phabricator.wikimedia.org/P9631 and previous config saved to /var/cache/conftool/dbconfig/20191114-075449-marostegui.json
* 07:41 eileen: process-control config revision is {{Gerrit|b7c2cf7227}} - disabled backfill again - some error?
* 07:29 eileen: process-control config revision is {{Gerrit|909108622d}} re-enable omnirecipient date repair job
* 07:25 eileen: process-control config revision is {{Gerrit|d3ebeddcc1}} (I re-enabled the old backfill job)
* 07:12 moritzm: installing intel-microcode updates
* 06:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1067', diff saved to https://phabricator.wikimedia.org/P9630 and previous config saved to /var/cache/conftool/dbconfig/20191114-065309-marostegui.json
* 06:16 marostegui: Stop replication on db1067
* 06:01 marostegui@cumin2001: dbctl commit (dc=all): 'Promote db1083 to s1 master and remove read-only from s1 [[phab:T234800|T234800]]', diff saved to https://phabricator.wikimedia.org/P9629 and previous config saved to /var/cache/conftool/dbconfig/20191114-060138-marostegui.json
* 06:00 marostegui@cumin2001: dbctl commit (dc=all): 'Set s1 as read-only for maintenance [[phab:T234800|T234800]]', diff saved to https://phabricator.wikimedia.org/P9628 and previous config saved to /var/cache/conftool/dbconfig/20191114-060026-marostegui.json
* 06:00 marostegui: Starting s1 failover from db1067 to db1083 - [[phab:T234800|T234800]]
* 05:51 jynus: stopping db1114 replication
* 05:34 marostegui: Compress db2089:3316 - [[phab:T235599|T235599]]
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 for schema change', diff saved to https://phabricator.wikimedia.org/P9627 and previous config saved to /var/cache/conftool/dbconfig/20191114-052400-marostegui.json
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P9626 and previous config saved to /var/cache/conftool/dbconfig/20191114-052303-marostegui.json
* 05:13 marostegui: Move replicas from db1067 to db1083 [[phab:T234800|T234800]]
* 05:09 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1083 with weight 0 [[phab:T234800|T234800]]', diff saved to https://phabricator.wikimedia.org/P9625 and previous config saved to /var/cache/conftool/dbconfig/20191114-050940-marostegui.json
* 05:08 vgutierrez: Repooling cp1077 - [[phab:T238289|T238289]]
* 05:07 marostegui: Start pre-failover steps [[phab:T234800|T234800]]
* 05:01 kart_: Updated cxserver to 2019-11-13-111130-production tag ([[phab:T237379|T237379]], [[phab:T235748|T235748]], [[phab:T236906|T236906]])
* 04:56 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 04:51 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
* 04:49 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
* 03:49 vgutierrez: power cycling cp1077 - [[phab:T238289|T238289]]
* 03:49 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1077.eqiad.wmnet
* 03:49 vgutierrez: depooling cp1077 - [[phab:T238289|T238289]]
* 00:41 ebernhardson: [[phab:T237849|T237849]] Start CirrusSearch forceSearchIndex.php commonswiki 2019-10-20T00:00:00 - 2019-11-14T01:00:00 pushing into jobqueue
* 00:40 crusnov@deploy1001: Finished deploy [netbox/deploy@56df4a5]: deploy netbox for script update (duration: 00m 49s)
* 00:39 crusnov@deploy1001: Started deploy [netbox/deploy@56df4a5]: deploy netbox for script update
* 00:39 crusnov@deploy1001: Finished deploy [netbox/deploy@56df4a5]: deploy netbox for script update (duration: 00m 44s)
* 00:38 crusnov@deploy1001: Started deploy [netbox/deploy@56df4a5]: deploy netbox for script update
* 00:36 ebernhardson@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/CirrusSearch/includes/BuildDocument/BuildDocument.php: [[phab:T237849|T237849]]: Restore CirrusSearchBuildDocumentParse hook (duration: 00m 54s)
== 2019-11-13 ==
* 23:00 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 22:58 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 22:25 catrope@deploy1001: Finished scap: For some reason that limited i18n sync didn't work, trying a full scap (duration: 18m 33s)
* 22:07 catrope@deploy1001: Started scap: For some reason that limited i18n sync didn't work, trying a full scap
* 22:04 catrope@deploy1001: scap sync-l10n completed (1.35.0-wmf.5) (duration: 02m 54s)
* 22:00 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/GrowthExperiments/: Update to master ({{Gerrit|b937dce}}) (duration: 00m 54s)
* 20:17 XioNoX: delete unused asw2-esams:ae1
* 19:37 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Update WD item blacklist (again) (duration: 00m 52s)
* 18:49 Jeff_Green: authdns-update to remove host alnilam
* 17:49 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Update WD item blacklist (duration: 00m 53s)
* 16:41 gehel: depool wdqs1005 - [[phab:T238232|T238232]]
* 16:36 gehel: restart blazegraph on wdqs1005
* 16:21 ema: pool cp3054 with ATS backend [[phab:T227432|T227432]]
* 16:21 gehel: draining elastic1017-1031 to prepare for decommission - [[phab:T230746|T230746]]
* 16:02 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:00 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 15:51 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1089', diff saved to https://phabricator.wikimedia.org/P9621 and previous config saved to /var/cache/conftool/dbconfig/20191113-155134-marostegui.json
* 15:39 moritzm: powercycle cloudbackup2002
* 15:35 ema: depool cp3054 and reimage as text_ats [[phab:T227432|T227432]]
* 15:32 moritzm: rebooting cloudbackup2002
* 15:30 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 15:30 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 15:29 jynus: shutdown db2072 [[phab:T237905|T237905]]
* 15:29 gehel: configuration of new elasticsearch servers completed, all working and pooled - [[phab:T230746|T230746]]
* 14:55 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2072', diff saved to https://phabricator.wikimedia.org/P9620 and previous config saved to /var/cache/conftool/dbconfig/20191113-145541-jynus.json
* 13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 for schema change', diff saved to https://phabricator.wikimedia.org/P9619 and previous config saved to /var/cache/conftool/dbconfig/20191113-134938-marostegui.json
* 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1089 after  upgrade', diff saved to https://phabricator.wikimedia.org/P9618 and previous config saved to /var/cache/conftool/dbconfig/20191113-134625-marostegui.json
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1089 after  upgrade', diff saved to https://phabricator.wikimedia.org/P9617 and previous config saved to /var/cache/conftool/dbconfig/20191113-133410-marostegui.json
* 13:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1089 for upgrade', diff saved to https://phabricator.wikimedia.org/P9616 and previous config saved to /var/cache/conftool/dbconfig/20191113-132216-marostegui.json
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P9615 and previous config saved to /var/cache/conftool/dbconfig/20191113-131530-marostegui.json
* 11:56 effie: Upgrade to php 7.2.24-1 mediawiki eqiad hosts and restart php-fpm - [[phab:T237239|T237239]]
* 11:55 ema: cp-ats: rolling trafficserver (8.0.5-1wm10) and fifo-log-demux (0.6) upgrade and restart
* 11:46 moritzm: rebooting cloudcontrol2001-dev for microcode debugging
* 11:45 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:45 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:38 moritzm: rebooting labtestpuppetmaster2001 for microcode debugging
* 11:37 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:37 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:27 ema: cp-ats-ulsfo: rolling trafficserver (8.0.5-1wm10) and fifo-log-demux (0.6) upgrade and restart
* 11:27 moritzm: rebooting cloudcontrol2003-dev for some microcode debugging
* 11:24 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 11:24 jmm@cumin2001: START - Cookbook sre.hosts.downtime
* 11:24 ema: cp4022: trafficserver (8.0.5-1wm10) and fifo-log-demux (0.6) upgrade and restart
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1083', diff saved to https://phabricator.wikimedia.org/P9614 and previous config saved to /var/cache/conftool/dbconfig/20191113-110802-marostegui.json
* 11:05 Urbanecm: EU SWAT done
* 11:05 Urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/ffwiki* ([[phab:T238191|T238191]])
* 11:04 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: {{Gerrit|0a90ef9}}: Update localized logos for the Fula Wikipedia ([[phab:T238191|T238191]]) (duration: 00m 54s)
* 10:53 vgutierrez: Testing ats-tls-restart on cp5007 - [[phab:T237425|T237425]]
* 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9613 and previous config saved to /var/cache/conftool/dbconfig/20191113-104326-marostegui.json
* 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9612 and previous config saved to /var/cache/conftool/dbconfig/20191113-103225-marostegui.json
* 10:27 gehel: start configuration of new elasticsearch servers - [[phab:T230746|T230746]]
* 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1083 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P9610 and previous config saved to /var/cache/conftool/dbconfig/20191113-102054-marostegui.json
* 10:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1083 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P9609 and previous config saved to /var/cache/conftool/dbconfig/20191113-101127-marostegui.json
* 09:51 jynus: upgraded wmf-mariadb101-client on cumin hosts
* 09:50 mobrovac@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'mathoid' for release 'production' .
* 09:43 mobrovac@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'mathoid' for release 'production' .
* 09:41 mobrovac@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'mathoid' for release 'staging' .
* 09:21 mobrovac@deploy1001: Finished deploy [restbase/deploy@1f2c7d8]: Start storing Parsoid/PHP results; add gcrwiki, shywiktionary, szywiki - [[phab:T229015|T229015]] [[phab:T238117|T238117]] [[phab:T238116|T238116]] [[phab:T237374|T237374]] (duration: 11m 19s)
* 09:10 mobrovac@deploy1001: Started deploy [restbase/deploy@1f2c7d8]: Start storing Parsoid/PHP results; add gcrwiki, shywiktionary, szywiki - [[phab:T229015|T229015]] [[phab:T238117|T238117]] [[phab:T238116|T238116]] [[phab:T237374|T237374]]
* 09:09 mobrovac@deploy1001: Finished deploy [restbase/deploy@1f2c7d8] (dev-cluster): Start storing Parsoid/PHP results; add gcrwiki, shywiktionary, szywiki (duration: 02m 35s)
* 09:06 mobrovac@deploy1001: Started deploy [restbase/deploy@1f2c7d8] (dev-cluster): Start storing Parsoid/PHP results; add gcrwiki, shywiktionary, szywiki
* 08:25 marostegui: Stop MySQL on db2062 to copy its data to db2132 [[phab:T238183|T238183]]
* 08:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 08:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
* 07:09 marostegui: Fix replication on labsdb1010 - [[phab:T233986|T233986]]
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315 for schema change', diff saved to https://phabricator.wikimedia.org/P9607 and previous config saved to /var/cache/conftool/dbconfig/20191113-070339-marostegui.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2086:3317 for compression', diff saved to https://phabricator.wikimedia.org/P9606 and previous config saved to /var/cache/conftool/dbconfig/20191113-070055-marostegui.json
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2087:3317 after compression', diff saved to https://phabricator.wikimedia.org/P9605 and previous config saved to /var/cache/conftool/dbconfig/20191113-065952-marostegui.json
* 06:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P9604 and previous config saved to /var/cache/conftool/dbconfig/20191113-065823-marostegui.json
* 06:25 volker-e@deploy1001: Finished deploy [design/style-guide@edce4cc]: Deploy design/style-guide:  (duration: 00m 08s)
* 06:25 volker-e@deploy1001: Started deploy [design/style-guide@edce4cc]: Deploy design/style-guide:
* 01:35 eileen: civicrm revision changed from {{Gerrit|3c15db25bb}} to {{Gerrit|a3714003ff}}, config revision is {{Gerrit|d678dbcaa5}}
== 2019-11-12 ==
* 23:57 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Fix: Do not return after inserting a single suggestion (duration: 00m 52s)
* 23:51 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/resources/src/mediawiki.interface.helpers.styles.less: Remove extraneous semicolons ([[phab:T233649|T233649]]), part 2 (duration: 00m 52s)
* 23:49 catrope@deploy1001: Synchronized php-1.35.0-wmf.5/includes/changes/ChangesList.php: Remove extraneous semicolons ([[phab:T233649|T233649]]), part 1 (duration: 00m 53s)
* 23:49 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:45 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 23:22 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 23:20 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 22:37 bblack: repool cp1076 (experiments concluded)
* 22:35 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: enabling REST API (duration: 00m 52s)
* 22:34 tstarling@deploy1001: Synchronized wmf-config/InitialiseSettings.php: enabling REST API (duration: 00m 52s)
* 22:32 eileen: civicrm revision changed from {{Gerrit|bfa53ee611}} to {{Gerrit|3c15db25bb}}, config revision is {{Gerrit|d678dbcaa5}}
* 21:54 bblack: depooling cp1076 for some local experimentation
* 20:18 herron: reprepro copy buster-wikimedia stretch-wikimedia prometheus-elasticsearch-exporter
* 20:11 otto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 20:11 otto@cumin1001: START - Cookbook sre.hosts.downtime
* 19:46 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P7007 --new-data-type external-id ([[phab:T234221|T234221]])
* 19:45 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki=wikidatawiki --property-id P4839 --new-data-type external-id ([[phab:T234221|T234221]])
* 19:43 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Sync a previously undeployed change to InitialiseSettings-labs.php that someone forgot to deploy (as a no-op) in production (duration: 00m 52s)
* 19:41 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set MCR migration stage to NEW on group0 for [[phab:T198312|T198312]] (duration: 00m 52s)
* 19:19 arlolra: Updated Parsoid to {{Gerrit|6a0a708}} ([[phab:T215000|T215000]], [[phab:T235295|T235295]], [[phab:T235656|T235656]], [[phab:T235217|T235217]], [[phab:T235295|T235295]], [[phab:T236846|T236846]], [[phab:T237556|T237556]], [[phab:T235231|T235231]])
* 19:03 arlolra@deploy1001: Finished deploy [parsoid/deploy@f516018]: Updating Parsoid to {{Gerrit|6a0a708}} (duration: 10m 09s)
* 18:58 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Final fixes and tweaks for testing (duration: 00m 53s)
* 18:53 arlolra@deploy1001: Started deploy [parsoid/deploy@f516018]: Updating Parsoid to {{Gerrit|6a0a708}}
* 18:39 ejegg: re-enabled Omnimail and contact de-duplication jobs
* 18:20 Urbanecm: Morning SWAT done
* 18:18 Urbanecm: Deploy security patch for [[phab:T237887|T237887]]
* 18:15 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: {{Gerrit|130ef87}}: Add right "abusefilter-log-private" to usergroup "rollbacker" at ptwiki ([[phab:T237830|T237830]]) (duration: 00m 53s)
* 18:08 XioNoX: push pfw change to add recdns anycast IP
* 17:33 XioNoX: update fasw-c-eqiad to match current standard (ntp/users/rootpw/lldp)
* 17:22 XioNoX: update fasw-c-codfw to match current standard (ntp/users/rootpw/lldp)
* 17:03 ema: pool cp3052 with ATS backend [[phab:T238085|T238085]]
* 17:03 ema: pool cp3052 with ATS backend [[phab:T227432|T227432]]
* 16:53 bblack: cpNNNN (all cache nodes) - cumin manual removal of globalsign-2018 remnants (key, cert, ocsp config, ocsp output)
* 16:42 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:40 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 16:28 XioNoX: setup bgp session from cr2-codfw to multihop RIS collector - [[phab:T106056|T106056]]
* 16:21 XioNoX: reboot scs-c1-eqiad.mgmt.eqiad.wmnet - [[phab:T238036|T238036]]
* 16:09 ema: depool cp3052 and observe performance impact [[phab:T238085|T238085]] before reimaging as text_ats [[phab:T227432|T227432]]
* 15:49 marostegui: Deploy schema change on db1102:3315 [[phab:T233135|T233135]] [[phab:T234066|T234066]]
* 15:45 mholloway-shell@deploy1001: Synchronized php-1.35.0-wmf.5/extensions/MachineVision: Fixes and tweaks for initial rollout (duration: 00m 53s)
* 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315 for a schema change [[phab:T233135|T233135]] [[phab:T234066|T234066]]', diff saved to https://phabricator.wikimedia.org/P9600 and previous config saved to /var/cache/conftool/dbconfig/20191112-154127-marostegui.json
* 15:24 otto@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=schema
* 14:46 bblack: cpNNNN (all caches): remove stale outputs from transient ocsp failures ( /var/cache/ocsp/update-ocsp-*.tmp )
* 14:41 ema: cp4022: trafficserver (8.0.5-1wm10) and fifo-log-demux (0.6) upgrade and restart
* 14:38 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4021.ulsfo.wmnet,service=nginx
* 14:35 ema: cp4021: ats-tls-restart to see if https://gerrit.wikimedia.org/r/550475 fixed the script
* 14:16 Jeff_Green: authdns-update to deploy fundraising-read.wmnet service cname adjustment
* 14:01 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert "Set all of wikidata for write both for term store" (duration: 00m 52s)
* 12:57 godog: refresh kibana field list
* 12:46 gehel: repool wdqs1004
* 12:37 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=wikidatawiki --batch-size 100 ([[phab:T237984|T237984]])
* 12:19 onimisionipe: restarting blazegraph on wdqs1005
* 12:11 effie: Reimage mwdebug1002 - [[phab:T214734|T214734]]
* 11:47 Amir1: EU SWAT is done
* 11:47 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.4/extensions/Wikibase: Wikibase term store error reduction, [[gerrit:550441{{!}}Do not catch DBError in ReplicaMasterAwareRecordIdsAcquirer.]] ([[phab:T236466|T236466]]) (duration: 00m 56s)
* 11:44 effie: Upgrade wtp* to 7.2.24-1 with elegance and restart php-fpm - [[phab:T237239|T237239]]
* 11:20 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [[gerrit:548584{{!}}Set all of wikidata for write both for term store (T225055)]] (duration: 00m 52s)
* 11:05 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SECURITY: Dont allow Wikimedia sysops to see who had 2FA disabled (duration: 00m 53s)
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1083', diff saved to https://phabricator.wikimedia.org/P9599 and previous config saved to /var/cache/conftool/dbconfig/20191112-104400-marostegui.json
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9598 and previous config saved to /var/cache/conftool/dbconfig/20191112-103641-marostegui.json
* 10:35 onimisionipe: resetting cronfile on wdqs hosts
* 10:33 marostegui: Drop labtestwiki database from m5 master db1133 - [[phab:T236010|T236010]]
* 10:30 marostegui: Deploy schema change on dbstore1003:3315
* 10:07 ema: repool cp3065, nothing interesting in kern.log and SEL [[phab:T238032|T238032]]
* 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'More traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9596 and previous config saved to /var/cache/conftool/dbconfig/20191112-095221-marostegui.json
* 09:42 marostegui: Remove privileges for labtestwiki on m5 - [[phab:T236010|T236010]]
* 09:27 gehel: restarting blazegraph on wdqs1004
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1083', diff saved to https://phabricator.wikimedia.org/P9595 and previous config saved to /var/cache/conftool/dbconfig/20191112-091706-marostegui.json
* 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1083 for mariadb upgrade to 10.1.39 - [[phab:T234800|T234800]]', diff saved to https://phabricator.wikimedia.org/P9594 and previous config saved to /var/cache/conftool/dbconfig/20191112-091158-marostegui.json
* 09:11 marostegui: Upgrade mariadb to 10.1.39 on db1083 (candidate master for s1)
* 08:56 moritzm: restarting archiva to pick up Java security updates
* 08:44 volker-e@deploy1001: Finished deploy [design/style-guide@3de6820]: Deploy design/style-guide:  (duration: 00m 06s)
* 08:44 volker-e@deploy1001: Started deploy [design/style-guide@3de6820]: Deploy design/style-guide:
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1083', diff saved to https://phabricator.wikimedia.org/P9593 and previous config saved to /var/cache/conftool/dbconfig/20191112-083720-marostegui.json
* 08:37 gehel: depool wdqs1004 to investigate update lag
* 08:35 moritzm: installing poppler security updates
* 08:24 volker-e@deploy1001: Finished deploy [design/style-guide@b926b95]: Deploy design/style-guide:  (duration: 00m 07s)
* 08:24 volker-e@deploy1001: Started deploy [design/style-guide@b926b95]: Deploy design/style-guide:
* 08:15 moritzm: installing curl security updates
* 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9592 and previous config saved to /var/cache/conftool/dbconfig/20191112-081322-marostegui.json
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Increase traffic to db1083', diff saved to https://phabricator.wikimedia.org/P9591 and previous config saved to /var/cache/conftool/dbconfig/20191112-074006-marostegui.json
* 07:36 elukey: remove /etc/logrotate.d/wdqs_autodeployment_log from wdqs1009 (not in puppet anymore and causing cronspam)
* 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1083 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P9590 and previous config saved to /var/cache/conftool/dbconfig/20191112-072823-marostegui.json
* 07:10 marostegui: Upgrade kernel on db1083 (s1 candidate master)
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1083 for kernel upgrade - [[phab:T234800|T234800]]', diff saved to https://phabricator.wikimedia.org/P9589 and previous config saved to /var/cache/conftool/dbconfig/20191112-070436-marostegui.json
* 06:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 06:57 marostegui@cumin1001: START - Cookbook sre.hosts.decommission
* 06:44 marostegui: Change triggers on s5 db2094 - [[phab:T234704|T234704]]
* 06:40 marostegui: Deploy schema change on s5 codfw with replication, this will generate lag on s5 codfw [[phab:T233135|T233135]] [[phab:T234066|T234066]]
* 06:21 marostegui: Compress db2087:3316, db2087:3317 [[phab:T235599|T235599]]
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2087:3316, db2087:3317 for compression - [[phab:T235599|T235599]]', diff saved to https://phabricator.wikimedia.org/P9588 and previous config saved to /var/cache/conftool/dbconfig/20191112-061959-marostegui.json
* 03:41 vgutierrez: restart wdqs-blazegraph on wdqs1004
== 2019-11-11 ==
* 22:51 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3065.esams.wmnet
* 22:49 ema: power-cycle cp3065, currently down
* 19:36 XioNoX: disable ALGs on mr1-esams
* 18:20 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@222b1c2]: New WDQS build - 0.3.6-SNAPSHOT (duration: 00m 57s)
* 18:19 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@222b1c2]: New WDQS build - 0.3.6-SNAPSHOT
* 18:16 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@222b1c2]: New WDQS build - 0.3.6-SNAPSHOT (duration: 15m 14s)
* 18:01 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@222b1c2]: New WDQS build - 0.3.6-SNAPSHOT
* 17:53 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0)
* 17:44 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 17:41 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 15:44 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers
* 15:42 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0)
* 15:30 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker
* 14:26 ema: pool cp3050 with ATS backend [[phab:T227432|T227432]]
* 13:50 ema@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 13:48 ema@cumin1001: START - Cookbook sre.hosts.downtime
* 13:25 ema: depool cp3050 and reimage as text_ats [[phab:T227432|T227432]]
* 12:59 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 12:46 effie: Upgrade to 7.2.24-1 mwdebug[2001-2002].codfw.wmnet,mwmaint2001.codfw.wmnet,deploy2001.codfw.wmnet - [[phab:T237239|T237239]]
* 12:31 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@2cb2dde]: Deploy updates on wdqs1010 (duration: 00m 28s)
* 12:30 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@2cb2dde]: Deploy updates on wdqs1010
* 12:28 effie: Upgrade mw2* to 7.2.24-1 with elegance and restart php-fpm - [[phab:T237239|T237239]]
* 12:21 effie: Upgrade mw2* to 7.2.24-1 with elegance and restart php-fpm - [[phab:T231881|T231881]]
* 11:55 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 10:52 hoo: Updated the Wikidata property suggester with data from the 2019-11-04 JSON dump and applied the [[phab:T132839|T132839]] workarounds
* 10:48 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 10:47 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 10:45 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 10:32 vgutierrez: restarting ats-tls on cp1088
* 10:21 jynus: upgrade mariadb on db2102
* 10:16 ema: repool cp4027 after successful X-Wikimedia-Debug testing P9585 [[phab:T237687|T237687]]
* 10:12 jynus: manually run full backup of labtestpuppetmaster2001 [[phab:T235819|T235819]]
* 09:41 ema: test x-wikimedia-debug-routing.lua on cp4027 (depooled) [[phab:T237687|T237687]]
* 09:09 volker-e@deploy1001: Finished deploy [design/style-guide@0ea65f2]: Deploy design/style-guide:  (duration: 00m 07s)
* 09:09 volker-e@deploy1001: Started deploy [design/style-guide@0ea65f2]: Deploy design/style-guide:
* 08:28 marostegui: Stop MySQL on db2048 before decommissioning - [[phab:T237913|T237913]]
* 08:28 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove db2048 from config [[phab:T237913|T237913]] (duration: 00m 51s)
* 08:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db2048 from config [[phab:T237913|T237913]] (duration: 00m 54s)
* 08:21 marostegui: Remove db2048 from tendril and zarcillo [[phab:T237913|T237913]]
* 06:56 elukey: delete /etc/logrotate.d/wdqs-reload-categories from wdqs* as attempt to reduce cronspam
* 06:44 marostegui: Delete globalblocks table from napwikisource [[phab:T230055|T230055]]
* 05:27 vgutierrez: Switch from nginx to ats-tls on cp3058 - [[phab:T231627|T231627]]
== 2019-11-09 ==
* 20:25 reedy@deploy1001: Synchronized langlist-labs: [[phab:T237823|T237823]] (duration: 00m 54s)
* 02:39 volker-e@deploy1001: Finished deploy [design/style-guide@d2bfc09]: Deploy design/style-guide:  (duration: 00m 07s)
* 02:39 volker-e@deploy1001: Started deploy [design/style-guide@d2bfc09]: Deploy design/style-guide:
* 01:07 volker-e@deploy1001: Finished deploy [design/style-guide@ef82b69]: Deploy design/style-guide:  (duration: 00m 07s)
* 01:07 volker-e@deploy1001: Started deploy [design/style-guide@ef82b69]: Deploy design/style-guide:
* 01:06 volker-e@deploy1001: Finished deploy [design/style-guide@97fb3ee]: Deploy design/style-guide:  (duration: 00m 09s)
* 01:06 volker-e@deploy1001: Started deploy [design/style-guide@97fb3ee]: Deploy design/style-guide:
== 2019-11-08 ==
* 20:26 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Delay annotation request jobs by 5 mins for testing (duration: 00m 52s)
* 16:54 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:52 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 16:19 jeh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 16:15 jeh@cumin1001: START - Cookbook sre.hosts.downtime
* 16:15 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert "MachineVision: Enable testers-only mode on testcommonswiki for debugging" (duration: 00m 54s)
* 15:57 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118, db1106 at 100%', diff saved to https://phabricator.wikimedia.org/P9582 and previous config saved to /var/cache/conftool/dbconfig/20191108-155700-jynus.json
* 15:37 herron: beginning rolling service restarts on logstash hosts for java security updates
* 15:13 mholloway-shell@deploy1001: Synchronized wmf-config/InitialiseSettings.php: MachineVision: Enable testers-only mode on testcommonswiki for debugging (duration: 00m 52s)
* 14:56 volans@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
* 14:55 volans@cumin1001: START - Cookbook sre.hosts.decommission
* 14:50 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 50%', diff saved to https://phabricator.wikimedia.org/P9581 and previous config saved to /var/cache/conftool/dbconfig/20191108-145028-jynus.json
* 14:42 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
* 14:40 volans@cumin1001: START - Cookbook sre.hosts.downtime
* 13:40 jynus: stop and upgrade percona-server on test host db1114
* 13:27 elukey@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 13:12 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 20%', diff saved to https://phabricator.wikimedia.org/P9580 and previous config saved to /var/cache/conftool/dbconfig/20191108-131257-jynus.json
* 13:09 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ee2027c}}: Change the language of Votewiki back to English (en) ([[phab:T230614|T230614]]) (duration: 00m 54s)
* 12:34 elukey@cumin1001: START - Cookbook sre.cassandra.roll-restart
* 12:14 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 10%', diff saved to https://phabricator.wikimedia.org/P9578 and previous config saved to /var/cache/conftool/dbconfig/20191108-121444-jynus.json
* 12:02 jynus: update and restart db1118
* 12:01 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1118 fully', diff saved to https://phabricator.wikimedia.org/P9577 and previous config saved to /var/cache/conftool/dbconfig/20191108-120138-jynus.json
* 11:55 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 20%', diff saved to https://phabricator.wikimedia.org/P9576 and previous config saved to /var/cache/conftool/dbconfig/20191108-115553-jynus.json
* 11:27 jynus@cumin1001: dbctl commit (dc=all): 'Pool db1118 at 50%', diff saved to https://phabricator.wikimedia.org/P9575 and previous config saved to /var/cache/conftool/dbconfig/20191108-112733-jynus.json
* 11:25 jynus@cumin1001: dbctl commit (dc=all): 'repool db2130', diff saved to https://phabricator.wikimedia.org/P9574 and previous config saved to /var/cache/conftool/dbconfig/20191108-112503-jynus.json
* 11:12 jynus: update and restart db2130
* 11:11 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2116, depool db2130', diff saved to https://phabricator.wikimedia.org/P9573 and previous config saved to /var/cache/conftool/dbconfig/20191108-111125-jynus.json
* 10:58 Amir1: running rebuildItemTerms on 8028 items ([[phab:T234329|T234329]])
* 10:51 jynus: update and restart db2116
* 10:50 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2103, depool db2116', diff saved to https://phabricator.wikimedia.org/P9572 and previous config saved to /var/cache/conftool/dbconfig/20191108-105013-jynus.json
* 10:38 jynus: update and restart db2103
* 10:34 jeh: enable IPMI `racadm set iDRAC.IPMILan.Enable 1` on cloudcephmon[1-3] [[phab:T228102|T228102]]
* 10:33 jeh: enable IPMI `racadm set iDRAC.IPMILan.Enable 1` on cloudcephosd[1-3] [[phab:T224188|T224188]]
* 10:32 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2092, depool db2103', diff saved to https://phabricator.wikimedia.org/P9571 and previous config saved to /var/cache/conftool/dbconfig/20191108-103218-jynus.json
* 10:19 jynus: update and restart db2092
* 10:18 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2071, depool db2092', diff saved to https://phabricator.wikimedia.org/P9570 and previous config saved to /var/cache/conftool/dbconfig/20191108-101759-jynus.json
* 10:09 elukey: restart jvm-based hadoop daemons on an-master100[1,2] to pick up the new openjdk version
* 10:06 jynus: update and restart db2071
* 10:03 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2071', diff saved to https://phabricator.wikimedia.org/P9569 and previous config saved to /var/cache/conftool/dbconfig/20191108-100310-jynus.json
* 10:01 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2072', diff saved to https://phabricator.wikimedia.org/P9568 and previous config saved to /var/cache/conftool/dbconfig/20191108-100128-jynus.json
* 09:50 moritzm: uploaded openjdk 8u232-b09-1~deb10u1 to component/jdk8 for apt.wikimedia.org/buster-wikimedia
* 09:41 jynus: update and restart db2072
* 09:41 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2072', diff saved to https://phabricator.wikimedia.org/P9567 and previous config saved to /var/cache/conftool/dbconfig/20191108-094100-jynus.json
* 09:39 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1106 at 50%', diff saved to https://phabricator.wikimedia.org/P9566 and previous config saved to /var/cache/conftool/dbconfig/20191108-093958-jynus.json
* 09:35 elukey@cumin1001: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0)
* 09:29 jynus: update and restart db2094
* 09:27 jynus@cumin1001: dbctl commit (dc=all): 'Repool db1106 at 10%', diff saved to https://phabricator.wikimedia.org/P9565 and previous config saved to /var/cache/conftool/dbco