You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Server Admin Log

2018-03-17

  • 00:13 mutante: running puppet on all cache::misc to rename director bromine to webserver_misc_static (T188163)

2018-03-16

  • 23:32 mutante: signing puppet cert for vega.codfw.wmnet, initial puppet run after fresh stretch install (T188163)
  • 18:43 mutante: creating new ganeti VM vega.codfw.wmnet to be equivalent of bromine, 1G RAM, 30G disk, 1vCPU (T189899)
  • 18:13 jynus: switching back wikireplica cloud dns to the original config
  • 17:32 jynus: reimage dbproxy1010
  • 16:29 jynus: updating wikireplica_dns 2/3
  • 16:22 moritzm: installing curl security updates
  • 16:09 marostegui: Stop MySQL on db1020 - T189773
  • 14:48 andrewbogott: reset contintcloud quotas as per https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#incorrect_quota_violations
  • 14:48 jynus: reimage dbproxy1011
  • 14:27 andrewbogott: restarting nodepool on nodepool1001
  • 14:25 elukey: reboot druid1002 for kernel updates
  • 14:14 andrewbogott: restarting rabbitmq on labcontrol1001
  • 13:57 andrewbogott: stopping nodepool temporarily during changes to nova.conf
  • 13:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2050 (duration: 00m 58s)
  • 13:15 chasemp: disable puppet across cloud things for safe rollout
  • 12:52 moritzm: uploaded libsodium23/php-apcu/php-mailparse to thirdparty/php72 (deps/extensions needed by Phabricator)
  • 12:51 ema: text-esams: reboot for kernel upgrades T188092 and to mitigate https://grafana.wikimedia.org/dashboard/db/varnish-failed-fetches?panelId=7&fullscreen&orgId=1&from=1518746284946&to=1521204628041
  • 12:12 marostegui: Reboot dbproxy1005 for kernel upgrade
  • 12:02 marostegui: Run pt-table-checksum on m2
  • 12:00 marostegui: Run pt-table-checksum on m5
  • 11:11 hashar: zuul: reenqueue all coverage jobs lost when restarting Zuul
  • 10:53 hashar: Upgrading zuul to zuul_2.5.1-wmf4 to resolve a mutex deadlock T189859
  • 10:45 jynus: disable puppet and load balance between 3 wikireplicas on dbproxy1010
  • 10:19 jynus: upgrade and restart of dbproxy1009 (passive)
  • 10:01 elukey: restart eventlogging_sync on db1108 (eventlogging db slave) as a precaution after the change of m4-master.eqiad.wmnet's CNAME
  • 10:00 moritzm: reverting the HHVM/ICU 57 setup on mwdebug2001 which was used for the dry run tests
  • 09:57 elukey: restart eventlogging-consumer@mysql-eventbus on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:56 hashar: Zuul coverage pipeline is deadlocked on an unreleased mutex. Will need a new Zuul version.
  • 09:51 elukey: restart eventlogging-consumer@mysql-m4 on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1015 after kernel, mariadb and socket upgrade (duration: 00m 57s)
  • 09:27 oblivian@tin: Finished deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2 (duration: 00m 29s)
  • 09:26 oblivian@tin: Started deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2
  • 09:17 oblivian@tin: (no justification provided)
  • 09:17 oblivian@tin: Finished deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts (duration: 00m 47s)
  • 09:16 oblivian@tin: Started deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts
  • 09:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 00m 57s)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1015 after kernel, mariadb and socket upgrade (duration: 00m 56s)
  • 08:49 jynus: upgrade and restart of dbproxy1004 (passive)
  • 08:41 marostegui: Stop MySQL on es1015 for maintenance
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1015 for kernel, mariadb and socket upgrade (duration: 00m 58s)
  • 08:40 elukey: reboot druid1006 for kernel updates
  • 08:29 elukey: reboot druid1005 for kernel updates
  • 07:53 moritzm: reimage mc2036 after mainboard replacement (T185587)
  • 07:15 marostegui: Stop MySQL on es2017 (es3 codfw master) for maintenance
  • 07:06 marostegui: Stop MySQL on es2016 (es2 codfw master) for maintenance
  • 06:52 marostegui: Stop MySQL on db2048 (s1 codfw master) for maintenance
  • 06:41 marostegui: Stop MySQL on db2051 (s4 codfw master) for maintenance
  • 06:28 marostegui: Stop MySQL on db2045 (s8 codfw master) for maintenance
  • 06:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 (duration: 00m 58s)
  • 01:46 XioNoX: librenms IRC bot moved to -operations channel. Doc on how to turn it off is on https://wikitech.wikimedia.org/wiki/LibreNMS#IRC_Alerting
  • 01:00 reedy@tin: Synchronized php-1.31.0-wmf.25/includes/specials/pagers/NewFilesPager.php: Fix T189846 (duration: 00m 58s)

2018-03-15

  • 23:25 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: Fix display issues (duration: 00m 59s)
  • 23:20 ebernhardson@tin: Synchronized php-1.31.0-wmf.25/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off Cirrus AB test (duration: 00m 58s)
  • 22:58 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: add some missing globals (duration: 00m 58s)
  • 20:38 demon@tin: Synchronized robots.txt: minor tidying (duration: 00m 58s)
  • 20:05 chasemp: disable puppet for cloud things for a safe rollout
  • 19:50 XenoRyet: updated civicrm from 9e79d63426 to 3291ad35c9
  • 19:14 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.25
  • 18:51 niharika29@tin: Synchronized php-1.31.0-wmf.25/extensions/MobileApp/: https://gerrit.wikimedia.org/r/#/c/419785/; https://gerrit.wikimedia.org/r/#/c/419784/; https://gerrit.wikimedia.org/r/#/c/419776/ (duration: 01m 14s)
  • 18:25 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/417329/ (duration: 01m 15s)
  • 18:11 maxsem@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 16s)
  • 18:09 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 15s)
  • 17:27 ppchelko@tin: Finished deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources (duration: 01m 23s)
  • 17:26 bsitzmann@tin: Finished deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327) (duration: 05m 38s)
  • 17:25 ppchelko@tin: Started deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources
  • 17:20 bsitzmann@tin: Started deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327)
  • 17:18 moritzm: installing dbus updates from stretch 9.4 point release
  • 16:43 ppchelko@tin: Finished deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints (duration: 15m 22s)
  • 16:28 ppchelko@tin: Started deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints
  • 16:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2050 for data checks (duration: 01m 15s)
  • 15:58 volans: updated facts on both CI puppet-compilers
  • 15:56 moritzm: pruning obsolete packages from jessie-wikimedia/experimental
  • 15:56 marostegui: Stop MySQL on s5 codfw master (db2052) this will break replication on s5 codfw
  • 15:51 godog: repool puppetmaster1002
  • 15:47 moritzm: installing libvirt security updates
  • 15:20 elukey: reboot druid1003 for kernel updates
  • 15:13 marostegui: Stop MySQL on s6 codfw master (db2039) this will break replication on s6 codfw
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after socket path location update (duration: 01m 15s)
  • 15:05 _joe_: restarted jobrunner, jobchron on the eqiad jobrunners
  • 14:30 elukey: reboot druid1004 for kernel updates
  • 13:51 elukey: reboot kafka1001 (eventbus/job-queues eqiad) for kernel updates
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 14s)
  • 13:33 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout, again. Last time didn't pick the right partman config
  • 13:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 15s)
  • 13:09 moritzm: restarting HHVM on canaries to pick up curl security update
  • 13:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule, clean expired rules (T189442) (duration: 01m 15s)
  • 12:54 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 12:36 moritzm: installing curl security updates on jessie/stretch
  • 12:26 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout
  • 12:08 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1007 after kernel security update (duration: 01m 14s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after socket path location update (duration: 01m 14s)
  • 11:59 moritzm: rebooting rdb1007 for kernel security update
  • 11:56 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1007 for kernel security update (duration: 01m 14s)
  • 11:52 marostegui: Stop MySQL on es1013 for socket path upgrade
  • 11:51 moritzm: rebooted rdb1005 for kernel security update
  • 11:49 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1005 after kernel security update (duration: 01m 14s)
  • 11:48 godog: reimage puppetmaster1002 with stretch
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for socket path location update (duration: 01m 14s)
  • 11:42 godog: depool puppetmaster1002 for stretch reimage
  • 11:29 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1005 for kernel security update (duration: 01m 10s)
  • 11:16 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1003 after kernel security update (duration: 01m 14s)
  • 11:04 moritzm: rebooting rdb1003 for kernel security update
  • 11:01 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1003 for kernel security update (duration: 01m 14s)
  • 10:48 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1001 after kernel security update (duration: 01m 14s)
  • 10:32 moritzm: rebooting rdb1001 for kernel security update
  • 10:24 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1001 for kernel security update (duration: 01m 14s)
  • 10:22 ema: apt.w.o: upload varnish=5.1.3-1wm4 to jessie-wikimedia/main (upstream "extrachance" fixes) T174932
  • 10:12 gehel@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic1021.eqiad.wmnet
  • 09:56 ema: apt.w.o: move varnish=5.1.3-1wm3, varnish-modules=0.12.1-1+wmf1, libvmod-netmapper=1.6-1 from jessie-wikimedia/experimental to jessie-wikimedia/main T188545
  • 09:56 moritzm: installing curl security updates on Debian
  • 09:30 godog: repool puppetmaster2002
  • 09:16 jynus: reset slave all @db1051
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore normal weight for es1017 (duration: 01m 14s)
  • 08:44 godog: roll-restart thumbor in eqiad/codfw to enable access to swift private container
  • 08:42 jynus: end of maintenance for m2
  • 08:31 jynus: setting m2 as read only
  • 08:29 gilles: setZoneAccess done
  • 08:28 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 08:18 jynus: disable puppet on db1051, db1020 for switchover preparation
  • 08:06 ayounsi@tin: Finished deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1 (duration: 01m 02s)
  • 08:05 ayounsi@tin: Started deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1
  • 08:01 jynus: switching db2044 to be a direct replica of db1051
  • 07:49 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 01m 07s)
  • 07:48 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 05s)
  • 07:30 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 39s)
  • 07:29 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1017 (duration: 01m 14s)
  • 07:21 moritzm: reimaging mc2036 after hardware replacement T185587
  • 07:07 marostegui: Stop mariadb on es1017 for kernel, mariadb and socket location upgrade
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 01m 14s)
  • 07:01 marostegui: Deploy schema change on db1084 - T187089 T185128 T153182
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 01m 15s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 (duration: 01m 15s)
  • 06:29 marostegui: Stop MySQL on db1064 for mariadb upgrade
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 10m 10s)
  • 00:25 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/client/includes/RecentChanges/ExternalChangeFactory.php: T189320 Use only local part of username when building the RC line (duration: 01m 18s)
  • 00:22 tgr@tin: Synchronized php-1.31.0-wmf.24/includes/user/ExternalUserNames.php: T189320 Add ExternalUserNames::getLocal() to get local part of username (duration: 01m 15s)
  • 00:20 ejegg: updated payments-wiki from 9068692c32 to 30f5f3edfb
  • 00:08 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/: VE fixes followup (duration: 01m 15s)
  • 00:03 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 15s)
  • 00:02 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 16s)

2018-03-14

  • 23:45 XenoRyet: updated payments-wiki from 86715f6e9e to 9068692c32
  • 23:45 tgr@tin: Synchronized wmf-config/Wikibase.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 14s)
  • 23:43 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:41 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:21 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 20s)
  • 23:18 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 15s)
  • 22:13 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/Thanks: T189752 (duration: 01m 16s)
  • 21:27 hoo: Ran scap pull on mwdebug1001 after testing https://gerrit.wikimedia.org/r/417180
  • 21:26 andrewbogott: rebuilding labtestweb2001 with Debian Stretch
  • 20:34 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.25
  • 20:32 demon@tin: Synchronized php: symlink bump to wmf.25 (duration: 01m 14s)
  • 20:27 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c (duration: 05m 37s)
  • 20:24 demon@tin: Finished scap: trying a php5/hhvm theory (duration: 06m 37s)
  • 20:21 mholloway-shell@tin: Started deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c
  • 20:17 demon@tin: Started scap: trying a php5/hhvm theory
  • 20:16 demon@tin: Finished scap: scapping, pt. 2. prior one failed because i tested something (duration: 69m 43s)
  • 19:06 demon@tin: Started scap: scapping, pt. 2. prior one failed because i tested something
  • 19:06 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "demon"; reason is "rebuilding l10n" (duration: 00m 00s)
  • 18:20 jynus: running pt-table-checksum on all m2, some lag will happen on passive replicas
  • 18:16 jynus: running pt-table-checksum on all m1, some lag will happen on passive replicas
  • 17:56 demon@tin: Started scap: rebuilding l10n
  • 17:55 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/CentralNotice: updates! (duration: 01m 16s)
  • 17:54 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "reedy"; reason is "updates!" (duration: 00m 00s)
  • 17:54 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralNotice: updates! (duration: 01m 18s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/416489 (duration: 01m 14s)
  • 17:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/419077 (duration: 01m 15s)
  • 16:58 hoo: Manually running extensions/Wikibase/repo/maintenance/dispatchChanges.php on terbium, so that dispatching can catch up
  • 16:56 jynus: deploying new firewall rules to dbproxy1001 and 7
  • 16:40 moritzm: installing cron updates from stretch 9.4 point release
  • 16:35 demon@tin: Synchronized .gitignore: ignore scap logs (duration: 01m 15s)
  • 16:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1074 original weight (duration: 01m 13s)
  • 16:12 godog: temporarily add back puppetmaster2002 as a low-weight backend
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 15:47 andrew@tin: Synchronized multiversion/MWMultiVersion.php: wikitech cleanup (duration: 01m 14s)
  • 15:25 XioNoX: Re-enabling BGP on cr2-codfw Zayo transit - T189452
  • 15:12 XioNoX: Disabling BGP on cr2-codfw Zayo transit - T189452
  • 15:02 jynus: disabling puppet in preparation for reimage of dbproxy1002 and 6
  • 14:59 moritzm: installing virt-what updates from stretch point release
  • 14:58 paravoid: rebooting furud
  • 14:44 ottomata: beginning migration of eventlogging analytics from Kafka analytics to Kafka jumbo: T183297
  • 14:33 godog: depool puppetmaster2002 for reimage
  • 14:06 Reedy: created wbc_entity_usage on ruwikimedia T188456
  • 13:50 zeljkof: EU SWAT finished
  • 13:49 zfilipin@tin: Synchronized dblists/wikidataclient.dblist: SWAT: Revert "Add ruwikimedia to wikidataclient" (T188456) (duration: 01m 14s)
  • 13:42 zfilipin@tin: Synchronized docroot/noc/conf/: SWAT: Revert "Publish throttle-analyze at noc" (T187894) (duration: 01m 15s)
  • 13:21 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull (duration: 00m 33s)
  • 13:21 ppchelko@tin: Started deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull
  • 13:21 zfilipin@tin: Synchronized docroot/noc/conf/throttle-analyze.php.txt: SWAT: Publish throttle-analyze at noc (T187894) (duration: 01m 13s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination (duration: 00m 38s)
  • 13:20 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination
  • 13:12 zfilipin@tin: Synchronized dblists/commonsuploads.dblist: SWAT: Disable upload for non-admins on kowikiversity (T189021) (duration: 01m 14s)
  • 13:06 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Remove obsolete throttle rules, add one new (T189241) (duration: 01m 15s)
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 14s)
  • 12:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 12:22 kartik@tin: Finished deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c (duration: 03m 12s)
  • 12:19 kartik@tin: Started deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1074 (duration: 01m 14s)
  • 11:45 marostegui: Stop db1074 for kernel upgrade
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for data checks and kernel upgrade (duration: 01m 14s)
  • 11:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1018 after kernel and mariadb upgrade (duration: 01m 15s)
  • 11:02 moritzm: rebooting einsteinium / icinga.wikimedia.org for kernel security update
  • 10:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1018 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:37 marostegui: Stop mariadb on es1018 for kernel and mariadb upgrade + change socket location
  • 10:35 moritzm: rebooting hydrogen for kernel security update
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1018 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2006 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:22 jynus: dropping testotrs from m2
  • 10:16 jynus: archiving and dropping bugzilla_testing from m2
  • 10:10 marostegui: Stop mariadb on pc2006 for kernel and mariadb upgrade + change socket location
  • 10:09 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2006 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:07 jynus: archiving and dropping testblog from m2
  • 10:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2005 after kernel and mariadb upgrade (duration: 01m 15s)
  • 09:50 marostegui: Stop mariadb on pc2005 for kernel and mariadb upgrade + change socket location
  • 09:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2005 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:44 moritzm: installing samba security update (just the client side libraries)
  • 09:40 marostegui: Stop mysql on es2015 to upgrade socket path
  • 09:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2004 after kernel and mariadb upgrade (duration: 01m 14s)
  • 09:34 marostegui: Stop mysql on es2014 to upgrade socket path
  • 09:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2004 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:23 marostegui: Stop mariadb on pc2004 for kernel upgrade
  • 09:13 marostegui: Stop mysql on es2013 to upgrade socket path
  • 09:08 marostegui: Stop mysql on es2012 to upgrade socket path
  • 08:57 ema: cp3041: restart varnish-be
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after kernel and mariadb upgrade (duration: 01m 15s)
  • 08:28 ema: cp3040: restart varnish-be
  • 08:21 hashar: Restarting the CI Jenkins
  • 07:45 marostegui: Reboot es2004 for kernel upgrade
  • 07:45 marostegui: Reboot es2003 for kernel upgrade
  • 07:34 marostegui: Reboot es2002 for kernel upgrade
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after kernel and mariadb upgrade (duration: 01m 14s)
  • 07:03 marostegui: Stop mariadb on es1013 for mariadb and kernel upgrade
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for kernel and mariadb upgrade (duration: 01m 14s)
  • 06:45 marostegui: Deploy schema change on db1064 with replication (this will generate lag on s4 on labs hosts) - T187089 T185128 T153182
  • 06:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 for alter table (duration: 01m 14s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 after alter table (duration: 01m 15s)
  • 03:13 mutante: bacula is working again - restored missing file set (https://gerrit.wikimedia.org/r/419341 )
  • 02:49 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 40s)
  • 02:44 Jamesofur: deleted 46 archived files
  • 02:18 mutante: helium - running bacula-dir with -f in foreground revealed: ERROR TERMINATION at parse_conf.c:485 - Config error: Could not find config Resource mysql-srv-backups - line 7, col 33 of file /etc/bacula/jobs.d/bohrium.eqiad.wmnet-mysql-predump-piwik-Weekly-Wed-production.conf
  • 02:17 mutante: helium - bacula director process failed (Bacula interrupted by signal 11: Segmentation violation), icinga alerted. attempted to restart it. then: bacula-dir - the configtest failed!
  • 00:01 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: crwiki logo (duration: 01m 15s)
  • 00:00 reedy@tin: Synchronized static/images/project-logos/crwiki.png: (no justification provided) (duration: 01m 14s)

2018-03-13

  • 23:46 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/MobileFrontend/: T188825 (duration: 01m 18s)
  • 23:43 mutante: tin: chmod -R g+w /srv/mediawiki-staging/.git/objects/* ; chmod -R g+w /srv/mediawiki-staging/php-1.31.0-wmf.24/.git/objects/*
  • 23:35 Reedy: that was Enable VirtualPageViews on Hungarian Wikipedia T184793
  • 23:35 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 15s)
  • 23:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: moar logos (duration: 01m 15s)
  • 23:24 reedy@tin: Synchronized static/images/project-logos/: YOU GET A LOGO, YOU GET A LOGO. YOU ALL GET LOGOS (duration: 01m 16s)
  • 23:11 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHTML on 96 wikis T188010 (duration: 01m 16s)
  • 23:10 mutante: restbase-dev1006 - reinstalling, manually skipping " Volume group name already in use" (T185494)
  • 22:52 eileen: civicrm revision changed from c8458c4a2f to 9e79d63426, config revision is 08b7e6216e (Benevity comma fix)
  • 20:40 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.25
  • 20:09 demon@tin: Finished scap: bootstrap wmf.25 (duration: 67m 17s)
  • 19:02 demon@tin: Started scap: bootstrap wmf.25
  • 18:47 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:46 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:42 gehel: repool wdqs1004 & wdqs2001 now that data reload is completed T189548
  • 18:39 XenoRyet: updated civicrm from 8652db05f5 to c8458c4a2f
  • 18:37 moritzm: installing reportbug updates from stretch point release
  • 18:32 moritzm: installing w3m updates from stretch point release
  • 17:55 moritzm: installing ncurses updates from stretch point release
  • 17:53 moritzm: installing ncurses updates from stretch point release
  • 17:19 awight@tin: Started scap: Beta: Fix ORES thresholds and enable JADE, T181159, T176333
  • 17:06 godog: cleanup integration-slave-jessie-1001:/srv/pbuilder/build - T189587
  • 16:45 marostegui: Clean iptables rules on dbproxy1001 to leave it as dbproxy1006
  • 16:33 marostegui: Retroactive: cleared iptables rules on dbproxy1007
  • 16:32 jynus: restarting gerrit on cobalt, stalled
  • 16:26 jynus: restarting gerrit on cobalt, stalled
  • 16:18 jynus: update CNAME for m1-master and m2-master
  • 15:50 marostegui: Deploy schema change on db1097:3314 - T187089 T185128 T153182
  • 15:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 for alter table (duration: 00m 56s)
  • 15:39 jynus: upgrade and restart dbproxy1007
  • 15:33 vgutierrez: upgrading eqiad LVSs to pybal 1.15.2
  • 15:32 jynus: upgrade and restart dbproxy1001
  • 14:55 vgutierrez: upgrading codfw LVSs to pybal 1.15.2
  • 14:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 after alter table (duration: 00m 57s)
  • 14:51 jynus: stopping db2044 (this will make proxies complain about redundancy)
  • 14:42 moritzm: rebooting chromium for kernel security update
  • 14:11 chasemp: add chico to wmf-nda (verified nda things with moritz and all the goodness)
  • 13:29 jynus: stop db1001 for maintenance (proxies will temporarily complain about lack of redundancy)
  • 13:20 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: wmf-config: enable Singapore oversample as default on all wikis (T188652) (duration: 00m 57s)
  • 12:32 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 12:26 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 12:04 reedy@tin: Synchronized wmf-config/interwiki.php: T188537 (duration: 00m 57s)
  • 11:59 moritzm: rebooting DNS recursors in codfw for kernel security update
  • 11:43 _joe_: include our own etcd package (3.2.16) on stretch
  • 11:37 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 11:33 kartik@tin: Finished deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc (duration: 03m 30s)
  • 11:30 kartik@tin: Started deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc
  • 11:23 jynus: ran update-netboot-stretch.sh
  • 11:21 moritzm: rebooting DNS recursors in esams for kernel security update
  • 10:22 moritzm: rebooting DNS recursors in ulsfo and eqsin for kernel security update
  • 10:17 vgutierrez: upgrading esams LVSs to pybal 1.15.2
  • 10:08 jynus: stopping mysql on db1063 and db1051 to validate the depool before full reimage
  • 10:07 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1001 after kernel security update (duration: 00m 57s)
  • 10:00 gehel: shutting down blazegraph on wdqs2001 for data transfer to wdqs1004 - T189548
  • 09:48 vgutierrez: upgrading ulsfo LVSs to pybal 1.15.2
  • 09:37 moritzm: rebooting poolcounter1001 for kernel security update
  • 09:15 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling poolcounter1001 for kernel security update (duration: 00m 56s)
  • 09:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 09:02 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 08:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 (duration: 00m 57s)
  • 06:58 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:56 marostegui: Deploy schema change on db1081 - T187089 T185128 T153182
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 for alter table (duration: 00m 56s)
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 after alter table (duration: 01m 19s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 30s)

2018-03-12

  • 22:52 eileen: update civicrm revision changed from a819d64d98 to 8652db05f5, config revision is 08b7e6216e - update civicrm.settings.php
  • 20:44 arlolra: Updated Parsoid to 16ced34 (T188670, T90902)
  • 20:37 arlolra@tin: Finished deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34 (duration: 10m 16s)
  • 20:36 andrewbogott: updated wikitech-static as detailed in https://wikitech.wikimedia.org/wiki/Wikitech-static#Manual_updates
  • 20:27 arlolra@tin: Started deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34
  • 20:26 andrewbogott: apt-get upgrade and reboot on wikitech-static
  • 20:25 andrewbogott: stopping apache2 on Silver in anticipation of it being decommissioned
  • 20:16 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7 (duration: 05m 29s)
  • 20:11 mholloway-shell@tin: Started deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7
  • 19:53 MaxSem: disabled 2FA for User:Ctac (T189520)
  • 19:48 chasemp: labstore1003:~# service nfs-kernel-server restart
  • 19:44 chasemp: labstore1003:~# exportfs -ra
  • 18:53 Krinkle: Clean up left-over .wsp.bak files under frontend.navtiming* on graphite1001 (following T179622)
  • 18:44 mutante: added to DNS: romd.wikimedia.org (and romd.m) for Wikimedians of Romania and Moldova User Group
  • 18:43 mutante: added to DNS: hi.wikimedia.org (and hi.m) for Hindi Wikimedian User Group
  • 18:25 ppchelko@tin: Finished deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries (duration: 15m 25s)
  • 18:09 ppchelko@tin: Started deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries
  • 17:48 ottomata: removed kafka.protocol.version setting for varnishkafka webrequest instances; version should now be properly negotiated
  • 17:29 gehel@tin: Finished deploy [wdqs/wdqs@ce72538]: new wdqs updater (duration: 04m 47s)
  • 17:27 _joe_: poweroff mw2097-2134, T189111
  • 17:24 gehel@tin: Started deploy [wdqs/wdqs@ce72538]: new wdqs updater
  • 16:34 joal@tin: Finished deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug (duration: 08m 50s)
  • 16:25 joal@tin: Started deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug
  • 15:56 mepps: updated payments-wiki from ce68e8e80b to 86715f6e9e
  • 15:51 gehel: restart blazegraph on wdqs2001 to validate new config - T175919
  • 15:43 vgutierrez: eqsin LVSs: upgrade pybal to 1.15.2
  • 15:39 ottomata: bouncing kafka main-eqiad -> jumbo-eqiad mirror maker instances
  • 15:37 ottomata: disabling puppet on kafka1020,1022,1023 to test partition.assignment.strategy change for mirror maker
  • 15:28 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 54s)
  • 15:26 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 [keeping static files] (duration: 01m 19s)
  • 15:24 vgutierrez: lvs1007,lvs1010 upgraded pybal to 1.15.2
  • 15:17 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 [keeping static files] (duration: 01m 22s)
  • 15:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 (duration: 02m 35s)
  • 15:12 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120 (duration: 00m 31s)
  • 15:11 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120
  • 15:08 joal: Provide correct log message for analytics/refinery scap deploy: Regular deploy of analytics-hadoop code
  • 15:07 joal@tin: Finished deploy [analytics/refinery@fd0a90f]: Regular a (duration: 04m 54s)
  • 15:07 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 (duration: 03m 58s)
  • 15:02 joal@tin: Started deploy [analytics/refinery@fd0a90f]: Regular a
  • 14:42 jynus: upgrade and restart es2001
  • 14:09 sbisson@tin: Finished deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test* (duration: 00m 34s)
  • 14:09 sbisson@tin: Started deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test*
  • 14:02 zeljkof: EU SWAT finished
  • 13:59 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 08s)
  • 13:24 moritzm: synchronised PHP 7.2.3 to thirdparty/php72 for stretch-wikimedia
  • 13:17 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 09s)
  • 12:44 godog: start a catalog compilation on elnath to check for puppetdb4 diffs - T177253
  • 11:26 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1002 after kernel security update (duration: 03m 09s)
  • 11:14 moritzm: reboot poolcounter1002 for kernel security update
  • 11:10 jmm@tin: Synchronized wmf-config/ProductionServices.php: depooling poolcounter1002 for kernel security update (duration: 03m 09s)
  • 10:39 _joe_: running decommission_appserver on mw2097-2134 T189111
  • 10:23 XioNoX: labs->cloud vlan rename in eqiad - T187933
  • 09:56 elukey: restart kafka mirror maker (main eqiad -> jumbo) on kafka1020 (all consumers not assigned to any partition on kafka102*)
  • 09:53 moritzm: installing util-linux security updates
  • 09:31 _joe_: decommission mw2097-mw2134 from conftool T189111
  • 08:40 moritzm: rebooting iron for kernel security update
  • 08:32 ema: cp3033/cp3031: restart varnish-be
  • 08:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2015 after kernel upgrade (duration: 00m 58s)
  • 08:20 ema: cp3033/cp3031: set transaction_timeout to 60s
  • 08:14 marostegui: Stop MySQL on es2015 for kernel upgrade
  • 08:06 ema: cp3042: restart varnish-be
  • 08:03 ema: cp3042: set transaction_timeout to 30s
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 for kernel upgrade (duration: 00m 58s)
  • 07:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2014 after kernel upgrade (duration: 01m 01s)
  • 07:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 59s)
  • 07:26 marostegui: Stop MySQL on es2014 for kernel upgrade
  • 07:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 58s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3316 as vslow,dump in s6 - T184161 (duration: 00m 58s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3315 as vslow,dump in s5 - T184161 (duration: 00m 58s)
  • 06:27 marostegui: Deploy schema change on db1103:3314 - T187089 T185128 T153182
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 for alter table (duration: 01m 06s)
  • 02:52 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 11m 56s)

2018-03-11

  • 08:50 elukey: executed sudo rm /etc/logrotate.d/kafkatee-webrequest-analytics on oxygen/rhenium to stop daily cronspam

2018-03-10

  • 14:56 ema: cp1053: restart varnish-be
  • 13:29 ema: cp1068/cp1055: restart varnish-be

2018-03-09

  • 23:29 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/ReadingLists/src/Api/ApiQueryReadingListEntries.php: T189272 fix stupid ReadingLists typo breaking production (duration: 00m 54s)
  • 19:43 foks: changed global email for User:Mathmensch
  • 19:19 MaxSem: restarted my script on tin, now with more aggressive writes
  • 18:26 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/AbuseFilter/includes/AbuseFilter.class.php: Unbreak AbuseFilter tagging T189299 (duration: 00m 59s)
  • 17:35 andrew@tin: Finished deploy [horizon/deploy@9c234d6]: Another try at fixing T188458 (duration: 03m 00s)
  • 17:32 andrew@tin: Started deploy [horizon/deploy@9c234d6]: Another try at fixing T188458
  • 16:14 andrewbogott: test log
  • 16:07 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3034.esams.wmnet
  • 15:59 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303 (duration: 00m 38s)
  • 15:59 andrewbogott: moving wikitech dns record to point to misc-web and the new labweb cluster, https://gerrit.wikimedia.org/r/#/c/417926/
  • 15:59 ppchelko@tin: Started deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303
  • 15:54 andrew@tin: Finished deploy [horizon/deploy@f59f568]: rolling out a fix for T188458 (duration: 03m 11s)
  • 15:51 andrew@tin: Started deploy [horizon/deploy@f59f568]: rolling out a fix for T188458
  • 15:30 moritzm: installing zsh security update on trusty
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 after cloning db1113:3316 - T184161 (duration: 00m 58s)
  • 15:15 moritzm: installing sensible-utils security update on trusty (Debian already fixed)
  • 15:11 ema: cp-upload_esams: reboot for retpoline kernel updates T188092
  • 13:12 marostegui: Compress s6 on db1113:3316 - T184161
  • 12:41 elukey: manually executed systemctl reset-failed to some old (not present anymore) units on kafka analytics hosts
  • 12:26 marostegui: Compress s5 on db1113:3315 - T184161
  • 12:16 marostegui: Stop mysql on db1063 to clone db1113:3316 - T184161
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 to clone db1113:3316 - T184161 (duration: 00m 58s)
  • 12:11 jynus: dropping test databases on dbstore2* instances
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after cloning db1113:3315- T184161 (duration: 00m 58s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db1051 after cloning db1113:3315- T184161 (duration: 00m 58s)
  • 11:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:15 marostegui: Stop MySQL on db1051 to clone db1113 - https://phabricator.wikimedia.org/T184161
  • 11:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 to clone db1113 - T184161 (duration: 00m 58s)
  • 09:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with normal load (duration: 00m 58s)
  • 09:22 ema: cp-misc_esams: reboot for retpoline kernel updates T188092
  • 08:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2058 and db2084 (duration: 00m 58s)
  • 08:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with low load (duration: 00m 58s)
  • 07:35 marostegui: Stop mariadb on db2058 and db2084 for mariadb+kernel upgrade
  • 07:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2058 and db2084 (duration: 00m 58s)
  • 07:33 marostegui: Logging for the record: es2013 was stopped and rebooted for mariadb and kernel upgrade
  • 07:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 00m 58s)
  • 07:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2012, depool es2013 (duration: 00m 58s)
  • 06:52 marostegui: Stop MariaDB on es2012 to upgrade mariadb and kernel
  • 06:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2012 for kernel and mariadb upgrade (duration: 00m 58s)
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1019 normal weight (duration: 00m 59s)
  • 05:00 andrew@tin: Finished deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278 (duration: 02m 59s)
  • 04:57 andrew@tin: Started deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278
  • 00:40 thcipriani@tin: Synchronized static/images/project-logos: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART IV (duration: 00m 58s)
  • 00:38 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART III (duration: 00m 58s)
  • 00:36 thcipriani@tin: Synchronized static/images/project-logos/urwiki-2x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART II (duration: 00m 58s)
  • 00:33 thcipriani@tin: Synchronized static/images/project-logos/urwiki-1.5x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART I (duration: 00m 59s)
  • 00:03 urandom: set compression chunk length to 32, parsoid tables (group "enwiki") - T189057

2018-03-08

  • 23:10 urandom: set compression chunk length to 32, parsoid tables (group "wikipedia") - T189057
  • 22:31 urandom: set compression chunk length to 32, parsoid tables (group "commons") - T189057
  • 22:16 reedy@tin: Synchronized php-1.31.0-wmf.24/includes/specials/pagers/BlockListPager.php: T189251 (duration: 00m 59s)
  • 22:07 MaxSem: guess what? trying T187516 again
  • 21:41 urandom: set compression chunk length to 32, parsoid tables (group "others") - T189057
  • 21:15 otto@tin: Synchronized wmf-config/ProductionServices.php: Revert: point monolog avro producer back at Kafka analytics. Too many TCP connections? T188136 (duration: 00m 58s)
  • 21:09 sbisson@tin: Finished deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3) (duration: 04m 42s)
  • 21:04 sbisson@tin: Started deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3)
  • 20:40 urandom: set compression chunk length to 32, mobile tables - T189057
  • 20:34 urandom: set compression chunk length to 32, page_summary tables - T189057
  • 20:30 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to php-1.31.0-wmf.24
  • 20:26 thcipriani@tin: Synchronized php: Ensure symlink for 1.31.0-wmf.24 is up-to-date (duration: 01m 15s)
  • 19:52 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/Echo/: https://gerrit.wikimedia.org/r/#/c/417330/ and https://gerrit.wikimedia.org/r/#/c/417340/ (duration: 01m 21s)
  • 19:33 anomie: Running `cleanupUsersWithNoId.php --table recentchanges --prefix wikidata --force` on wikidata client wikis for T181731. This shouldn't create any local SUL accounts.
  • 19:29 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/: Hooks: Don't register beta features if they're enabled for all https://gerrit.wikimedia.org/r/#/c/417277/ (duration: 01m 14s)
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (duration: 02m 40s)
  • 19:23 niharika29@tin: Synchronized wmf-config/CommonSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 15s)
  • 19:22 sbisson@tin: Started deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test*
  • 19:21 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 16s)
  • 18:43 bsitzmann@tin: Finished deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167 (duration: 06m 14s)
  • 18:37 bsitzmann@tin: Started deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167
  • 17:19 andrew@tin: Synchronized wmf-config/wikitech.php: wikitech varnish updates (duration: 01m 15s)
  • 17:05 jynus: stop and reboot db1114 for kernel regression
  • 16:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool es1019 with less weight after HW maintenance (duration: 01m 15s)
  • 16:32 bd808: Running wikireplica_dns from labcontrol1001
  • 16:14 cmjohnson: wdqs1004 down for systemboard replacement
  • 15:56 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:54 andrewbogott: restarting nodepool again
  • 15:42 andrewbogott: stopping nodepool again because something isn't quite right
  • 15:41 marostegui: Power off es1019 - T187530
  • 15:32 otto@tin: Synchronized wmf-config/ProductionServices.php: Point Mediawiki Monolog at new Kafka jumbo-eqiad cluster: T188136 (duration: 01m 16s)
  • 15:29 ottomata: merging and then deploying mediawiki-config to point monolog avro kafka producer at new kafka jumbo cluster: https://phabricator.wikimedia.org/T188136
  • 15:29 andrewbogott: disabling puppet on labnodepool1001
  • 15:17 andrewbogott: silencing nova and other openstack alerts in anticipation of service interruptions for https://phabricator.wikimedia.org/T189005
  • 15:01 marostegui: Disable puppet on db1073 - T189005
  • 15:00 marostegui: Change topology in m5, db2037 to become a slave of db1073 - T189005
  • 14:56 oblivian@tin: Synchronized wmf-config/CommonSettings.php: Use EtcdConfig everywhere (duration: 01m 15s)
  • 14:38 zeljkof: EU SWAT finished
  • 14:38 marostegui: Stop mysql on es1019 - T187530
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: SWAT: Blacklist Web of Trust junk from being added to pages (T189148) (duration: 01m 15s)
  • 14:35 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTarget.js: SWAT: Follow-up I5357a909: Fix logic for autosave from edited state (T189071) (duration: 01m 16s)
  • 14:28 mobrovac@tin: Finished deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052 (duration: 00m 33s)
  • 14:27 mobrovac@tin: Started deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052
  • 14:26 vgutierrez: uploaded pybal_1.15.2_all.deb to apt.wikimedia.org jessie-wikimedia
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: 2017 wikitext editor: Enable by default on officewiki (T188028) (duration: 01m 16s)
  • 14:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create the rollbacker group at ar.wikinews (T189206) (duration: 01m 16s)
  • 13:56 gehel: restart wdqs-updater on wdqs1005 to validate new config option - T188716
  • 13:52 sbisson@tin: Finished deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers (duration: 08m 31s)
  • 13:44 moritzm: depooling mwdebug2001, the host will temporarily be using an HHVM build linked against libicu57 to perform some tests
  • 13:43 sbisson@tin: Started deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers
  • 13:40 elukey: eventlogging analytics migrated from eventlog1001 to eventlog1002
  • 13:35 ariel@tin: Finished deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly (duration: 00m 03s)
  • 13:35 ariel@tin: Started deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly
  • 13:29 ema: cp-ulsfo: reboot for retpoline kernel updates T188092
  • 12:50 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:47 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 fully (duration: 01m 16s)
  • 11:32 moritzm: installing isc-dhcp security updates
  • 10:43 moritzm: installing libvpx security updates
  • 10:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Change db1114 load (duration: 01m 16s)
  • 10:14 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on T181121
  • 10:13 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on
  • 09:57 dcausse: restarting mjolnir-kafka-daemon.service on relforge1002 to switch to kafka jumbo
  • 09:56 dcausse: restarting mjolnir-kafka-daemon.service on relforge1001 to switch to kafka jumbo
  • 09:56 _joe_: decommissioning mw2017-2099 T187467
  • 09:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 partially (duration: 01m 16s)
  • 09:44 moritzm: rearming keyholder on neodymium after reboot
  • 09:40 moritzm: rebooting neodymium for kernel security update
  • 09:22 ema: cp-eqsin: reboot for retpoline kernel updates T188092
  • 09:12 ema: cp3043: varnish-be-restart T189085
  • 09:08 moritzm: rebooting bast1001 for kernel security update
  • 08:58 elukey: restart varnish backend on cp3041 (failed fetches)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046, db2053 and db2060 after kernel upgrade (duration: 01m 15s)
  • 08:58 moritzm: reset RAC on bast1001, serial console was stuck
  • 08:50 elukey: rebooting analytics1003 (Hadoop Hive, Oozie, etc..) for kernel updates
  • 08:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2046, db2053 and db2060 for kernel upgrade (duration: 01m 17s)
  • 08:31 elukey: reboot analytics1002 (Hadoop master standby) for kernel upgrades
  • 08:28 marostegui: Stop MySQL on db2046, db2053 and db2060 for kernel upgrade
  • 08:19 elukey: reboot analytics1001 (Hadoop master) for kernel upgrade (temp failover to analytics1002)
  • 08:09 ema: cp3040: varnish-be-restart T189085
  • 08:00 ema: cp3032: varnish-be-restart T189085
  • 07:44 elukey: reboot kafka2003 (eventbus codfw) for kernel updates
  • 07:24 elukey: reboot kafka2002 (eventbus codfw) for kernel updates
  • 07:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 for maintenance - T187530 (duration: 01m 16s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Revert: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 31s)
  • 04:27 Krinkle: Running whisper-mass-resize for ResourceLoader.* metrics on graphite1001 and graphite2001 (T179622)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 07m 37s)
  • 02:15 tgr@tin: Synchronized wmf-config/throttle.php: T189161 Temporarily remove account creation limit for event on Portuguese Wikipedia on March 08, 2018 (duration: 01m 10s)
  • 01:17 twentyafterfour: phabricator update completed
  • 01:13 twentyafterfour: preparing for phabricator update 2018-03-07/1
  • 00:37 thcipriani@tin: Synchronized wmf-config/db-eqiad.php: SWAT: wikitech: use FQDNs for m5 cluster members (duration: 01m 16s)
  • 00:28 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration for CirrusSearch to instantly index new Wikidata items T183053 (duration: 01m 15s)
  • 00:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable loginOnly mode for local auth provider on group 2 T57420 (duration: 01m 16s)

2018-03-07

  • 23:36 MaxSem: aborted due to growing DB lag
  • 23:08 MaxSem: running script for T187516
  • 23:00 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/AntiSpoof/: https://gerrit.wikimedia.org/r/#/c/417013/ (duration: 01m 16s)
  • 22:52 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/417014/ (duration: 01m 20s)
  • 22:44 MaxSem: dumping centralauth.spoofuser from db1079
  • 22:27 ejegg: deployed patch for T171987 to 1.31.0-wmf.23
  • 22:23 ejegg: deployed patch for T171987 to 1.31.0-wmf.24
  • 21:51 herron: puppetdb server reboots complete — re-enabling puppet agents
  • 21:45 herron: temporarily disabling puppet agents while puppetdb servers nitrogen and nihal are rebooted for kernel updates
  • 21:24 thcipriani@tin: Synchronized wmf-config: Improve load-order documentation for CommonSettings and InitialiseSettings noop doc change (duration: 01m 18s)
  • 21:05 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: Switch wikitech to swift (duration: 01m 15s)
  • 20:58 andrew@tin: Synchronized wmf-config/filebackend.php: Preparing wikitech to use swift for images, step two (duration: 01m 12s)
  • 20:56 andrew@tin: Synchronized wmf-config/CommonSettings.php: Preparing wikitech to use swift for images, step one (duration: 01m 16s)
  • 20:45 andrew@tin: Synchronized multiversion/MWMultiVersion.php: (no justification provided) (duration: 01m 16s)
  • 20:27 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to php-1.31.0-wmf.24
  • 19:43 Amir1: ladsgroup@terbium:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T183019)
  • 19:35 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https on fawiki and hewiki (T183019)
  • 19:18 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=mediawikiwiki --force-protocol https (T183019)
  • 18:56 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: retry (duration: 01m 15s)
  • 18:42 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 16s)
  • 18:40 tgr@tin: Synchronized static/images/project-logos: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 17s)
  • 18:30 tgr@tin: Synchronized debug.json: T187468 Switch to mwdebug hosts in codfw too (duration: 01m 15s)
  • 18:26 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T57420 Enable loginOnly mode for local auth provider on group 1 (duration: 01m 20s)
  • 17:41 moritzm: rebooting restbase-test* for kernel security update
  • 16:55 ema: cp5001: reboot for retpoline kernel updates T188092
  • 16:46 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052 (duration: 00m 33s)
  • 16:46 ppchelko@tin: Started deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052
  • 16:08 elukey: updating pcc facts for new hosts
  • 15:54 moritzm: rebooting rdb* fallback hosts in eqiad for kernel security update
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 16s)
  • 15:26 marostegui: Set disk 32:2 on db1064 as offline
  • 15:20 moritzm: rebooting krypton (running grafana among others) for kernel security update
  • 15:17 reedy@tin: Synchronized wmf-config/throttle.php: T189121 (duration: 01m 15s)
  • 14:45 Amir1: EU SWAT is done
  • 14:42 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052 (duration: 00m 36s)
  • 14:41 ppchelko@tin: Started deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052
  • 14:37 moritzm: rebooting rdb* hosts in codfw for kernel security update
  • 14:37 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 16s)
  • 14:35 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 15s)
  • 14:27 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:19 _joe_: adding mwdebug200{1,2} to ganeti in codfw, T187468
  • 14:17 urandom: reducing compression chunk length to 32kb on "wikipedia_T_page__summary".data - T189057
  • 14:10 zfilipin@tin: Synchronized wmf-config/: SWAT: Load Wikibase Quality extensions using extension registration (T106104) (duration: 01m 17s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T188626) (duration: 01m 18s)
  • 14:01 urandom: setting trace probability to 0.0, restbase eqiad cassandra cluster - T189057
  • 13:22 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all refreshLinks jobs to EventBus, file #2 - T185052 (duration: 01m 15s)
  • 13:22 moritzm: rebooting tungsten for kernel security update
  • 13:21 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all refreshLinks jobs to EventBus - T185052 (duration: 01m 15s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052 (duration: 00m 43s)
  • 13:20 moritzm: rebooting install2002 for kernel security update
  • 13:19 ppchelko@tin: Started deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052
  • 10:55 marostegui: Deploy schema change on codfw s4 master (db2051) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 10:54 moritzm: rearmed keyholders on netmon1002 and netmon2001
  • 10:50 elukey: reboot stat100[56] for kernel upgrades
  • 10:49 moritzm: reboot memcached hosts in codfw for kernel security update
  • 10:34 moritzm: rebooting netmon2001 for kernel security update
  • 10:29 moritzm: rebooting netmon1002 for kernel security update
  • 10:26 moritzm: rebooting boron for kernel security update
  • 10:11 moritzm: rebooting openldap/WMCS servers for kernel security update
  • 10:05 moritzm: rebooting openldap/corp servers for kernel security update
  • 10:03 elukey: reboot analytics10[35,52] for kernel updates - hadoop hdfs journal nodes (didn't manage to complete the work yesterday)
  • 10:03 moritzm: rebooting pool counters in codfw for kernel security update
  • 10:02 akosiaris: upload apertium-rus-ukr_0.2.0~r82706-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:56 moritzm: rebooting tureis/roentgenium for kernel security update
  • 09:53 akosiaris: upload apertium-rus_0.2.0~r82706-1+wmf1 and apertium-ukr_0.1.0~r82563-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:46 moritzm: rebooting etherpad1001 (etherpad.wikimedia.org) for kernel security update
  • 09:31 moritzm: rebooting darmstadtium (docker registry) for kernel security update
  • 09:24 moritzm: rearming keyholder on sarin after reboot
  • 09:16 moritzm: rebooting sarin for kernel security update
  • 08:57 ema: cp3033: restart varnish-be, backend connections piling up (~12k)
  • 08:40 marostegui: Deploy schema change on s7 primary master db1062 - T153182 T185128
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 after alter table (duration: 01m 16s)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2089,db2079 and db2065 after mariadb and kernel upgrade (duration: 01m 16s)
  • 07:30 marostegui: Stop mariadb on db2089,db2079 and db2065 for kernel upgrade
  • 07:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2089,db2079 and db2065 (duration: 01m 15s)
  • 06:49 marostegui: Deploy schema change on db1079 with replication enabled (this will generate lag on labs) - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 for alter table (duration: 01m 16s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 06m 03s)
  • 00:57 Amir1: Evening SWAT is done
  • 00:32 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Re-enable Wikidata descriptions (T188182) (duration: 01m 16s)

2018-03-06

  • 23:10 MaxSem: cancelled
  • 23:05 MaxSem: refreshing spoofuser
  • 23:00 MaxSem: dumping centralauth.spoofuser from db1094
  • 21:22 mutante: restbase-dev1006 powercycled via console (T185494)
  • 20:49 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.24
  • 20:44 ottomata: reverted change to point mediawiki monolog kafka producers at kafka jumbo-eqiad until deployment train is done T188136
  • 20:36 mutante: phab1001 (phabricator) - rebooting for maintenance
  • 20:35 ottomata: pointing mediawiki monolog kafka producers at kafka jumbo-eqiad cluster: T188136
  • 20:08 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache (duration: 29m 13s)
  • 19:39 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache
  • 18:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af (duration: 05m 28s)
  • 18:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af
  • 18:22 godog: puppet-merge Revert: Use hiera3 role/nuyaml backends on >= stretch
  • 17:58 marostegui: Reload haproxy on dbproxy1004 and dbproxy1009
  • 17:53 thcipriani: starting branch cut for 1.31.0-wmf.24
  • 17:53 andrewbogott: disabling puppet and apache on labpuppetmaster1001 and 1002
  • 17:47 moritzm: rebooting dbmonitor1001 for kernel security update
  • 17:42 moritzm: rebooting dbmonitor2001 for kernel security update
  • 17:38 moritzm: rebooting hassaleh for kernel security update
  • 17:34 vgutierrez: update pybal to 1.15.1 on lvs5003
  • 17:32 vgutierrez: update pybal to 1.15.1 on lvs1010
  • 17:28 vgutierrez: uploaded pybal_1.15.1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 after alter table (duration: 00m 58s)
  • 16:58 cmjohnson1: powering off rhenium to reset the idrac
  • 16:44 sbisson@tin: Finished deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch (duration: 05m 47s)
  • 16:38 sbisson@tin: Started deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch
  • 16:11 oblivian@tin: Synchronized wmf-config: Fetch data from etcd on all appservers (duration: 01m 01s)
  • 16:01 marostegui: Deploy schema change on db1069 - T187089 T185128 T153182
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 for alter table (duration: 00m 57s)
  • 15:54 jynus: deploying new query killer logic to all wikidata (s8) db replicas T188505
  • 15:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after alter table (duration: 00m 57s)
  • 15:51 moritzm: installing libvpx security updates
  • 15:50 oblivian@tin: Synchronized wmf-config: Expose etcd last modified index (duration: 01m 00s)
  • 15:45 moritzm: rebooting ununpentium for kernel security update
  • 15:39 oblivian@tin: Finished scap: Deploying Expose the latest modified index seen by EtcdConfig (duration: 09m 49s)
  • 15:29 oblivian@tin: Started scap: Deploying Expose the latest modified index seen by EtcdConfig
  • 15:28 moritzm: rebooting bromine for kernel security update
  • 15:19 mobrovac@tin: Synchronized php-1.31.0-wmf.23/includes/jobqueue/JobQueueSecondTestQueue.php: [JobQueueSecondTestQueue] Support read-only mode - T185052 (duration: 00m 58s)
  • 15:09 vgutierrez: update to pybal 1.15.0 on lvs5003
  • 15:02 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Article counts: Change 'comma' method to 'any' - T188472 (duration: 01m 00s)
  • 14:50 vgutierrez: update pybal to 1.15.0 on lvs1010
  • 14:46 hashar: tin: /srv/mediawiki-staging/php-1.31.0-wmf.23 rebased on tip of https://gerrit.wikimedia.org/r/#/c/416686/ (which reverts a merge of the master branch)
  • 14:42 gehel: rebooting maps1* (eqiad) for kernel security update completed
  • 14:36 ottomata: beginning migration of webrequest text varnishkafka logs from Kafka analytics to Kafka jumbo-eqiad T185136
  • 14:21 moritzm: rebooting labweb* for kernel security update
  • 14:13 moritzm: rebooting sca* for kernel security update
  • 14:07 gehel: rebooting maps1* (eqiad) for kernel security update
  • 14:07 moritzm: rebooting pybal-test for kernel security update
  • 14:00 _joe_: SWAT is suspended for investigation on tin's git status
  • 14:00 moritzm: rebooting oxygen for kernel security update
  • 13:16 moritzm: powercycling ms-be1038, stuck after reboot
  • 13:10 marostegui: Deploy schema change on db1094 - T187089 T185128 T153182
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 for alter table (duration: 00m 58s)
  • 12:55 moritzm: rebooting URL downloaders for kernel security update
  • 12:51 mobrovac@tin: Finished deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052 (duration: 00m 34s)
  • 12:50 mobrovac@tin: Started deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 after alter table (duration: 00m 58s)
  • 12:33 moritzm: rebooting mwlog* for kernel security update
  • 12:04 moritzm: rebooting graphite hosts in eqiad for kernel security update
  • 11:29 moritzm: rebooting k8s masters for kernel security update
  • 11:05 elukey: reboot analytics10[28,35,52] for kernel updates (one at a time, hadoop hdfs journal nodes)
  • 10:46 moritzm: powercycling ms-be1021, stuck after reboot
  • 10:45 akosiaris@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 01m 22s)
  • 10:43 moritzm: rearming keyholder on naos after reboot
  • 10:39 akosiaris: emergency: add a captcha to metawiki contact pages such as https://meta.wikimedia.org/wiki/Special:Contact/Stewards to stop bot abuse. Phab task to be filed later
  • 10:39 godog: reboot ms-be1013 to try fix disk ordering
  • 10:35 moritzm: rebooting naos for kernel security update
  • 10:32 moritzm: rearming keyholder on tin after reboot
  • 10:30 gehel: kafka poller active on all production wdqs nodes - T188252
  • 10:28 moritzm: rebooting tin for kernel security update
  • 10:20 gehel: reboot completed for maps2* and maps-test*
  • 09:51 moritzm: rebooting graphite hosts in codfw for kernel security update
  • 09:42 marostegui: Stop MySQL on db1107 for mariadb and kernel upgrade
  • 09:41 vgutierrez: uploaded pybal_1.15.0_all.deb to apt.wikimedia.org jessie-wikimedia
  • 09:40 marostegui: Start proxysql on wasat
  • 09:38 moritzm: rebooting wezen for kernel security update
  • 09:27 elukey: reboot kafka2001 (eventbus codfw) for kernel updates
  • 09:24 marostegui: Deploy schema change on db1086 - T187089 T185128 T153182
  • 09:18 marostegui: Stop and reboot db1086 for kernel and mariadb upgrade
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 for alter table (duration: 00m 57s)
  • 09:17 moritzm: rebooting swift backend servers in eqiad for kernel security update
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 after alter table (duration: 00m 57s)
  • 09:05 gehel: rolling restart of maps* for kernel upgrade
  • 08:50 elukey: reboot meitnerium (archiva) for kernel updates
  • 08:38 paravoid: rebooting furud
  • 08:35 moritzm: rebooting wasat for kernel security update
  • 08:30 elukey: drain+reboot analytics[1065-1067] for kernel updates
  • 08:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Update db1069 IP (duration: 00m 57s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1069 IP (duration: 00m 57s)
  • 08:15 moritzm: rebooting ruthenium for kernel security update
  • 08:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Revert depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 57s)
  • 08:10 moritzm: rebooting bast5001 for kernel security update
  • 08:01 elukey: drain+reboot analytics[61,63,64] for kernel updates
  • 07:59 moritzm: rebooting tegmen for kernel security update
  • 07:43 marostegui: Stop mysql on db2090 db2080 db2076 db2073 db2067 for mariadb and kernel upgrade
  • 07:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 58s)
  • 07:36 moritzm: rebooting remaining swift backend servers in codfw for kernel security update
  • 07:18 marostegui: Stop MySQL on db2093 to get some data from the event scheduler
  • 06:56 marostegui: Deploy schema change on db1101:3317 - T187089 T185128 T153182
  • 06:51 marostegui: Stop mysql on db2037 to upgrade it
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 for alter table (duration: 00m 58s)
  • 05:00 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend.php: T180183: I6d72873b9d3 (duration: 00m 56s)
  • 04:59 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 - Ie5a164a9e2b (duration: 00m 57s)
  • 04:58 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta: no-op (duration: 00m 54s)
  • 04:57 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend-labs.php: beta: no-op (duration: 00m 57s)
  • 04:29 bblack: eqsin router maintenance starting soon-ish. all of eqsin will be offline and isn't in production service to begin with. We've tried to downtime all the things, but don't be shocked at spurious alerts! - T187807
  • 04:08 krinkle@tin: Synchronized multiversion/MWMultiVersion.php: Ia2acf57c6 (duration: 00m 57s)
  • 04:01 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 (duration: 01m 33s)
  • 02:26 tgr@tin: Synchronized wmf-config/CommonSettings.php: T186296 Increase ReadingLists list size limit to 5k (duration: 01m 06s)
  • 02:07 tgr@tin: Finished scap: T187226#4025352 update ReadingLists (duration: 18m 49s)
  • 01:48 tgr@tin: Started scap: T187226#4025352 update ReadingLists
  • 01:00 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: refresh wmf-config/InitialiseSettings, seems to have stuck in old state on some servers after doing the initial sync in the wrong order (duration: 00m 57s)
  • 00:54 tgr@tin: Synchronized wmf-config: T57420 Enable loginOnly mode for local auth provider on group 0 (duration: 01m 00s)
  • 00:41 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op I33f09b164e7 (duration: 00m 58s)
  • 00:38 krinkle@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only: I02a4d4 (duration: 00m 57s)

2018-03-05

  • 22:44 bawolff@tin: Synchronized php-1.31.0-wmf.23/includes/logging/LogPager.php: T188145 (duration: 00m 58s)
  • 21:32 arlolra: Updated Parsoid to d115592 (T188591)
  • 21:25 arlolra@tin: Finished deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592 (duration: 12m 12s)
  • 21:13 arlolra@tin: Started deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592
  • 20:04 gehel@tin: Finished deploy [wdqs/wdqs@1983ddf]: wdqs GUI update (duration: 01m 36s)
  • 20:03 gehel@tin: Started deploy [wdqs/wdqs@1983ddf]: wdqs GUI update
  • 20:02 hashar@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase: Fix empty condition list in metadata lookup - T188313 (duration: 01m 58s)
  • 19:51 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/416219/ (duration: 00m 57s)
  • 19:43 maxsem@tin: Synchronized php-1.31.0-wmf.23/extensions/Cite: https://gerrit.wikimedia.org/r/#/c/416467/ (duration: 00m 58s)
  • 19:30 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update (duration: 02m 36s)
  • 19:28 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update
  • 19:23 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416456/ (duration: 00m 58s)
  • 19:21 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 01m 23s)
  • 19:20 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 19:14 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416457/ (duration: 00m 58s)
  • 18:54 jynus: stop slave on db2044
  • 18:24 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken (duration: 00m 54s)
  • 18:23 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken
  • 18:20 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 03m 08s)
  • 18:16 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 17:34 elukey: drain + reboot analytics10[58-60] for kernel updates
  • 17:32 bd808: Added zhuyifei1999_ and chicocvenancio to the "toollabs-trusted" gerrit group
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186699 (duration: 00m 57s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 after alter table (duration: 00m 57s)
  • 16:00 elukey: test
  • 15:56 akosiaris: upload tiller on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:56 akosiaris: upload helm on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:55 urandom: setting trace probability to 0.001 (.1%), eqiad datacenter, restbase cassandra cluster
  • 15:52 urandom: updating `system_traces` keyspace replication strategy, restbase cassandra cluster
  • 15:51 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all of the cdnPurge to EventBus, file 2/2 - T188540 (duration: 00m 57s)
  • 15:50 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 15:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all of the cdnPurge to EventBus, file 1/2 - T188540 (duration: 00m 57s)
  • 15:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka (duration: 00m 35s)
  • 15:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka
  • 15:42 marostegui: stop and poweroff db1069 for rack change - T186699
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186699 (duration: 00m 57s)
  • 15:41 elukey: drain + reboot analytics 1055->57 for kernel updates
  • 15:38 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch 50% for refreshLinks to EventBus - T185052 (duration: 00m 57s)
  • 15:31 ppchelko@tin: Finished deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs (duration: 00m 39s)
  • 15:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs
  • 15:28 marostegui: Mark as failed disk 32:9 on db1068 (s4 primary master) - T188187
  • 15:20 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobExecutor.php: [JobExecutor] Wait for the replicas if the transaction takes too long (duration: 00m 57s)
  • 15:14 moritzm: rebooting webperf2001 for kernel security update
  • 14:57 hashar: European SWAT completed
  • 14:57 hashar@tin: Finished scap: 2017 wikitext editor: Simplify config part 2 (duration: 02m 57s)
  • 14:54 hashar@tin: Started scap: 2017 wikitext editor: Simplify config part 2
  • 14:52 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:44 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable rollbacker user right at arwikiversity - T188633 (duration: 00m 57s)
  • 14:41 hashar@tin: Finished scap: core + Flow, master/replicate race condition - T182358 T184670 (duration: 04m 24s)
  • 14:36 hashar@tin: Started scap: core + Flow, master/replicate race condition - T182358 T184670
  • 14:34 elukey: graphite metrics mw.error.* deprecated in T188749
  • 14:31 hashar@tin: Finished scap: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 23m 08s)
  • 14:11 hashar: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=bdwikimedia translate # T188853
  • 14:08 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 14:06 hashar@tin: scap aborted: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 00m 16s)
  • 14:06 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 13:55 moritzm: rolling reboot of swift backends in codfw for kernel security update
  • 13:49 moritzm: rebooting releases2001 for kernel security update
  • 13:37 moritzm: rebooting neon for kernel security update
  • 13:37 mobrovac@tin: Started restart [cpjobqueue/deploy@b5255f0]: Force RecordLintJob rebalance in Kafka - T188870
  • 13:04 moritzm: rebooting bast4002 for kernel security update
  • 13:00 marostegui: Deploy schema change on db1098:3317 - T187089 T185128 T153182
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for alter table (duration: 00m 57s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:40 moritzm: rebooting bast4001 for kernel security update
  • 12:30 marostegui: Remove db1011 from tendril as it will be decommissioned - T184703
  • 12:19 moritzm: installing libvpx security updates
  • 12:13 moritzm: installing wavpack security updates
  • 12:08 moritzm: installing freexl security updates
  • 11:59 moritzm: upgrading tor on radium
  • 11:40 moritzm: updating tor packages to 0.3.2.10
  • 11:19 moritzm: running "racadm racreset" on rhenium, mgmt inaccessible
  • 11:09 elukey: drain + reboot analytics10[50,51,53,54] for kernel updates
  • 10:53 moritzm: rebooting bast2001 for kernel security update
  • 10:46 moritzm: rebooting lithium for kernel security update
  • 10:24 elukey: drain + reboot analytics10[46-49] for kernel updates
  • 10:23 moritzm: rolling reboot of logstash* for kernel security update
  • 09:33 godog: roll restart swift in codfw to add thumbor private user
  • 09:15 marostegui: Deploy schema change on s7 codfw master (db2040), this will generate lag on codfw - T187089 T185128 T153182
  • 09:01 godog: roll-restart thumbor to apply https://gerrit.wikimedia.org/r/416240
  • 08:54 marostegui: Stop mariadb on db2037 to copy it to db1073
  • 08:25 marostegui: Stop MySQL on db2078 for mariadb and kernel upgrade
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1073 from config (duration: 00m 58s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1073 from config (duration: 00m 59s)
  • 07:06 marostegui: Deploy schema change on s2 primary master db1054 - T185128 T153182
  • 02:08 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2018-03-04

  • 20:16 tgr: T188721 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 18:05 musikanimal: T188721 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 15:59 elukey: powercycle stat1004 - available via mgmt, root login freezes while trying

2018-03-03

  • 14:16 akosiaris: 13:56:20 ema: powercycle ganeti1005 T181121
  • 13:56 ema: powercycle ganeti1005
  • 13:25 andrewbogott: forced quota update in admin-monitoring as well; the reserved fixed_ip value was incorrect
  • 13:23 andrewbogott: forcing quota update in nova with update quota_usages set reserved='-1' where project_id='contintcloud';
  • 13:10 andrewbogott: restarting rabbitmq-server on labcontrol1001
  • 13:08 andrewbogott: restarting nodepool
  • 13:05 andrewbogott: restarting nova-conductor
  • 13:02 andrewbogott: stopping nodepool for a bit while investigating openstack issues
  • 02:14 chasemp: labnodepool1001:~# service nodepool start
  • 01:30 chasemp: root@labnet1001:~# service nova-fullstack restart
  • 01:21 chasemp: labnodepool1001:~# service nodepool stop

2018-03-02

  • 19:44 jynus: restarting labsdb1010
  • 17:22 mepps: updated payments-wiki 498f49a758 to ce68e8e80b
  • 15:19 elukey: drain + reboot analytics10[41-45] for kernel updates
  • 15:15 moritzm: rebooting auth* for kernel security updates
  • 13:46 elukey: drain + reboot analytics10[38,39,40,41] for kernel updates
  • 13:22 elukey: drain + reboot analytics10[33,34,36,37] for kernel updates
  • 13:17 moritzm: upgrading labtest trusty hosts to latest 4.4 kernel
  • 12:23 moritzm: rebooting kubetcd/kubestagetcd for kernel security update
  • 12:00 moritzm: rebooting etcd* for kernel security updates
  • 11:58 elukey: drain + reboot analytics10[29,31,32] for kernel updates
  • 11:33 moritzm: draining restbase1018 for eventual reboot for kernel security update
  • 11:28 akosiaris: upload to apt.wikimedia.org component thirdparty/ci distro jessie-wikimedia docker-ce_17.12.1~ce-0~debian_amd64 T177499
  • 11:07 moritzm: rebooting mwdebug* for kernel security update
  • 10:54 ema: spare LVSs lvs[1011-1012], lvs[4001-4004]: reboot for retpoline kernel updates T188092
  • 10:53 moritzm: draining restbase1017 for eventual reboot for kernel security update
  • 10:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 (duration: 00m 57s)
  • 10:18 moritzm: draining restbase1016 for eventual reboot for kernel security update
  • 10:18 jynus: shutting down labsdb1010
  • 10:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 56s)
  • 10:01 elukey: deleted /etc/burrow/* from zookeeper main eqiad/codfw after https://gerrit.wikimedia.org/r/415818 (garbage to cleanup)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 57s)
  • 09:40 moritzm: draining restbase1015 for eventual reboot for kernel security update
  • 09:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1114 in s1 after cloning it from db1073 - T183469 (duration: 01m 01s)
  • 08:57 moritzm: rebooting scb1004 for kernel security update (was omitted from earlier reboots due to hardware issues on scb1003)
  • 08:51 moritzm: repooling scb1003 after memory module was replaced (T188385)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 57s)
  • 07:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:11 moritzm: rebooting xenon/praseodymium/cerium for kernel security update
  • 06:52 marostegui: Stop MySQL on db1073 to clone db1114 - T183469
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 to clone db1114 - T183469 (duration: 00m 58s)
  • 02:48 legoktm: manually purged ExtensionDistributor cache (T188692)
  • 01:54 mutante: cobalt (gerrit) - rebooting for kernel upgrade
  • 01:46 mutante: LDAP: added lucaswerkmeister-wmde to 'wmde' and 'nda' groups (T188105)
  • 00:49 ebernhardson@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: T148603: (duration: 00m 57s)
  • 00:48 herron: fermium (lists) and mx systems rebooted for kernel update
  • 00:46 ebernhardson@tin: Synchronized php-1.31.0-wmf.23/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT T187148: Start cirrus query explorer AB test (duration: 00m 57s)
  • 00:25 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148 Configure Cirrus AB test (step 2) (second try) (duration: 00m 57s)
  • 00:23 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: T187148 Configure Cirrus AB test (step 1) (second try) (duration: 00m 57s)
  • 00:12 ebernhardson@tin: Synchronized wmf-config/: REVERT SWAT: T187148 Configure Cirrus AB test (duration: 00m 59s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/: SWAT: T187148 Configure Cirrus AB test (duration: 01m 00s)

2018-03-01

  • 22:35 gehel: rolling restart of elasticsearch / cirrus - eqiad complete, cluster is green
  • 21:45 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.23
  • 21:33 bsitzmann@tin: Finished deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833) (duration: 05m 15s)
  • 21:28 bsitzmann@tin: Started deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833)
  • 21:17 thcipriani@tin: Synchronized php-1.31.0-wmf.23/extensions/GeoData/includes/api/ApiQueryGeoSearchElastic.php: Fix undefined property error in ApiQueryGeoSearchElastic T188659 (duration: 01m 15s)
  • 20:30 thcipriani@tin: Synchronized php: php link to 1.31.0-wmf.23 (duration: 01m 12s)
  • 20:29 andrewbogott: restarting labweb1002
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.23
  • 20:15 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/specials/pagers/NewPagesPager.php: SWAT: NewPagesPages: Use array_merge rather than + for RC query info fields T188555 (duration: 01m 14s)
  • 20:15 andrewbogott: rebooting labweb1001
  • 19:56 thcipriani@tin: Synchronized langlist-labs: SWAT: beta: add nlwiki to langlist T188582 (beta-only change) (duration: 01m 13s)
  • 19:50 gehel: new kafka based poller for wdqs now enabled on wdqs2001 - T188252
  • 19:48 thcipriani@tin: Synchronized wmf-config/throttle-analyze.php: SWAT: Revert "Automatically include commons and wikidata in $wmgThrottlingExceptions" (duration: 01m 14s)
  • 19:36 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollback for editors at zh_classicalwiki T188064 (duration: 01m 14s)
  • 19:31 gehel@tin: Finished deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues (duration: 02m 12s)
  • 19:29 gehel@tin: Started deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues
  • 19:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable responsive references by default on rowiki T187997 (duration: 01m 15s)
  • 19:21 mutante: depooled scb1003 from all services on scb because it went down, including mgmt
  • 19:20 dzahn@neodymium: conftool action : set/pooled=no; selector: name=scb1003.eqiad.wmnet
  • 19:17 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Make last throttle limit raise work across all wikis T188630 (duration: 01m 13s)
  • 19:15 mutante: powercycling crashed scb1003
  • 19:13 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Fix throttle date for outreach dashboard T188630 (duration: 01m 13s)
  • 18:47 demon@tin: Synchronized wmf-config/: killing extension-list-labs (duration: 01m 17s)
  • 18:45 demon@tin: Synchronized wmf-config/InitialiseSettings.php: disable performance inspector in prod explicitly (duration: 01m 14s)
  • 18:43 demon@tin: Synchronized docroot/noc/: killing extension-list-labs (duration: 01m 14s)
  • 18:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833) (duration: 06m 01s)
  • 18:07 bsitzmann@tin: Started deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833)
  • 17:51 gehel: depooling wdqs2001 and switching to kafka poller - T188252
  • 17:47 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:46 mutante: re-enabling icinga notifications for wdqs1004 services, ethernet cable has been replaced (T188045)
  • 17:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 01m 14s)
  • 17:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 28s)
  • 17:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 13s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 01m 13s)
  • 16:41 jynus: reimporting database testreduce_0715 from db1009 to db2037
  • 16:36 marostegui: Restart mariadb on db1093 for binlog format change - T186321
  • 16:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T186321 (duration: 01m 13s)
  • 16:14 moritzm: rebooting hafnium for kernel security update
  • 16:06 marostegui: Fix s7 replication on labsdb1010 - T186579
  • 16:00 moritzm: rebooting radium (tor relay) for kernel security update
  • 15:52 moritzm: draining restbase1014 for eventual reboot for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 as API (duration: 01m 13s)
  • 15:32 bblack: disabling puppet on A:cp for deploy of https://gerrit.wikimedia.org/r/#/c/415204/ and friends
  • 15:30 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.23) - T188540 (duration: 01m 14s)
  • 15:26 mobrovac@tin: Synchronized php-1.31.0-wmf.22/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.22) - T188540 (duration: 01m 13s)
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 13s)
  • 15:22 moritzm: draining restbase1013 for eventual reboot for kernel security update
  • 15:19 zeljkof: EU SWAT finished
  • 15:18 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/Popups: SWAT: Fix: dont assume thumbnail URLs contain pixel size (T187955) (duration: 01m 14s)
  • 15:17 moritzm: rolling restart of swift frontends in eqiad for kernel security update
  • 15:12 godog: upload puppetdb 4.4.0-1~wmf1 to component/puppetdb4 - T177253
  • 15:00 ema: eqiad LVSs: reboot for retpoline kernel updates T188092
  • 14:36 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Import sources on maiwikimedia (T188374) (duration: 01m 13s)
  • 14:28 moritzm: rolling restart of swift frontends in codfw for kernel security update
  • 14:26 moritzm: draining restbase1012 for eventual reboot for kernel security update
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Quiz Extension at zhwikibooks (T188213) (duration: 01m 14s)
  • 14:12 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 2/2 - T188540 (duration: 01m 13s)
  • 14:10 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 1/2 - T188540 (duration: 01m 14s)
  • 14:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540 (duration: 00m 44s)
  • 14:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540
  • 13:54 moritzm: draining restbase1011 for eventual reboot for kernel security update
  • 13:50 ema: codfw LVSs: reboot for retpoline kernel updates T188092
  • 13:33 gehel: force merging enwiki_general index on codfw to reclaim space
  • 13:18 moritzm: draining restbase1010 for eventual reboot for kernel security update
  • 13:17 elukey: reboot kafka-jumbo100[5,6] for kernel updates
  • 13:16 ema: esams LVSs: reboot for retpoline kernel updates T188092
  • 12:44 moritzm: draining restbase1009 for eventual reboot for kernel security update
  • 12:39 moritzm: rolling reboot of parsoid in eqiad for kernel security update
  • 12:27 elukey: reboot kafka-jumbo1004 for kernel updates
  • 12:21 elukey: reboot kafka1023 for kernel updates
  • 11:59 moritzm: draining restbase1008 for eventual reboot for kernel security update
  • 11:48 moritzm: powercycling wtp2013, stuck in reboot
  • 11:36 elukey: reboot kafka-jumbo1003 for kernel updates
  • 11:33 jynus: restarting labsdb1011
  • 11:32 elukey: reboot kafka1022 for kernel updates
  • 11:20 elukey: reboot kafka-jumbo1002 for kernel security updates
  • 11:15 moritzm: draining restbase1007 for eventual reboot for kernel security update
  • 11:13 ema: ulsfo LVSs: reboot for retpoline kernel updates T188092
  • 11:08 elukey: reboot kafka1020 for kernel updates
  • 10:38 ema: eqsin LVSs: reboot for retpoline kernel updates T188092
  • 10:32 moritzm: rolling reboot of parsoid in codfw for kernel security update
  • 10:27 moritzm: draining restbase2012 for eventual reboot for kernel security update
  • 10:20 moritzm: rebooting labnodepool1001 for kernel security update
  • 10:02 moritzm: rebooting contint1001 for kernel security update
  • 09:59 elukey: reboot kafka1014 for kernel security updates
  • 09:57 moritzm: draining restbase2011 for eventual reboot for kernel security update
  • 09:43 elukey: reboot kafka1013 for kernel security updates
  • 09:29 elukey: rebooting analytics1030 for kernel updates
  • 09:17 moritzm: draining restbase2010 for eventual reboot for kernel security update
  • 08:52 moritzm: rebooting prometheus servers in eqiad for kernel security update
  • 08:41 moritzm: draining restbase2009 for eventual reboot for kernel security update
  • 08:34 elukey: reboot kafka1012 for kernel updates - T188594
  • 08:20 gehel: banning elastic1021 from cluster (failed memory) - T188595
  • 07:55 elukey: reboot kafka-jumbo1001 for kernel updates - T188594
  • 07:52 elukey: run kafka preferred-replica-election on kafka1012 to force broker 18 to get back among Kafka topic leaders
  • 07:26 gehel: starting rolling reboot of elasticsearch / cirrus - eqiad (kernel upgrade and config changes)
  • 07:24 demon@tin: Synchronized php-1.31.0-wmf.22/maintenance/sql.php: adding --json output mode (duration: 01m 15s)
  • 06:59 chasemp: restart nova-api on labnet1001
  • 06:57 madhuvishy: Restart nova-conductor on labcontrol1001
  • 06:26 marostegui: Deploy schema change on db1074 - T187089 T185128 T153182
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 14s)
  • 06:09 marostegui: Reload haproxy on dbproxy1005
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 23s)
  • 02:05 demon@tin: Synchronized wmf-config/: removing extension-list-wikitech (duration: 01m 13s)
  • 02:03 demon@tin: Synchronized docroot/noc/: cleanup extension-list-wikitech removal (duration: 01m 12s)
  • 01:49 demon@tin: Synchronized wmf-config/: Undeploying EmailAuth from beta, no-op (duration: 01m 16s)
  • 01:32 eileen: update civicrm revision changed from 341c734a79 to a819d64d98, config revision is 62631813fc (add geocoder extension)
  • 00:43 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up $wgEchoPerUserBlacklist setting (duration: 01m 14s)
  • 00:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Remove $wgUsejQueryThree (duration: 01m 14s)
  • 00:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswikibooks (T145394) (duration: 01m 13s)
  • 00:17 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswiki (T130279) (duration: 01m 14s)

2018-02-28

  • 23:27 eileen: civicrm revision changed from a47eafcbad to 341c734a79, config revision is 62631813fc (update civicrm submodule & vendor but not geocoder extension as yet)
  • 22:11 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.22 T188555
  • 22:00 ejegg: updated payments-wiki from 1acfc4a9a0 to 498f49a758
  • 21:57 milimetric@tin: Finished deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment (duration: 04m 19s)
  • 21:56 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.23
  • 21:53 milimetric@tin: Started deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment
  • 21:46 arlolra: Updated Parsoid to 1415a2a (T58756, T169006)
  • 21:26 arlolra@tin: Finished deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a (duration: 08m 46s)
  • 21:17 arlolra@tin: Started deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a
  • 20:53 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 (back) to 1.31.0-wmf.23
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki to 1.31.0-wmf.23
  • 20:20 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/page/WikiPage.php: WikiPage: Avoid $user variable reuse in doDeleteArticleReal() T188479 (duration: 00m 57s)
  • 19:52 demon@tin: Synchronized README: no-op, forcing co-master sync (duration: 00m 57s)
  • 19:29 gehel: rolling reboot of elasticsearch / cirrus - codfw completed
  • 18:56 demon@tin: Finished deploy [gerrit/gerrit@f16f4a4]: GO plugin (duration: 00m 10s)
  • 18:55 demon@tin: Started deploy [gerrit/gerrit@f16f4a4]: GO plugin
  • 18:53 niharika29@tin: Synchronized wmf-config/throttle.php: Clean obsolete rules and add a new one - T188529 (duration: 00m 56s)
  • 18:44 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:42 niharika29@tin: Synchronized wmf-config/Wikibase.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:32 godog: puppet reenable on einsteinium
  • 18:30 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading from full term entity id everywhere T114903 (duration: 00m 57s)
  • 18:23 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wikibase RC injection for ruwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/415078 (duration: 00m 57s)
  • 18:19 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Compact Language Links out of Beta on English Wikipedia T187677 (duration: 00m 58s)
  • 18:17 mutante: gerrit2001 - reboot for kernel upgrade
  • 18:12 godog: force a puppet run on failed hosts in eqiad for recovery
  • 18:09 apergos: rebooting dataset1001 (dumps.wm.o) for new kernel
  • 18:06 godog: stop and restart apache2 on puppetmaster1002
  • 17:58 godog: restart apache2 on puppetmaster1002
  • 17:46 milimetric@tin: Finished deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact (duration: 06m 45s)
  • 17:46 kart_: Finished running CLL preference migration script on terbium (T187677)
  • 17:39 milimetric@tin: Started deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact
  • 17:38 mutante: phab2001 - downtimed, rebooting for kernel upgrade
  • 16:44 moritzm: draining restbase2008 for eventual reboot for kernel security update
  • 16:10 moritzm: rebooting prometheus servers in codfw for kernel security update
  • 16:10 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons (duration: 00m 41s)
  • 16:09 ppchelko@tin: Started deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons
  • 16:02 moritzm: draining restbase2007 for eventual reboot for kernel security update
  • 15:45 godog: repool rhodium as puppet master backend
  • 15:22 moritzm: rebooting ores in eqiad for kernel security update
  • 15:22 ema: upgrade cache_text@eqiad to varnish 5
  • 15:20 moritzm: draining restbase2006 for eventual reboot for kernel security update
  • 15:16 zeljkof: EU SWAT finished
  • 15:15 zfilipin@tin: Synchronized php-1.31.0-wmf.23/extensions/WikibaseQualityConstraints/: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) Bump cache key for check results (T188384) (duration: 01m 02s)
  • 15:11 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Bump cache key for check results (T188384) (duration: 01m 02s)
  • 14:54 moritzm: rebooting ores in codfw for kernel security update
  • 14:53 jynus: stopping labsdb1011 to clone it to labsdb1010 T186579
  • 14:50 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Drop the medlem user group and editallpages user right (T184981) (duration: 00m 57s)
  • 14:48 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) (duration: 01m 02s)
  • 14:47 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: SWAT: Only filter statuses after collecting metadata (T188384) (duration: 01m 03s)
  • 14:38 jynus: dropping sqldata on dbstore1001
  • 14:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable HTML Previews on all wikipedias (T182319) (duration: 00m 57s)
  • 14:28 moritzm: rebooting kubestage* for kernel security update
  • 14:25 gehel@tin: Finished deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator (duration: 04m 27s)
  • 14:22 moritzm: draining restbase2005 for eventual reboot for kernel security update
  • 14:21 gehel@tin: Started deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: beta: enable VirtualPagePreviews events on beta cluster (T184793 T186728) (duration: 00m 57s)
  • 13:13 moritzm: draining restbase2004 for eventual reboot for kernel security update
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2011 - T187886 (duration: 00m 59s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2011 - T187886 (duration: 00m 58s)
  • 12:35 moritzm: draining restbase2003 for eventual reboot for kernel security update
  • 12:00 marostegui: Reboot db1115 tendril master to pick up new my.cnf options - T184704
  • 11:49 moritzm: draining restbase2002 for eventual reboot for kernel security update
  • 11:37 marostegui: Reset slave all on db2093 - T184704
  • 11:35 moritzm: rebooting eqiad job runners for kernel security update
  • 11:18 moritzm: powercycling restbase2001, stuck in reboot
  • 11:10 godog: rollout thumbor 1.15 to codfw/eqiad
  • 10:59 godog: upload python-thumbor-wikimedia 1.15 - T187822 T187350
  • 10:59 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1261.eqiad.wmnet
  • 10:54 moritzm: draining restbase2001 for eventual reboot for kernel security update
  • 10:43 moritzm: rebooting remaining mediawiki app servers in eqiad
  • 09:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2083, db2082 and db2081 after kernel upgrade (duration: 00m 57s)
  • 09:25 ema: upgrade cache_text@codfw to varnish 5
  • 09:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083, db2082 and db2081 for kernel upgrade (duration: 00m 56s)
  • 09:06 marostegui: Reboot db2083, db2082 and db2081 for kernel and mariadb upgrade
  • 08:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 - T162807 (duration: 00m 57s)
  • 08:42 filippo@neodymium: conftool action : set/pooled=yes; selector: name=neodymium.eqiad.wmnet
  • 08:42 filippo@neodymium: conftool action : set/pooled=no; selector: name=neodymium.eqiad.wmnet
  • 08:34 marostegui: Reboot db2069 for kernel upgrade
  • 08:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2069 - T162807 (duration: 00m 57s)
  • 08:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T162807 (duration: 00m 57s)
  • 08:10 moritzm: rebooting remaining mediawiki API servers in eqiad
  • 07:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 - T162807 (duration: 00m 57s)
  • 07:51 marostegui: Reboot db2062 for mariadb and kernel upgrade
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2085 (duration: 00m 57s)
  • 07:15 marostegui: Upgrade kernel and mariadb on db2085
  • 07:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2085 for mariadb and kernel upgrade (duration: 01m 00s)
  • 06:32 marostegui: Deploy schema change on db1060 (with replication) - this will cause lag on labs servers - T187089 T185128 T153182
  • 06:31 kart_: (Re)Starting CLL preference migration script on terbium (T187677)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 05:43 demon@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 04:55 krinkle@tin: Synchronized wmf-config/profiler.php: Iba417de75a and Ied984d (duration: 01m 06s)
  • 03:01 kart_: Starting CLL preference migration script on terbium (T187677)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 21s)
  • 00:55 demon@tin: Synchronized scap/plugins/wmfbetaautoupdate.py: no-op (duration: 01m 14s)
  • 00:24 papaul: OS install on wdqs200[4-6]
  • 00:03 thcipriani@tin: Synchronized php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameUserJob.php: LocalRenameUserJob: escape backreferences in replacement title T188171 (duration: 01m 13s)

2018-02-27

  • 23:38 krinkle@tin: Synchronized dblists/: remove pp_stage1_raw.dblist (duration: 01m 14s)
  • 21:23 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/user/User.php: Add a missing check of $wgActorTableSchemaMigrationStage T188437 (duration: 01m 14s)
  • 20:42 ppchelko@tin: Finished deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull (duration: 02m 29s)
  • 20:39 ppchelko@tin: Started deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull
  • 20:37 ppchelko@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers (duration: 00m 25s)
  • 20:36 ppchelko@tin: Started deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.23
  • 20:08 herron: eqiad puppet master reboots finished -- re-enabling puppet agents
  • 20:02 herron: temporarily disabling puppet agents and rebooting eqiad puppet masters for kernel update
  • 20:02 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache (duration: 32m 10s)
  • 19:30 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache
  • 19:08 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (duration: 04m 16s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241
  • 19:03 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only) (duration: 00m 22s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only)
  • 18:32 otto@tin: Started restart [eventstreams/deploy@7629e16]: service restart to publish page change related streams: T187241 (scb2001 only)
  • 18:32 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only) (duration: 00m 03s)
  • 18:32 otto@tin: Started deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only)
  • 18:02 moritzm: rebooting kubernetes workers in eqiad for kernel security update
  • 17:46 moritzm: rebooting kubernetes workers in codfw for kernel security update
  • 17:41 jynus: restarting ferm on db2049, seems failed one day ago
  • 17:38 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:32 thcipriani: starting branch cut for 1.31.0-wmf.23 T183962
  • 17:14 godog: upload puppetdb 2.3.8-1~wmf1+stretch to stretch-wikimedia - T184562
  • 17:10 urandom: restarting Cassandra, restbase1007-a to test jmx_exporter
  • 16:53 elukey: restart cassandra-a on aqs1004 to test the prometheus jmx agent before complete rollout - T184795
  • 16:52 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH everywhere (duration: 00m 56s)
  • 16:50 ema: lvs1010: retpoline kernel/libs upgrade T188092
  • 16:46 ema: cp1008: retpoline kernel/libs upgrade T188092
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1081 (duration: 02m 04s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 55s)
  • 16:26 moritzm: rebooting mw1293-mw1298 for kernel security update
  • 16:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:10 thcipriani: restarting jenkins for plugin update
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:06 moritzm: rebooting restbase-dev for kernel security update
  • 15:49 awight: Restarting ORES celery workers, changing from 35 -> 45 workers per node.
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1081 - T186321 (duration: 00m 56s)
  • 15:37 marostegui: Stop MySQL and reboot db1081 for kernel upgrade, mariadb upgrade and binlog format change - T186321
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T186321 (duration: 00m 55s)
  • 15:33 moritzm: installing squid security updates
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 57s)
  • 15:20 moritzm: powercycling thumbor1004, stuck during reboot
  • 15:19 ottomata: beginning migration of varnishkafka webrequest upload from Kafka analytics to kafka jumbo
  • 15:11 ema: upgrade cache_text@esams to varnish 5 T184448
  • 15:02 gilles: EU SWAT finished
  • 15:02 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 55s)
  • 15:00 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: (T187822) (duration: 00m 56s)
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix: Add missed line in wgLogo (T185977) (duration: 00m 56s)
  • 14:44 moritzm: rebooting thumbor in eqiad for kernel security update
  • 14:31 bblack: puppet disable on RPS-using hosts to be careful with RPS hosts https://gerrit.wikimedia.org/r/#/c/414676/ - cp*, lvs*, labstore
  • 14:27 chasemp: silence labvirt1019/1020 in icinga
  • 14:24 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation (duration: 00m 04s)
  • 14:23 ariel@tin: Started deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation
  • 14:15 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T188292) New throttle rule for cswiki (T187990) New throttle rule (T188034) (duration: 00m 57s)
  • 14:05 marostegui: Update tendril shard table for the "tendril" replication topology - T184704
  • 13:33 gehel: starting rolling restart of elasticsearch / cirrus codfw (config changes + kernel upgrade)
  • 13:25 moritzm: rebooting thumbor in codfw for kernel security update
  • 13:22 godog: upload ruby-mysql 2.9.1-1~bpo9+1 to stretch-wikimedia - T184562
  • 13:00 Amir1: inserting wikidata-related interwikis to site_identifiers table using eval.php in enwiki (T183019)
  • 12:35 marostegui: Remove /srv/tmp/dbstore1001 files from es1017 to free up space - T186596
  • 12:16 Hauskatze: The global rename: Darkweasel94 → Tokfo has FINISHED - T187629
  • 11:56 moritzm: rebooting mw1221-mw1235 (API servers) for kernel security update
  • 11:08 moritzm: rebooting mw1240-mw1258 (app servers) for kernel security update
  • 11:00 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=scb1003.eqiad.wmnet
  • 10:57 moritzm: keeping scb1003 depooled for T188385
  • 10:51 _joe_: updating python-conftool everywhere to 1.0.0
  • 10:51 _joe_: uploaded python-conftool 1.0.0 to stretch-wikimedia
  • 10:49 moritzm: powercycling scb1003, stuck during reboot
  • 10:29 Hauskatze: Starting big global rename: Darkweasel94 → Tokfo - with DBA/OPS green light - T187629
  • 10:07 akosiaris: poweroff sca1004 for T181121 tests
  • 10:05 moritzm: reboot scb in eqiad for kernel security updates
  • 10:03 _joe_: uploading conftool-1.0.0-1 to jessie-wikimedia
  • 09:16 godog: reimage rhodium - T184562
  • 08:42 gehel: powercycling wdqs1004 - T188045
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1084 (duration: 00m 56s)
  • 08:24 gilles@tin: Synchronized private/PrivateSettings.php: Separate Thumbor Swift user for private containers (duration: 00m 56s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 56s)
  • 07:04 marostegui: Stop MySQL on db1084 for kernel and mariadb upgrade
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 56s)
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1084 (duration: 00m 56s)
  • 06:59 demon@tin: Synchronized README: no-op (duration: 00m 56s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Increase traffic for db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly repool db1103:3312 (duration: 00m 56s)
  • 06:33 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:21 marostegui: Stop MySQL on db1115 to copy it to db2093 - tendril (dbtree) service will be down for this maintenance - T184704
  • 06:20 marostegui: Reload haproxy on dbproxy1005
  • 05:26 krinkle@tin: Synchronized wmf-config/profiler.php: I1e7dc263b43 (duration: 00m 56s)
  • 05:00 krinkle@tin: Synchronized wmf-config/profiler.php: I34687c0569af (duration: 00m 57s)
  • 03:28 krinkle@tin: Synchronized wmf-config/profiler.php: various refactor and clean up for T180183 (no-op) (duration: 00m 54s)
  • 03:12 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta only (no-op) (duration: 00m 56s)
  • 02:58 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 [keeping static files] (duration: 01m 24s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 11s)
  • 01:39 mutante: install1002 - re-enabling disabled puppet
  • 00:55 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: add very likely bad faith filter on svwiki (T174560) (duration: 00m 57s)
  • 00:49 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on svwiki (T174560) (duration: 00m 56s)
  • 00:40 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on simplewiki (T182012) (duration: 00m 56s)
  • 00:39 demon@tin: Synchronized wmf-config/CommonSettings.php: beta-only change: lsctorestaticarray (duration: 00m 56s)
  • 00:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on all wikinews wikis (T188000), all private wikis (T188009), test2wiki, loginwiki, votewiki and wikimania2017wiki (T188008) (duration: 00m 56s)

2018-02-26

  • 23:37 bd808@tin: Finished scap: wikitech: use 'labswiki' database on m5-master (T188029) (duration: 03m 21s)
  • 23:34 bd808@tin: Started scap: wikitech: use 'labswiki' database on m5-master (T188029)
  • 23:31 bd808: Pulled T188029 change to silver
  • 22:57 demon@tin: Synchronized wmf-config/: fileimporter/fileexporter improvements (duration: 00m 58s)
  • 22:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: fileimporter/fileexporter improvements (duration: 00m 57s)
  • 22:09 andrewbogott: hotfixed mediawiki on silver to use m5-master for wikitech. This will be finalized with the merge of https://gerrit.wikimedia.org/r/#/c/414733/
  • 22:07 andrewbogott: made mysql on silver read-only, hopefully for good. T188029
  • 22:05 andrewbogott: logging a log to test logging a log
  • 22:03 andrewbogott: testing the log by logging a test
  • 19:46 catrope@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: T184937 (duration: 01m 03s)
  • 19:46 mutante: running puppet on cache::misc servers to add new director for design.wm
  • 19:29 catrope@tin: Synchronized wmf-config/CommonSettings.php: Simplify 2017 wikitext editor config (part 1) (duration: 00m 54s)
  • 19:26 catrope@tin: Synchronized wmf-config/throttle.php: Add throttle rule (T188129) (duration: 00m 56s)
  • 19:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add mushroomobserver.org to wgCopyUploadsDomains (T188203) (duration: 00m 57s)
  • 19:08 herron: codfw puppet master kernel updates complete, re-enabling puppet agents
  • 18:31 gehel@tin: Finished deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh (duration: 06m 28s)
  • 18:24 gehel@tin: Started deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh
  • 18:13 demon@tin: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor: Ignore empty repositories (duration: 00m 56s)
  • 17:34 jynus: deploying new query killer to db1109
  • 17:32 akosiaris: shutdown sca1004 on ganeti1005 for T181121
  • 16:39 andrewbogott: making wikitech read-only (via a local patch) while I migrate the database to m5
  • 16:33 marostegui: Reboot db1111 storage crashed - T187526
  • 16:31 papaul: Maintenance: removing Msw-d4-codfw for replacement:T187534
  • 16:29 mutante: restarted stashbot on toolforge because it didn't react to !log
  • 16:26 mutante: test !log
  • 16:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 56s)
  • 15:45 andrewbogott: made wikitech read/write again pending a bit more preliminary work
  • 15:43 cmjohnson1: swapping failed disk db1068
  • 15:42 andrewbogott: marking wikitech read-only (via a local edit to CommonSettings.php) for https://phabricator.wikimedia.org/T188029
  • 15:32 addshore: EU SWAT done
  • 15:31 addshore@tin: Finished scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations (duration: 11m 29s)
  • 15:19 addshore@tin: Started scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations
  • 15:12 Amir1: This might have performance implications; roll it back if it affects these wikis too much
  • 15:12 gehel: reboot of relforge completed, cluster is green again
  • 15:11 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading full entity id from wb_terms table in three wikis (T114903) (duration: 00m 56s)
  • 14:54 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add patrol rights/groups to fawikisource (T187662) (duration: 00m 56s)
  • 14:52 gehel: rebooting relforge for kernel upgrade
  • 14:50 godog: upload puppetdb 4.4.0-1~wmf1 to stretch-wikimedia - T177253
  • 14:48 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable statement usage tracking in several wikis (T151717) (duration: 00m 57s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespaces to urwiktionary (T186393) (duration: 00m 56s)
  • 14:28 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 55s)
  • 14:15 moritzm: rebooting scb in codfw for kernel security updates
  • 14:10 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php: SWAT: Added option to continue script from particular User ID Use a replica dedicated to slow queries (if available) (T187880) (duration: 00m 58s)
  • 13:09 moritzm: rebooting video scalers in eqiad for kernel security update
  • 11:12 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:11 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:01 moritzm: powercycling mw1264 (stuck after reboot)
  • 10:10 moritzm: rebooting mw canaries for kernel security update
  • 09:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 and db2070 (duration: 00m 55s)
  • 09:23 elukey: copied burrow 0.1 from jessie-wikimedia to stretch-wikimedia
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1103:3314 (duration: 00m 56s)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1103:3314 after mariadb and kernel upgrade (duration: 00m 56s)
  • 07:08 marostegui: Deploy schema change on db1103:3312 - T187089 T185128 T153182
  • 06:59 marostegui: Stop MySQL on db1103:3312 and 3314 to upgrade it and kernel
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 (duration: 00m 54s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui: Stop MySQL db2070 and db2055 to copy data to db2055 (and upgrade kernel and mariadb)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2055 and db2070 (duration: 01m 07s)
  • 06:15 marostegui: Stop MySQL on db1115 tendril database to copy it to db2093. Tendril (dbtree) service will be down for maintenance - T184704
  • 02:55 XioNoX: labs->cloud vlan rename in codfw - T187933
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 07m 12s)
  • 02:15 XioNoX: disabling ALGs on MR routers

2018-02-25

  • 07:35 marostegui: Fix s7 replication on labsdb1010 - T186579

2018-02-24

  • 06:11 marostegui: Reload haproxy on dbproxy1005
  • 01:42 demon@tin: Synchronized docroot/noc/conf/highlight.php: one last time (duration: 00m 57s)
  • 01:18 demon@tin: Synchronized docroot/noc/conf/index.php: fix dblist links from listing (duration: 00m 56s)
  • 01:13 Reedy: added eqsin ipv6 range to botpasswords ip range restriction T188111
  • 01:08 demon@tin: Synchronized docroot/noc/: dblists cleanup (duration: 00m 57s)
  • 01:07 demon@tin: Synchronized tests/: no-op (duration: 00m 59s)

2018-02-23

  • 22:36 demon@tin: Finished deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file (duration: 00m 10s)
  • 22:35 demon@tin: Started deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file
  • 21:27 demon@tin: Finished scap: pos mysql code (duration: 23m 09s)
  • 21:04 demon@tin: Started scap: pos mysql code
  • 20:48 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.22
  • 20:39 no_justification: wmf.21, that is
  • 20:38 demon@tin: rebuilt and synchronized wikiversions files: roll wikidatawiki back to wmf.11, busted
  • 20:35 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.22
  • 19:10 ebernhardson: restart relforge elasticsearch cluster to test entity extraction on larger dataset
  • 18:28 Amir1: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=enwiki --force-protocol https (T183019)
  • 17:22 ema: libvmod-netmapper 1.6-1 uploaded to apt.w.o/experimental T188089
  • 16:37 moritzm: rebooting image scalers in codfw for kernel security updates
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1083 (duration: 01m 14s)
  • 15:58 moritzm: rebooting job runners in codfw for kernel security updates
  • 15:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 02m 21s)
  • 15:15 jynus: about to deploy gerrit:413375 disabling puppet on affected hosts
  • 14:59 elukey: update facts on puppet compiler
  • 14:40 moritzm: installing kernel updates on API servers in codfw
  • 14:09 jynus: restarting tendril database - will cause unavailability of dbtree for a while
  • 13:44 moritzm: reboot ocg1003 for tests
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 and fully repool db1076 (duration: 01m 13s)
  • 12:28 hashar@tin: Synchronized wmf-config/throttle.php: Define new throttle rule - T188090 (duration: 01m 11s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 01m 21s)
  • 12:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 - T186321 (duration: 01m 12s)
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1076 - T186321 (duration: 01m 13s)
  • 11:29 marostegui: Restart mariadb on db1076 for binlog format change - T186321
  • 11:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for binlog format change - T186321 (duration: 01m 08s)
  • 11:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 after alter table (duration: 01m 12s)
  • 11:02 moritzm: installing kernel updates on mw* in codfw
  • 10:30 hashar: releases1001: sudo -u jenkins rm -fR /var/lib/jenkins/jobs/mediawiki-private-nightlies/workspace/BRANCH/REL1_??/mediawiki-snapshot-REL1_??-2018???? # T188080
  • 10:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:01 elukey: restart hhvm on mw1230
  • 09:54 elukey: restart hhvm on mw1286
  • 09:50 elukey: restart hhvm on mw1227
  • 08:05 marostegui: MariaDB and kernel upgrade on db1083
  • 07:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083, fully repool db1089 - T162807 (duration: 01m 12s)
  • 06:55 marostegui: Reboot db2093 to test /srv auto-mounting
  • 06:40 marostegui: Deploy schema change on db1090 - T187089 T185128 T153182
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 for alter table (duration: 01m 13s)
  • 05:58 mutante: puppetmaster1001 - signing puppet certs for kafkamon1001/kafkamon2001 - initial puppet runs, adding as role spare (T187901)
  • 05:40 mutante: ganeti1004 - initial startup of kafkamon1001 - booting to PXE, installing stretch (T187901)
  • 04:56 mutante: ganeti: ganeti2004 - creating new VM kafkamon2001 - vcpus=2,memory=8g,disk=60G, row_A codfw (T187901)
  • 04:53 mutante: ganeti: creating new VM kafkamon1001 - vcpus=2,memory=8g,disk=60G, row_A eqiad (T187901)
  • 02:46 demon@tin: Finished deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin (duration: 00m 10s)
  • 02:46 demon@tin: Started deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin
  • 02:10 demon@tin: Synchronized docroot/: mw.org docroot moving (duration: 01m 13s)
  • 01:45 eileen: update process control process-control config revision is 1605238b2e
  • 01:20 eileen: update civicrm revision changed from aa251f1a93 to a47eafcbad, config revision is c1787646bc
  • 01:19 demon@tin: Synchronized static/favicon/: smaller favicons (duration: 01m 12s)
  • 01:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: point mkwikt favicon to en version, dupe (duration: 01m 15s)
  • 01:08 demon@tin: Synchronized wmf-config/InitialiseSettings.php: rtl wikibooks logo (duration: 01m 13s)
  • 01:06 demon@tin: Synchronized static/favicon/wikibooks-rtl.ico: rtl wikibooks logo (duration: 01m 12s)
  • 00:52 demon@tin: Synchronized static/images/project-logos/: new project logos for urdu wikt (duration: 01m 13s)
  • 00:37 krinkle@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: Ia54cd7 - rm use of MW_LANG (duration: 01m 13s)

2018-02-22

  • 22:33 demon@tin: Synchronized php-1.31.0-wmf.22/includes/filerepo/file/LocalFile.php: Id5cdd8ec (duration: 01m 12s)
  • 22:32 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: Id5cdd8ec (duration: 01m 12s)
  • 22:30 demon@tin: Synchronized php-1.31.0-wmf.22/includes/Storage/: Id5cdd8ec (duration: 01m 13s)
  • 22:16 maxsem@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 12s)
  • 22:14 maxsem@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 14s)
  • 21:51 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: I9334d36e (duration: 01m 15s)
  • 21:37 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1004.eqiad.wmnet
  • 21:11 gehel: powercycling wdqs1004 (complete loss of network)
  • 20:39 demon@tin: Synchronized php-1.31.0-wmf.22/includes/libs/objectcache/WANObjectCache.php: betterer logging for cache ttl reduction, Iea029e78 (duration: 01m 13s)
  • 19:33 XioNoX: redirecting Facebook bots large source of traffic to codfw ( https://gerrit.wikimedia.org/r/#/c/413446/ )
  • 19:14 akosiaris: rolling restart of eqiad appservers. sudo cumin -b3 -s 30 'A:mw-eqiad' 'restart-hhvm' T188019
  • 19:12 twentyafterfour@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/413437/
  • 19:03 chasemp: baham:~# authdns-update
  • 19:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2073 (duration: 01m 12s)
  • 17:23 elukey: installed linux-perf-4.9 on phab1001 to experiment with perf tracing
  • 17:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1076 (duration: 01m 12s)
  • 17:05 XioNoX: rolling back "redirecting ns2 traffic to radon"
  • 17:02 ema: reboot eeden with new kernel 4.9.0-0.bpo.6
  • 16:58 XioNoX: redirecting ns2 traffic to radon
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 (duration: 01m 12s)
  • 16:28 ejegg: updated CiviCRM from b27e6a5019 to aa251f1a93
  • 16:26 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Use EventBus for refreshLinks in test wikis, file 2/2 - T185052 (duration: 01m 12s)
  • 16:25 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for refreshLinks in test wikis, file 1/2 - T185052 (duration: 01m 12s)
  • 16:23 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052 (duration: 00m 36s)
  • 16:23 ppchelko@tin: Started deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052
  • 16:22 mobrovac@tin: scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 16:13 jynus: tendril and dbtree database currently under maintenance
  • 16:04 ejegg: updated payments-wiki from fe311c2d26 to 1acfc4a9a0
  • 15:26 ema: finished upgrading cache_text@ulsfo to varnish 5
  • 15:24 elukey: manually removing from cp1008 and cache::misc old files related to the varnishkafka jumbo testing instance (after https://gerrit.wikimedia.org/r/413370)
  • 14:58 matthiasmullie: EU SWAT finished
  • 14:52 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable 3D file display (duration: 01m 12s)
  • 14:50 mlitn@tin: Synchronized php-1.31.0-wmf.21/extensions/3D/extension.json: Remove MMV dependency for 3D (duration: 01m 12s)
  • 14:41 ottomata: beginning migration of webrequest_misc from Kafka analytics to jumbo: T185136
  • 14:40 mlitn@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:38 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable 3D file display (duration: 01m 13s)
  • 14:32 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2171.codfw.wmnet
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Show HTML summaries on cswiki (T182321) (duration: 01m 13s)
  • 13:41 ema: bounce pybal on lvs1003 to try to establish missing etcd connections (zotero, thumbor, wdqs) https://phabricator.wikimedia.org/P6730
  • 13:30 moritzm: rebooting kubernetes1001
  • 13:21 ema: upgrade pybal on lvs1003 to 1.14.4
  • 12:42 _joe_: ended live-hacking on mwdebug1001 (T185078)
  • 12:24 _joe_: live-hacking ProductionServices.php on mwdebug1001 for testing (T185078)
  • 11:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and slowly repool db1076 (duration: 01m 12s)
  • 11:40 kartik@tin: Finished deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1 (duration: 03m 37s)
  • 11:39 akosiaris: purge ORES from scb hosts T168073 T171851
  • 11:37 kartik@tin: Started deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1
  • 11:19 _joe_: upgrading python-conftool on all cache hosts
  • 10:55 ema: upgrading python-conftool on cp5007
  • 10:51 _joe_: upgrading python-conftool on cp1008
  • 10:42 jynus: stop db2073 for maintenance
  • 10:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and fully repool db1104 (duration: 01m 13s)
  • 10:37 _joe_: benchmarking EtcdConfig failure scenarios on mwdebug1001, T185078
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 14s)
  • 10:18 ema: upgrade cache_text @ ulsfo to varnish 5
  • 10:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2073 for maintenance (duration: 01m 12s)
  • 10:08 moritzm: uploaded Linux 4.9.82-1~wmf1 for jessie-wikimedia to apt.wikimedia.org (retpoline-enabled kernel)
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low traffic and depool db1067 - T162807 (duration: 01m 12s)
  • 09:59 akosiaris: reboot kraz.wikimedia.org (irc.wikimedia.org)
  • 09:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 12s)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 - T186321 (duration: 01m 12s)
  • 09:20 marostegui: Stop MySQL on db1104 to switch its binlog to statement - T186321
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T186321 (duration: 01m 13s)
  • 09:19 moritzm: rebooting multatuli
  • 09:03 ema: eqiad LVSs: upgrade pybal to 1.14.4
  • 08:48 jynus: tendril and dbtree database currently under maintenance
  • 08:47 ema: codfw LVSs: upgrade pybal to 1.14.4
  • 08:35 marostegui: Stop tendril database (db1011) to copy it to db1115 - tendril will be offline while the copy is in progress - T184704
  • 08:32 ema: esams LVSs: upgrade pybal to 1.14.4
  • 08:24 ema: ulsfo LVSs: upgrade pybal to 1.14.4
  • 08:05 marostegui: Disable puppet on db1011 - T184704
  • 07:48 krinkle@tin: Synchronized wmf-config/FeaturedFeedsWMF.php: I73945d7d - minor clean-up (duration: 01m 13s)
  • 07:32 _joe_: starting tests on mwdebug1001 again
  • 07:32 marostegui: Deploy schema change on db1076 - T187089 T185128 T153182
  • 07:24 marostegui: Stop MySQL on db1076 for mariadb and kernel upgrade + alter table
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 01m 14s)
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 (duration: 01m 13s)
  • 06:21 marostegui: Stop puppet and mysql on db1011 to get ready to copy its data to db1115 - T184704
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 53s)
  • 01:05 anomie: Running cleanupBlocks.php on more wikis for T187834: alswiki bgwiki bhwiki cawiki dewiki elwiki eswiki frwiki hewiki hiwiki huwiki hywiki jawiki jawikibooks jawikinews jawikiquote jawikisource jawiktionary kawiki kowiki mswiki mswiktionary rowiki sourceswiki
  • 01:01 anomie: Running cleanupBlocks.php on mediawikiwiki for T187834
  • 00:46 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 03m 07s)
  • 00:43 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:41 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 00m 27s)
  • 00:40 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:25 tgr@tin: Synchronized wmf-config/CommonSettings-labs.php: T57420 enable loginOnly flag in beta (duration: 01m 12s)
  • 00:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9 (duration: 06m 05s)
  • 00:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9
  • 00:13 demon@tin: Synchronized php-1.31.0-wmf.22/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 13s)
  • 00:12 demon@tin: Synchronized php-1.31.0-wmf.21/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 21s)
  • 00:00 mutante: LDAP - added uid 'raz-shuty' to group 'wmde' (T187442)

2018-02-21

  • 21:50 elukey: restart hhvm on mw1224 - high load alarms
  • 21:46 elukey: restart hhvm on mw1235 - high load alarms
  • 21:44 elukey: restart hhvm on mw1233 - high load alarms
  • 21:39 awight@tin: Finished deploy [ores/deploy@addba9c]: T187914 on the scb* cluster (duration: 10m 02s)
  • 21:34 elukey: restart hhvm on mw1232 - high load alarms
  • 21:30 ppchelko@tin: Finished deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636 (duration: 15m 59s)
  • 21:30 elukey: restart hhvm on mw1229 - high load alarms
  • 21:29 awight@tin: Started deploy [ores/deploy@addba9c]: T187914 on the scb* cluster
  • 21:28 awight@tin: Finished deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster (duration: 13m 03s)
  • 21:27 elukey: restart hhvm on mw1227 - high load alarms
  • 21:23 elukey: restart hhvm on mw1221 - high load alarms
  • 21:15 awight@tin: Started deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster
  • 21:14 ppchelko@tin: Started deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636
  • 20:53 twentyafterfour: MediaWiki Train for 1.31.0-wmf.22 is blocked by T187942
  • 20:39 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:38 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:34 twentyafterfour: rolling back group1 to wmf.21
  • 20:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.22 (duration: 01m 08s)
  • 20:27 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.22
  • 20:10 mutante: phab2001 - testing phab restart cron
  • 19:34 ebernhardson@tin: Synchronized wmf-config/PoolCounterSettings.php: Increase pool counter workers for cirrus namespace lookup (duration: 01m 13s)
  • 19:24 ottomata: applying changes to kafkatee module, first rhenium then oxygen. will require manual config fixings
  • 18:59 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for Burmese Wiktionary T187882 (duration: 01m 06s)
  • 18:48 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespace localization for sdwiki T186943 (duration: 01m 13s)
  • 18:39 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Added new throttle rule for Wikipedia Women in Red editathon T187803 (duration: 01m 12s)
  • 18:37 chasemp: labsdb rm -fR /usr/local/lib/mediawiki-config && puppet agent --test
  • 18:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Topic namespace alias of zhwiki T187546 (duration: 01m 13s)
  • 18:12 _joe_: stopped testing on mwdebug1001 for SWAT window
  • 17:43 ema: eqsin LVSs: upgrade pybal to 1.14.4
  • 17:34 _joe_: resuming tests on mwdebug1001
  • 17:17 ema: eqiad LVSs: bounce pybal for labweb proxyfetch config changes
  • 17:12 ppchelko@tin: Finished deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437 (duration: 01m 23s)
  • 17:11 ppchelko@tin: Started deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437
  • 17:07 _joe_: finished testing on mwdebug1001 for swat
  • 16:56 oblivian@puppetmaster1001: conftool action : edit; selector: name=ReadOnly,scope=eqiad
  • 16:40 _joe_: testing various etcd failure scenarios on mwdebug1001, T185078
  • 16:39 ppchelko@tin: Finished deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437 (duration: 01m 33s)
  • 16:37 ppchelko@tin: Started deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437
  • 16:27 ema: lvs1010: restart pybal
  • 16:00 godog: restart rsyslogd on lithium and wezen - T136312
  • 15:50 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve private wiki thumbnails with Thumbor (T169144) (duration: 01m 12s)
  • 15:44 no_justification: pruned old 1.29.x and 1.30.x versions that somehow stuck around. Also 1.31.0-wmf.* cache/ directories for unused branches. T157030
  • 15:37 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve officewiki thumbnails with Thumbor (T169144) (duration: 01m 11s)
  • 15:27 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 11s)
  • 15:24 chasemp: reboot labtestservices2002
  • 15:24 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 12s)
  • 15:19 gilles: Thumbor private wiki support deployment
  • 15:08 zeljkof: EU SWAT finished
  • 15:08 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removing Mobile beta feedback link (T187712) (duration: 01m 12s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Page Previews EventLogging instrumentation (T185973) (duration: 01m 13s)
  • 14:52 _joe_: rolling restart another 4 api appservers
  • 14:49 oblivian@tin: Synchronized wmf-config: Serve configuration to mwdebug hosts via etcd (duration: 01m 16s)
  • 14:42 _joe_: restarted hhvm on mwdebug1001 too
  • 14:38 _joe_: restarting hhvm on mwdebug1002
  • 14:06 _joe_: restarting hhvm on misbehaving api appservers
  • 14:02 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T187870) (duration: 01m 13s)
  • 13:28 marostegui: Reboot db2092 for a kernel upgrade
  • 13:26 moritzm: powercycling ganeti1007
  • 12:43 _joe_: rolling restart of hhvm on api servers under high load
  • 12:38 elukey: restart hhvm on mw1234 - high load
  • 12:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify that db1067 is now s1 candidate master - T186321 (duration: 01m 13s)
  • 12:26 elukey: restart hhvm on mw1231 - high load, hhvm-dump-debug in /home/elukey/hhvm.6759.bt
  • 12:21 elukey: restart hhvm on mw1227 - high load, hhvm-dump-debug in /home/elukey/hhvm.23382.bt
  • 12:10 moritzm: uploading retpoline-enabled gcc-4.9 to apt.wikimedia.org / jessie-wikimedia to be able to use it on boron for building Linux (trying to adapt our pbuilder setup to also include security.debian.org ran into a few proxy-related problems and this is really a rare corner case anyway)
  • 12:02 ema: lvs5003: pybal upgraded to 1.14.4
  • 12:01 ema: pybal 1.14.4 uploaded to apt.w.o
  • 11:17 moritzm: installing db5.3 security updates
  • 11:12 jynus: cloning db2011 to db2044
  • 10:40 kart_: Finished running CLL preference migration script dry-run on terbium (T187677)
  • 10:33 marostegui: Reload haproxy on dbproxy1005 - T187722
  • 10:26 marostegui: Remove db2030 from tendril - T187768
  • 10:09 moritzm: installing openssh bugfix updates from jessie/stretch point releases
  • 10:01 kart_: Running CLL preference migration script dry-run on terbium (T187677)
  • 09:46 moritzm: installing dbus updates from stretch point release
  • 09:23 moritzm: installing sqlite security updates on stretch
  • 08:35 godog: roll-restart thumbor in codfw and eqiad to apply https://gerrit.wikimedia.org/r/c/412980
  • 08:20 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 07:20 marostegui: Stop Mariadb on db1108 for kernel upgrade
  • 06:36 marostegui: Deploy schema change on db1105:3312 - T187089 T185128 T153182
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 01m 17s)
  • 05:00 eileen: enable major gifts address job
  • 04:41 eileen: update civicrm revision changed from 43a7641597 to b27e6a5019, config revision is ef884a2c5d
  • 04:13 andrew@tin: Finished deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more (duration: 02m 45s)
  • 04:10 andrew@tin: Started deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more
  • 03:34 andrew@tin: Finished deploy [horizon/deploy@0e28f49]: updating branded graphics (duration: 02m 49s)
  • 03:31 andrew@tin: Started deploy [horizon/deploy@0e28f49]: updating branded graphics
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 06m 18s)
  • 02:15 no_justification: running `initSiteStats.php --update` for all wikis in medium.dblist. T187845
  • 02:01 no_justification: running `initSiteStats.php --update` for all wikis in small.dblist. T187845
  • 01:54 no_justification: WikipediaMobileFirefoxOS submodule references caused labsdb* (and related) puppet failures. They should recover now (self reverted my docroot changes). Filed T187850
  • 01:51 demon@tin: Synchronized docroot/: revert docroot improvements. some servers don't like improvements (duration: 01m 12s)
  • 01:36 demon@tin: Synchronized docroot/: Swapping wikimedia.org docroot for symlink (second try, old WPFirefoxMobileOS cleanup was still needed) (duration: 01m 12s)
  • 01:16 eileen: update civicrm revision changed from efba904b06 to 43a7641597, config revision is ef884a2c5d
  • 01:10 cwd: disabled process-control
  • 01:08 eileen: start outage to upgrade civicrm to 4.7.31
  • 00:56 mutante: gerrit2001 - restarted gerrit to test that gerrit:411397 and gerrit:411394 don't break anything - didn't touch cobalt right now to minimize affecting users and their logins
  • 00:43 thcipriani@tin: Synchronized wmf-config/abusefilter.php: SWAT: Allow CheckUsers and Stewards to access private data from the AbuseLog T160357 (duration: 01m 12s)
  • 00:29 thcipriani@tin: Synchronized php-1.31.0-wmf.21/includes/page/WikiPage.php: SWAT: site_stats: Unbreak counting newly created pages (duration: 01m 12s)
  • 00:26 thcipriani@tin: Synchronized php-1.31.0-wmf.21/resources/src/mediawiki/mediawiki.ForeignStructuredUpload.js: SWAT: Follow-up I0bb4ed7f7: Use correct "this" T187523 (duration: 01m 13s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable x-kill feature everywhere T186714 T184322 (duration: 01m 13s)

2018-02-20

  • 22:58 ejegg: restarted donations queue consumer
  • 22:26 ejegg: turned off donations queue consumer for timing test
  • 22:25 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/Thanks/modules/ext.thanks.revthank.js: T187757 (duration: 01m 14s)
  • 22:20 chasemp: T184209 create labs-instance-transport1-b-codfw
  • 22:06 eileen: update civicrm revision changed from 915a4419c8 to efba904b06, config revision is 8c7ce87207 (extended report update for regex)
  • 21:44 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.22
  • 21:39 no_justification: ran `namespaceDupes.php --wiki=enwikiversity` for T187660
  • 21:18 twentyafterfour@tin: Finished scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961 (duration: 46m 59s)
  • 20:34 ejegg: updated CiviCRM from 31115684f6 to 915a4419c8
  • 20:31 twentyafterfour@tin: Started scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961
  • 20:20 chasemp: labtestmetal2001:~# aptitude install linux-image-4.4.0-109-generic && aptitude install linux-image-extra-4.4.0-109-generic
  • 20:17 chasemp: labtestmetal mkfs -t xfs -i size=512 /dev/mapper/labtestmetal2001--vg-data
  • 20:16 andrew@tin: Finished deploy [horizon/deploy@b02c819]: trying to get a clean deploy (duration: 01m 54s)
  • 20:14 andrew@tin: Started deploy [horizon/deploy@b02c819]: trying to get a clean deploy
  • 20:10 andrew@tin: Finished deploy [horizon/deploy@b02c819]: a couple of bug fixes (duration: 02m 55s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@b02c819]: a couple of bug fixes
  • 20:07 andrew@tin: Started deploy [horizon/deploy@6a40f84]: a couple of bug fixes
  • 19:57 twentyafterfour: Cutting new branch wmf/1.31.0-wmf.22 - Deployment blockers: T183961
  • 19:45 demon@tin: Synchronized docroot/mediawiki/keys/: symlink magic (duration: 00m 56s)
  • 19:26 mobrovac@tin: Started restart [changeprop/deploy@5fdc03a]: (no justification provided)
  • 19:00 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2 (duration: 02m 47s)
  • 18:57 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 (duration: 14m 02s)
  • 18:43 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875
  • 18:34 arlolra@tin: Finished deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113 (duration: 10m 37s)
  • 18:23 arlolra@tin: Started deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113
  • 18:03 ppchelko@tin: Finished deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875 (duration: 16m 01s)
  • 17:52 moritzm: installing cups updates from jessie point release
  • 17:50 gilles: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --wiki=officewiki --backend=local-multiwrite --private
  • 17:47 ppchelko@tin: Started deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875
  • 17:41 andrew@tin: Finished deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts (duration: 00m 55s)
  • 17:40 andrew@tin: Started deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts
  • 17:11 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 1 (duration: 00m 56s)
  • 16:33 godog: roll-restart thumbor in codfw/eqiad to apply https://gerrit.wikimedia.org/r/412935
  • 16:25 moritzm: installing initramfs-tools update from jessie point release
  • 16:17 jynus: drop s3 from dbstore2001
  • 16:14 gilles@tin: Synchronized private/PrivateSettings.php: Add Thumbor secret to Swift configuration (duration: 00m 56s)
  • 15:37 oblivian@puppetmaster1001: conftool action : edit; selector: dc=esams,name=cp3033.esams.wmnet
  • 15:36 bblack: eqsin: restarting all varnish backends for storage changes (not in prod traffic flow, yet!)
  • 15:27 _joe_: upgrading conftool on swift proxies, thumbor
  • 15:25 _joe_: upgrading conftool on parsoid,wdqs
  • 15:23 _joe_: upgrading conftool on aqs, restbase, ores clusters
  • 15:19 _joe_: upgrading conftool on the mediawiki appservers
  • 15:15 _joe_: upgrading conftool on the maps cluster
  • 15:10 _joe_: installing python-conftool on puppetmasters, cumin masters
  • 14:53 godog: roll-restart thumbor after rollback
  • 14:50 volans: running puppet on thumbor1002 (was already logged in)
  • 14:40 zeljkof: EU SWAT finished
  • 14:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the sitename of newiki (T186952) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft namespace to hiwikiversity. (T187535) (duration: 00m 56s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoconfirmed at zhwikt (T187018) (duration: 00m 55s)
  • 14:10 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T187171) (duration: 00m 55s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: throttle: add new rule for Wikidata edit-a-thon (T187655) (duration: 00m 56s)
  • 13:29 marostegui: Upgrade kernel and reboot db1113 and db1114
  • 13:23 marostegui: Stop MySQL and reboot db1111 for kernel and mariadb upgrade
  • 13:17 marostegui: Stop MySQL and reboot db1112 for kernel and mariadb upgrade
  • 13:03 moritzm: installing libav security updates
  • 12:11 _joe_: upgrading conftool to 1.0.0~beta2 on scb*
  • 11:24 jynus: upgrading mariadb-client on neodymium and sarin
  • 11:09 marostegui: Deploy schema change on labtestweb2001 - T153182 T185128 T187089
  • 11:00 marostegui: Deploy schema change on s2 codfw master (db2035) with replication, this will generate lag on codfw - T187089 T185128 T153182
  • 11:00 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2037 and db2044 (duration: 00m 55s)
  • 10:58 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2037 and db2044 (duration: 00m 53s)
  • 10:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2037 and db2044 (duration: 00m 55s)
  • 10:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2030 from config - T187768 (duration: 00m 55s)
  • 10:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2030 from config - T187768 (duration: 00m 56s)
  • 10:13 volans: unified python-requests-mock packages in apt.wikimedia.org jessie-wikimedia to be 1.3.0-3~wmf1, removed binaries for 1.3.0-3
  • 09:49 marostegui: Deploy schema change on s6 primary master db1061 - T185128 T153182
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 55s)
  • 09:16 marostegui: Data checks for db2037 before removing it from s4 - T187722
  • 09:14 elukey: restart zookeeper on druid1001 (follower) to verify that the last changes are no-op
  • 09:12 marostegui: Deploy schema change on db1088 - T187089 T185128 T153182
  • 09:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 55s)
  • 09:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316 and db1085 (duration: 00m 55s)
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:02 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:01 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:56 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:51 oblivian@puppetmaster2001: conftool action : edit; selector: scope=common
  • 08:32 _joe_: uploading conftool 1.0.0~beta1 on stretch
  • 08:26 _joe_: uploading conftool 1.0.0~beta1 to jessie
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 08:09 godog: powercycle ganeti1006
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 01m 10s)
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 07:27 marostegui: Deploy schema change on db1096:3316 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 00m 56s)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1085 (duration: 00m 55s)
  • 06:58 marostegui: Upgrade mariadb and kernel on db1085
  • 06:26 marostegui: Deploy schema change on db1085 (with replication - this will generate lag on labs hosts) - T187089 T185128 T153182
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 00m 56s)
  • 04:56 krinkle@tin: Synchronized docroot/mediawiki/keys/: Ie26638ed0c - rm old 2009 keys file (duration: 00m 56s)
  • 04:27 krinkle@tin: Synchronized w/extract2.php: Ib6d77e863b - clean up MW_LANG indirection (duration: 00m 55s)
  • 03:40 krinkle@tin: Synchronized wmf-config/CommonSettings.php: Ie4c7879f8ac - Clean up TemplateSandboxEditNamespaces config (duration: 00m 57s)
  • 03:37 Krinkle: It seems 'scap pull' on mwdebug1002 is acting weird (prompt doesn't return until 3-5 minutes after last line of "Finished rsync common")
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 50s)

2018-02-19

  • 23:21 eileen: re-enable omnirecipient jobs - process-control config revision is 8c7ce87207
  • 22:03 volans: uploaded cumin_3.0.1-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 20:03 volans: uploaded cumin_3.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 19:29 volans: uploaded python3-requests-mock, python-requests-mock and python-requests-mock-doc for version 1.3.0-3~wmf1 to apt.wikimedia.org jessie-wikimedia
  • 18:53 volans: disabled all notifications on Icinga for db2030
  • 18:04 volans: uploaded clustershell_1.8-1~wmf1_all.deb, python-clustershell_1.8-1~wmf1_all.deb and python3-clustershell_1.8-1~wmf1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:04 elukey@tin: Finished deploy [eventlogging/analytics@8bebdf7]: (no justification provided) (duration: 00m 05s)
  • 17:04 elukey@tin: Started deploy [eventlogging/analytics@8bebdf7]: (no justification provided)
  • 16:29 _joe_: uploading conftool 1.0.0beta1 to reprepro for jessie
  • 16:22 andrew@tin: Finished deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002 (duration: 00m 10s)
  • 16:22 andrew@tin: Started deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002
  • 16:11 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 22s)
  • 16:10 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 16:10 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 17s)
  • 16:09 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 14:59 jynus: testing new dbproxy1010 configuration locally to pool labsdb1010 for analytics
  • 13:44 godog: roll-restart prometheus after retention period bump
  • 13:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 (duration: 00m 55s)
  • 13:19 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 13:16 ema: upgrade cache_text@eqsin to varnish 5
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1098 s6 and s7 (duration: 00m 55s)
  • 12:27 marostegui: Deploy schema change on db1063 - T187089 T185128 T153182
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 for alter table (duration: 00m 55s)
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 (duration: 00m 55s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 56s)
  • 11:07 jdrewniak@tin: Synchronized portals: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:06 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 56s)
  • 10:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 55s)
  • 10:35 marostegui: Deploy schema change on db1093 - T187089 T185128 T153182
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 56s)
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1098 s6 and s7 (duration: 00m 56s)
  • 10:10 marostegui: Upgrade mariadb and kernel on db1098
  • 09:59 marostegui: Enable GTID on dbstore2002:3313 and dbstore2001:3316
  • 09:57 marostegui: Enable GTID on dbstore2002 and dbstore2001 for x1
  • 09:55 jynus: reenable gtid replication on db1053 and db2042
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1260.eqiad.wmnet
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1259.eqiad.wmnet
  • 09:43 marostegui: Upgrade mariadb and kernel on db2033
  • 09:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1090 (duration: 00m 55s)
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 - T162807 (duration: 00m 55s)
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for mariadb and kernel upgrade (duration: 00m 55s)
  • 08:49 marostegui: Deploy schema change on db1098:3316 - T187089 T185128 T153182
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 55s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1090 (duration: 00m 55s)
  • 08:11 godog: repool mw1227 - T149287
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2034 to x1 codfw master - T184888 (duration: 00m 56s)
  • 07:58 moritzm: installing werkzeug security updates on trusty
  • 07:42 marostegui: Change topology on x1 codfw - T184888
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1090 (duration: 00m 55s)
  • 07:01 marostegui: Reboot db1090 for kernel upgrade, mariadb upgrade, socket path location upgrade
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 55s)
  • 06:44 marostegui: Stop MySQL on db1089 to update its socket path
  • 06:42 marostegui: Deploy schema change on s6 codfw master (db2039), this will generate lag on codfw - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1105 - T162807 (duration: 00m 56s)
  • 05:29 andrew@tin: Finished deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes (duration: 03m 14s)
  • 05:26 andrew@tin: Started deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 10m 59s)

2018-02-18

  • 15:49 _joe_: rolling restart (1 at a time, staggered by 2 minutes) of 18 api appservers in eqiad

2018-02-17

  • 17:33 twentyafterfour: restarting apache on phab1001 to clear deadlocked workers. refs T182832
  • 03:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 [keeping static files] (duration: 01m 17s)
  • 03:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 (duration: 04m 32s)

2018-02-16

  • 21:12 hashar: Upgraded Zuul to https://gerrit.wikimedia.org/r/#/c/411322/3 | T187567
  • 20:40 andrew@tin: Finished deploy [horizon/deploy@efcba2b]: sudo dashboard update (duration: 01m 16s)
  • 20:39 andrew@tin: Started deploy [horizon/deploy@efcba2b]: sudo dashboard update
  • 20:11 andrew@tin: Finished deploy [horizon/deploy@1fdd122]: two more small fixes (duration: 01m 21s)
  • 20:10 andrew@tin: Started deploy [horizon/deploy@1fdd122]: two more small fixes
  • 19:54 andrew@tin: Finished deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix (duration: 03m 12s)
  • 19:51 andrew@tin: Started deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix
  • 18:34 hashar: upgraded zuul
  • 16:21 andrew@tin: Finished deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements (duration: 08m 00s)
  • 16:13 andrew@tin: Started deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements
  • 16:06 cmjohnson1: labstore1006 and labstore1007 down for rack relocation
  • 16:03 andrew@tin: Finished deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints (duration: 02m 18s)
  • 16:00 andrew@tin: Started deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints
  • 15:40 andrew@tin: Finished deploy [horizon/deploy@29f9afb]: second attempt at ocata branch (duration: 03m 22s)
  • 15:37 andrew@tin: Started deploy [horizon/deploy@29f9afb]: second attempt at ocata branch
  • 15:29 andrew@tin: Finished deploy [horizon/deploy@58d2718]: first attempt at ocata branch (duration: 01m 28s)
  • 15:28 andrew@tin: Started deploy [horizon/deploy@58d2718]: first attempt at ocata branch
  • 15:27 godog: shut ms-be1018 for bbu swap - T186988
  • 15:16 akosiaris: run T181121#3978654 oneliner once more on sca1004, this time the VM has no DRBD
  • 15:14 akosiaris: poweroff sca1004, switch from DRBD to plain disk template T181121
  • 14:15 akosiaris: doing more IO stress tests on ganeti1005. T181121. Seems like we can reproduce
  • 14:06 chasemp: T184209 initial setup of labs-instances2-b-codfw and hosts
  • 13:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1094 (duration: 00m 56s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 and db1067 - T162807 (duration: 00m 55s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 jynus: reload dbproxy1008 configuration
  • 12:44 jynus: reload dbproxy1003 configuration
  • 12:37 ema: cp3049: restart varnish-fe to clear 'child restarted' alert
  • 12:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1094 (duration: 00m 56s)
  • 12:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 12:17 marostegui: Stop MySQL on db1094 for mariadb upgrade, kernel upgrade and socket location upgrade
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 (duration: 00m 56s)
  • 12:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 00m 56s)
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:35 jynus: stopping mysql on db1043, db2012 for cloning data away
  • 11:33 jynus: changing socket location on phabricator db hosts T148507
  • 11:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:28 ema: cp3036: restart varnish-fe to clear 'child restarted' alert
  • 11:28 hashar: Switching operations/mediawiki-config job for composer to Docker | https://gerrit.wikimedia.org/r/#/c/411206/
  • 11:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 00m 56s)
  • 11:09 elukey: restart nfacctd on rhenium to see if it picks up the new kafka topic config (3 partitions)
  • 11:06 marostegui: Stop MySQL on db1093 for mariadb and kernel upgrade, also update socket path
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 00m 56s)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1053 (duration: 00m 56s)
  • 09:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1053 (duration: 00m 56s)
  • 08:48 akosiaris: doing IO stress tests on ganeti1005. T181121
  • 08:34 akosiaris: manually allocate logstash1008 on ganeti1005 to undo the manual override of sensible allocation rules by ganeti
  • 08:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 (duration: 00m 57s)
  • 08:14 akosiaris: powercycle ganeti1006 T181121
  • 08:13 akosiaris: powercycle ganeti1006
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 59s)
  • 06:41 moritzm: installing quagga security updates
  • 06:35 marostegui: Deploy schema change on s5 primary master db1070 - T185128 T153182
  • 00:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.21/extensions/ProofreadPage/modules/page/ext.proofreadpage.page.edit.js: SWAT: T187454 fix text selection on #wpTextbox1 (duration: 00m 58s)

2018-02-15

  • 23:43 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 56s)
  • 22:54 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/MassMessage/includes/MassMessage.php: fix use statement, T187510 (duration: 00m 57s)
  • 21:50 ejegg: updated CiviCRM from 61acc9175e to 31115684f6
  • 20:22 twentyafterfour: 1.31.0-wmf.21 deployed: no apparent change in fatalmonitor error rate. refs T183960
  • 20:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.21
  • 20:11 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/TwoColConflict/includes/TwoColConflictHooks.php: sync https://gerrit.wikimedia.org/r/#/c/410809/ (duration: 01m 13s)
  • 20:09 twentyafterfour: syncing a patch before deploying 1.31.0-wmf.21 to all wikis.
  • 19:55 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Follow-up 77be427a1: Enable the Beta Feature on all wikis T185708 (duration: 01m 12s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Portal and Portal talk namespace alias of zhwiki T184866 (duration: 01m 13s)
  • 19:13 thcipriani@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Set SPARQL endpoint for category search T184840 (duration: 01m 12s)
  • 18:42 arlolra@tin: Finished deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195 (duration: 08m 34s)
  • 18:33 arlolra@tin: Started deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195
  • 18:11 bsitzmann@tin: Finished deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475) (duration: 05m 54s)
  • 18:06 bsitzmann@tin: Started deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475)
  • 17:24 foks: removed 2FA from User:Lea Lacroix (WMDE)
  • 17:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 (duration: 01m 12s)
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1097:3315, db1089, db1066 (duration: 01m 12s)
  • 16:32 andrew@tin: Finished deploy [horizon/deploy@4e7ccc5]: lots of updates (duration: 03m 13s)
  • 16:29 andrew@tin: Started deploy [horizon/deploy@4e7ccc5]: lots of updates
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3315 (duration: 01m 12s)
  • 15:34 ema: upgrade upload @ eqsin to varnish 5
  • 15:27 marostegui: Deploy schema change on db1051 - T187089 T185128 T153182
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051, fully repool db1097:3314, increase weight for db1097:3315 (duration: 01m 13s)
  • 15:15 zeljkof: EU SWAT finished
  • 15:14 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Log accessing private abusefilter details (T160357) (duration: 01m 12s)
  • 14:58 moritzm: installing erlang security updates on labcontrol1001
  • 14:53 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable the visual diff beta feature (T185708) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.21/includes/Revision.php: SWAT: Log the reason why revision->getContent() returns null (T184670) (duration: 01m 12s)
  • 14:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable log channel T184670 (T184670) (duration: 01m 12s)
  • 14:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2042 (duration: 01m 11s)
  • 14:20 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2042 (duration: 01m 12s)
  • 14:09 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add RevisionStore to wmgMonologChannels: (duration: 01m 13s)
  • 12:01 addshore: script run for T185738 done
  • 11:59 milimetric@tin: Finished deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars (duration: 09m 33s)
  • 11:58 addshore: addshore@terbium:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki elwiktionary --batchsize 1000 # T185738
  • 11:49 milimetric@tin: Started deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars
  • 10:58 marostegui: Stop replication in sync db1089 and db1066
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 and slowly repool db1097:3315 (duration: 01m 12s)
  • 10:38 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 fully (duration: 01m 12s)
  • 10:28 marostegui: Upgrade mariadb on db1066
  • 10:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 (duration: 01m 12s)
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1097:3314 (duration: 01m 12s)
  • 09:48 marostegui: Deploy schema change on db1097:3315 - T187089 T185128 T153182
  • 09:39 marostegui: Upgrade kernel and mariadb on db1097
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 for s4 and s5 (duration: 01m 12s)
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 (duration: 01m 12s)
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1082 (duration: 01m 12s)
  • 08:54 moritzm: installing erlang security updates on labtestcontrol*
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1082 (duration: 01m 13s)
  • 08:18 marostegui: Upgrade kernel + mariadb on db1082 (sanitarium master in s5)
  • 07:55 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1066 - T162807 (duration: 01m 12s)
  • 07:39 marostegui: Deploy schema change on db1082 (sanitarium master) with replication, this will generate lag on labs - T187089 T185128 T153182
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 13s)
  • 07:35 moritzm: installing libvorbis security updates on stretch
  • 07:30 twentyafterfour: phabricator upgrade finished. phd is back online.
  • 07:27 twentyafterfour: phabricator database migration finished
  • 07:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1110 (duration: 01m 12s)
  • 07:09 jynus: reimage dbproxy1003 to stretch
  • 07:04 twentyafterfour: Applying patch "phabricator:20180215.maniphest.02.populate.php" to host "m3-master.eqiad.wmnet"...
  • 07:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1110 (duration: 01m 13s)
  • 06:57 twentyafterfour: apache restarted, update appears to be successful
  • 06:57 twentyafterfour: phabricator database migrations applied
  • 06:50 twentyafterfour: shutting down apache on phab1001 to deploy update, downtime should be only a couple of minutes
  • 06:49 twentyafterfour: starting phabricator upgrade tagged release/2018-02-15/1
  • 06:45 twentyafterfour: restarted apache on phab1001 and reset cluster.read-only to false
  • 06:44 jynus: set db1059 in read-write
  • 06:38 jynus: merging dns update for phabricator db
  • 06:35 jynus: set db1043 as read only
  • 06:34 twentyafterfour: set cluster.read-only in phabricator
  • 06:33 jynus: about to set phabricator.wikimedia.org as read only
  • 06:28 jynus: scheduling downtime for phabricator on phab1001
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1110 (duration: 01m 13s)
  • 06:06 marostegui: Upgrade mysql on db1110
  • 05:57 jynus: restarting dbproxy1008 for kernel upgrade
  • 05:43 andrew@tin: Finished deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon (duration: 03m 06s)
  • 05:40 andrew@tin: Started deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 07m 25s)
  • 02:01 mutante: phab1001 - restarted apache to fix server status page
  • 01:27 twentyafterfour: restarting apache2 on phab1001 to free deadlocked php processes.
  • 01:03 twentyafterfour: using the current phabricator maintenance window to deploy https://gerrit.wikimedia.org/r/#/c/410626/
  • 01:03 twentyafterfour: the scheduled phabricator upgrade is delayed until 06:00 UTC Thursday because of large database migrations. Doing the upgrade at a time when DBAs are available to assist.
  • 00:52 maxsem@tin: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 14s)
  • 00:49 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 13s)

2018-02-14

  • 23:39 AaronSchulz: Running initSiteStats.php on s3 for T186947
  • 22:04 aaron@tin: Synchronized php-1.31.0-wmf.20/includes/SiteStats.php: f549559dc0 (duration: 01m 13s)
  • 21:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with full weight (duration: 01m 13s)
  • 21:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f (duration: 06m 01s)
  • 21:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f
  • 21:30 arlolra@tin: Finished deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed (duration: 15m 12s)
  • 21:15 arlolra@tin: Started deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed
  • 21:00 ema: upgrade cp1099 to varnish 5 (last upload@eqiad host)
  • 20:54 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice: Sync CentralNotice again after proper rebase (duration: 01m 14s)
  • 20:43 ema: upgrade cp1074 to varnish 5
  • 20:42 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice/: sync https://gerrit.wikimedia.org/r/#/c/410346/ for Ejegg (duration: 01m 15s)
  • 20:40 twentyafterfour: Group1 wikis are now running MediaWiki 1.31.0-wmf.21 - still no blockers on T183960
  • 20:38 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:37 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:33 ema: upgrade cp1073 to varnish 5
  • 20:05 ema: upgrade cp1072 to varnish 5
  • 19:44 ema: upgrade cp1071 to varnish 5
  • 19:25 XioNoX: enabling netflow on cr1-eqiad
  • 19:24 no_justification: ran namespaceDupes.php --fix for hiwiki
  • 19:24 demon@tin: Synchronized wmf-config/InitialiseSettings.php: portal aliases for hiwiki (duration: 01m 13s)
  • 19:22 ema: upgrade cp1064 to varnish 5
  • 19:20 no_justification: running updateCollation.php on nowikimedia
  • 19:19 demon@tin: Synchronized wmf-config/InitialiseSettings.php: nowikimedia collation, T185630 (duration: 01m 13s)
  • 19:16 andrewbogott: rebooting labvirt1019 so I can have a look at the raid setup, for T172538
  • 19:14 no_justification: ran namespaceDupes.php --fix on wawiktionary
  • 19:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: wawiktionary namespaces, T185289 (duration: 01m 13s)
  • 19:11 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Revert prior, busted the canaries (duration: 01m 15s)
  • 19:08 demon@tin: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 19:06 demon@tin: rebuilt and synchronized wikiversions files: namespace aliases for zhwiki, T184866
  • 19:00 ema: upgrade cp1063 to varnish 5
  • 17:43 ema: upgrade cp1062 to varnish 5
  • 17:42 moritzm: updated jenkins packages on apt.wikimedia.org for stretch (thirdparty/ci) and jessie (thirdparty) to 2.89.4
  • 17:39 hashar: CI Jenkins seems all happy following the upgrade ^o^
  • 17:34 moritzm: updating remaining python-cryptography updates from jessie point release
  • 17:32 hashar: Upgrading Jenkins on contint1001 / contint2001
  • 17:30 godog: roll-restart ms-fe to pick up https://gerrit.wikimedia.org/r/c/410199/
  • 17:22 moritzm: installing uwsgi jessie update on graphite*
  • 17:20 godog: roll-upgrade thumbor 1.14 in eqiad/codfw
  • 16:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 09s)
  • 16:56 ema: upgrade cp1050 to varnish 5
  • 16:50 marostegui: Deploy schema change on db1110 - T187089 T185128 T153182
  • 16:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 01m 12s)
  • 16:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with low weight (duration: 01m 12s)
  • 16:19 ema: upgrade cp1049 to varnish 5
  • 15:59 jynus: upgrade and restart db1088
  • 15:52 moritzm: rolling out debdeploy 0.0.99.2 (cumin masters already upgraded for a while, just synching the clients)
  • 15:51 andrewbogott: powering down labvirt1008 so chris can re-apply thermal paste
  • 15:45 moritzm: installing libgcrypt security updates on trusty
  • 15:31 zeljkof: EU SWAT finished
  • 15:24 godog: roll-upgrade thumbor to 1.13 - T187159 T179954 T187088
  • 15:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add suppressredirect to autoconfirmed at zhwikt" (T187018) (duration: 01m 13s)
  • 15:18 ema: upgrade cp1048 to varnish 5
  • 14:47 moritzm: installing PHP security updates
  • 14:44 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable flood flag at zhwikt (T187018) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Require 7 days & 10 edits for autoconfirmed at zhwiktionary (T187018) (duration: 01m 13s)
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity (T185347) (duration: 01m 12s)
  • 14:21 akosiaris: reboot ganeti1008 for kernel upgrade T181121
  • 14:14 zfilipin@tin: Synchronized wmf-config/reverse-proxy.php: SWAT: wgSquidServersNoPurge: add eqsin, remove dead IP (T156027) (duration: 01m 12s)
  • 14:11 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/mmv.3d.head.js: Fix 3D badge (duration: 01m 12s)
  • 14:10 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge and Webkit thumb load detection (duration: 01m 13s)
  • 13:44 elukey: rollback java 8 upgrade for archiva - issues with Analytics builds
  • 13:34 elukey: installed openjdk-8 on meitnerium, manually upgraded java-update-alternatives to java8, restarted archiva
  • 13:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 original weight (duration: 01m 12s)
  • 13:16 jynus: stop slave and rolling schema change on db1059 m3 replica
  • 13:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 (duration: 01m 12s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1106 (duration: 01m 12s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1106 (duration: 01m 12s)
  • 12:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1106 (duration: 01m 12s)
  • 11:25 marostegui: Deploy schema change on db1106 - T187089 T185128 T153182
  • 11:16 marostegui: Stop MySQL and reboot db1106 for mysql and kernel upgrade
  • 11:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 12s)
  • 11:14 filippo@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1002 after disk replacement (duration: 01m 12s)
  • 11:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 (duration: 01m 12s)
  • 10:46 jynus: dropping test databases from m5 T186585
  • 10:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 10:28 moritzm: installing libvorbis security updates on trusty systems
  • 10:13 marostegui: Deploy schema change on db1100 - T187089 T185128 T153182
  • 10:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 01m 12s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316,3315 (duration: 01m 12s)
  • 09:50 akosiaris: set standard weight for all ores* hosts
  • 09:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool slowly db1096:3316,3315 (duration: 01m 13s)
  • 09:08 marostegui: Deploy schema change on s5 dbstore1002 https://phabricator.wikimedia.org/T187089 https://phabricator.wikimedia.org/T185128 https://phabricator.wikimedia.org/T153182
  • 09:02 marostegui: Stop MySQL on db1096:3315 and 3316 for mysql+kernel upgrade
  • 08:45 jynus@tin: Synchronized wmf-config/db-eqiad.php: Rebalance s8 (duration: 01m 13s)
  • 08:38 akosiaris: pybal restart on lvs1003 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:29 akosiaris: pybal restart on lvs1006, lvs1009, lvs1012 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:08 _joe_: powercycled ganeti1008
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 (duration: 01m 12s)
  • 06:44 marostegui: Stop replication in sync db1089 and db1067 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 12s)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1096:3315 for alter table (duration: 01m 13s)
  • 06:30 marostegui: Deploy schema change on db1096:3315 - T187089 T185128 T153182
  • 05:55 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 03m 13s)
  • 05:52 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 05:52 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 00m 20s)
  • 05:51 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 39s)
  • 02:02 demon@tin: Synchronized fonts/: removing executable bits, no-op (duration: 01m 15s)
  • 01:33 demon@tin: Finished deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now) (duration: 00m 11s)
  • 01:32 demon@tin: Started deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now)
  • 00:25 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add uploader user group to mznwiki and make it automagically added T187187 (duration: 01m 12s)
  • 00:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable xkill on top wikis that use x aspect T187265 (duration: 01m 14s)

2018-02-13

  • 21:19 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.21
  • 21:07 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 49s)
  • 21:07 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 20:43 twentyafterfour@tin: Finished scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis (duration: 31m 01s)
  • 20:41 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 00m 21s)
  • 20:41 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:26 jynus: upgrading labsdb1010 database - proxies will complain for some time
  • 20:18 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 01m 17s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:12 twentyafterfour@tin: Started scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis
  • 20:11 twentyafterfour: Currently there are no blockers listed on T183960 and the train is leaving the station.
  • 20:05 twentyafterfour: MediaWiki Train 1.31.0-wmf.21 branched, prepped and patched | Changelog uploaded to https://www.mediawiki.org/wiki/MediaWiki_1.31/wmf.21/Changelog | Blockers: T183960
  • 19:03 jynus: upgrade and restart db2042
  • 18:53 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 (duration: 01m 58s)
  • 18:25 elukey: Analytics Hadoop cluster upgrade to Java 8 about to start - complete cluster shutdown is needed - T166248
  • 18:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc (duration: 05m 28s)
  • 18:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc
  • 18:00 twentyafterfour: Preparing to cut new MediaWiki branch wmf/1.31.0-wmf.21 - report deployment blockers for this branch in phabricator: T183960
  • 17:54 godog: repool mw1256 after disk swap - T186535
  • 17:20 demon@tin: Synchronized README: forcing git config sync, setting core.sharedRepository=group, T187076 (duration: 01m 12s)
  • 17:13 cmjohnson1: sorry snapshot1001 is going down for rack relocation
  • 17:12 cmjohnson1: stat1001 going down for rack relocation
  • 17:04 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 17:03 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 16:36 demon@tin: Synchronized scap/plugins/clean.py: no-op, consistency (duration: 00m 55s)
  • 16:23 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 0 (duration: 00m 56s)
  • 16:17 cmjohnson1: replacing disk poolcounter1002
  • 15:35 marostegui: Deploy schema change on s5 codfw master (db2052), this will generate lag on codfw - T187089 T185128 T153182
  • 15:30 bblack: deploying changes to URL-encoding normalization on caches - https://gerrit.wikimedia.org/r/407488
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 55s)
  • 15:01 zeljkof: EU SWAT finished
  • 14:59 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 55s)
  • 14:58 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 54s)
  • 14:37 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change logos for sdwiki (T185865) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized php-1.31.0-wmf.20/extensions/ContentTranslation/extension.json: SWAT: Add ext.cx.widgets.overlay dependency to template editor (T187119) (duration: 00m 55s)
  • 14:22 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for sdwiki (T184521) (duration: 00m 57s)
  • 13:51 marostegui: Reboot db2066 to pick up new kernel
  • 13:50 marostegui: Deploy schema change on dbstore2001 - T187089 T185128 T153182
  • 12:51 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 56s)
  • 12:20 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:19 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:07 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge (duration: 00m 56s)
  • 11:57 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 55s)
  • 11:56 marostegui: Deploy schema change on db2066 - T187089 T185128 T153182
  • 11:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 and db2059 (duration: 00m 55s)
  • 11:47 jynus: reenabling puppet on all eqiad databases
  • 11:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099 (duration: 00m 56s)
  • 11:37 marostegui: Stop MySQL on db2059 and db2038 for kernel upgrade
  • 11:29 ema: lvs1003: restart pybal to reconnect to etcd
  • 11:27 ema: lvs1006/1010: restart pybal to reconnect to etcd
  • 11:26 ema: lvs4005: restart pybal to reconnect to etcd
  • 11:23 ema: esams primary LVSs: restart pybal to reconnect to etcd
  • 11:21 ema: esams secondary LVSs: restart pybal to properly reconnect to etcd
  • 11:14 ema: repool cp3007
  • 11:13 ema: depool cp3007 to test pybal's behavior on lvs3002
  • 10:51 filippo@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 for disk replacement (duration: 00m 56s)
  • 10:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099 (duration: 00m 54s)
  • 10:08 godog: roll-restart ms-fe in codfw/eqiad after applying https://gerrit.wikimedia.org/r/c/409942/
  • 10:03 ema: restart pybal on lvs2003
  • 09:58 ema: restart pybal on lvs2006
  • 09:52 filippo@neodymium: conftool action : set/pooled=no; selector: name=ms-fe2005.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2038 and db2059 (duration: 00m 55s)
  • 09:32 marostegui: Stop mysql on db2075 for mysql and kernel upgrade
  • 09:30 marostegui: Stop replication in sync on db1089 and dbstore1002 - T162807
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 09:22 elukey: powercycle analytics1062 - not reachable via ssh, frozen via serial console
  • 09:22 jynus: disabling puppet on all eqiad databases
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 09:20 marostegui: Stop replication in sync on db1089 and db1065 - T162807
  • 09:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2084:3315, depool db2075 (duration: 00m 55s)
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 08:52 marostegui: Stop replication in sync on db1089 and db1099:3311 - T162807
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089, db1099 - T162807 (duration: 00m 55s)
  • 08:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 (duration: 00m 56s)
  • 08:37 hashar: tin.eqiad.wmnet: removing live hack in /srv/mediawiki-staging/scap/plugins/clean.py | T187160
  • 08:32 moritzm: installing wavpack security updates
  • 08:09 moritzm: installing exim security updates on trusty hosts
  • 07:02 marostegui: Deploy schema change on s5 db2089 db2084 db2075 db2039 db2059 - T187089
  • 06:28 marostegui: reload haproxy on dbproxy1005
  • 05:10 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp50(0[12345789]|1[12]).eqsin.wmnet
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 29s)
  • 00:24 cwd: re-enabled p-c
  • 00:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/modules/ve-mw/ui/pages/: T187112 (duration: 00m 56s)
  • 00:10 cwd: disabled p-c jobs for reboot
  • 00:04 demon@tin: Synchronized wmf-config/: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 57s)
  • 00:03 demon@tin: Synchronized wmf-config/InitialiseSettings.php: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 56s)

2018-02-12

  • 23:47 demon@tin: Finished deploy [gerrit/gerrit@6adde70]: reviewers plugin (duration: 00m 12s)
  • 23:46 demon@tin: Started deploy [gerrit/gerrit@6adde70]: reviewers plugin
  • 23:32 mutante: terbium,wasat: touch /var/log/mediawwiki/purge_abusefilter.log ; set owner/permissions like other logfiles
  • 23:13 elukey: manual restart of Yarn Node Managers on analytics1058/31 (failed due to root partition filled up for the issue logged before)
  • 23:09 elukey: cleaned up tmp files on all analytics hadoop worker nodes, job filling up tmp
  • 21:27 andrew@tin: Finished deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content (duration: 03m 18s)
  • 21:24 andrew@tin: Started deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content
  • 21:06 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5 (duration: 05m 46s)
  • 21:00 mholloway-shell@tin: Started deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5
  • 20:21 andrew@tin: Finished deploy [horizon/deploy@c009388]: updating puppet dashboard (duration: 03m 22s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c009388]: updating puppet dashboard
  • 20:13 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/UUID.php: T186909 (duration: 00m 56s)
  • 20:08 andrew@tin: Finished deploy [horizon/deploy@cba66d2]: more submodule tinkering (duration: 01m 15s)
  • 20:07 ppchelko@tin: Finished deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API (duration: 15m 10s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@cba66d2]: more submodule tinkering
  • 20:01 andrew@tin: Finished deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks (duration: 01m 02s)
  • 20:00 andrew@tin: Started deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks
  • 19:58 andrew@tin: Finished deploy [horizon/deploy@9d73005]: fixes to post-install checks (duration: 01m 01s)
  • 19:57 andrew@tin: Started deploy [horizon/deploy@9d73005]: fixes to post-install checks
  • 19:52 ppchelko@tin: Started deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API
  • 19:50 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 45s)
  • 19:50 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:48 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 03s)
  • 19:47 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:44 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes (duration: 01m 06s)
  • 19:43 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes
  • 19:17 niharika29@tin: Synchronized wmf-config/filebackend.php: Proxy public wiki thumb.php requests through Thumbor T169144 (duration: 00m 55s)
  • 19:13 andrew@tin: Finished deploy [horizon/deploy@01021b4]: trying another force (duration: 00m 17s)
  • 19:13 andrew@tin: Started deploy [horizon/deploy@01021b4]: trying another force
  • 19:12 niharika29@tin: Synchronized php-1.31.0-wmf.20/extensions/PageAssessments/: Fix 500 error with PageAssessments API T185037 (duration: 00m 56s)
  • 19:07 niharika29@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Stop PHP errors from going to the hhvm channel T45086 (duration: 00m 56s)
  • 18:58 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 07m 39s)
  • 18:50 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:48 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 14s)
  • 18:35 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:34 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 06m 47s)
  • 18:27 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:23 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: ores1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 18:12 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 30s)
  • 18:09 gehel@tin: Finished deploy [wdqs/wdqs@b6bd483]: new WDQS GUI (duration: 01m 53s)
  • 18:07 gehel@tin: Started deploy [wdqs/wdqs@b6bd483]: new WDQS GUI
  • 18:00 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:47 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 13m 18s)
  • 17:45 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards again (duration: 00m 17s)
  • 17:45 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards again
  • 17:34 gilles: added thumborUrl to PrivateSettings.php on labs, in preparation for https://gerrit.wikimedia.org/r/#/c/407611/
  • 17:34 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:21 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 [keeping static files] (duration: 02m 08s)
  • 17:18 elukey: home dirs on stat1004 moved to /srv/home (/home symlinks to it)
  • 17:10 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards (duration: 00m 54s)
  • 17:09 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards
  • 16:56 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns5001.wikimedia.org
  • 16:52 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/ApiVisualEditor.php: T186934 (duration: 00m 57s)
  • 16:27 andrew@tin: Finished deploy [horizon/deploy@4d1bdeb]: updating requirements.txt (duration: 01m 04s)
  • 16:26 andrew@tin: Started deploy [horizon/deploy@4d1bdeb]: updating requirements.txt
  • 16:16 andrew@tin: Finished deploy [horizon/deploy@de72527]: scap debugging run (duration: 00m 24s)
  • 16:16 andrew@tin: Started deploy [horizon/deploy@de72527]: scap debugging run
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 15:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 55s)
  • 15:28 marostegui: Stop replication in sync on db1089 and db1105:3311 - T162807
  • 15:23 moritzm: installing libtasn security updates
  • 15:02 reedy@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/maintenance/: Fix maintenance scripts (duration: 00m 56s)
  • 15:01 godog: roll-upgrade thumbor to 1.12 - T186500 T186594 T186492
  • 14:54 elukey: upload prometheus-burrow-exporter 0.0.4 on jessie/stretch-wikimedia
  • 14:51 ottomata: emitting IP field from varnishkafka-eventlogging instance T186833
  • 14:51 zeljkof: EU SWAT finished
  • 14:47 filippo@neodymium: conftool action : set/pooled=no; selector: name=mw1227.eqiad.wmnet
  • 14:44 addshore@tin: Finished scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description) (duration: 19m 56s)
  • 14:44 andrew@tin: Finished deploy [horizon/deploy@de72527]: just checking that this still doesn't work (duration: 00m 04s)
  • 14:44 andrew@tin: Started deploy [horizon/deploy@de72527]: just checking that this still doesn't work
  • 14:38 moritzm: uploading cassandra 3.11.0-wmf5 to component/cassandra311 for stretch-wikimedia/apt.wikimedia.org (T186619)
  • 14:24 addshore@tin: Started scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description)
  • 14:22 otto@tin: Finished deploy [eventlogging/analytics@01d5761]: T186833 (duration: 00m 04s)
  • 14:22 otto@tin: Started deploy [eventlogging/analytics@01d5761]: T186833
  • 14:20 godog: grant group write for wikidev on tin on /srv/mediawiki-staging/php-1.31.0-wmf.20/.git
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 06s)
  • 13:11 marostegui: Deploy schema change on db2084 and db2075 - T185128 T153182
  • 12:03 moritzm: upgrading jessie-based servers in deployment-prep/beta to the HHVM build using ICU 57 (component/icu57)
  • 11:15 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:14 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 10:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 00m 55s)
  • 10:07 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 10:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 00m 55s)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 09:51 elukey: reboot mw1302 (hhvm defunct processes, hangs registered in dmesg, very high load)
  • 09:46 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 56s)
  • 09:29 moritzm: installing libdatetime-timezone-perl SUA update
  • 09:25 godog: install swift stretch updates on ms-be eqiad - T177739
  • 09:19 marostegui: Deploy schema change on s5 - T185128 T153182
  • 09:05 marostegui: Stop replication in sync on db1089 and db2048 - T162807
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 08:57 moritzm: installing glibc security updates on trusty (harmless in our environment; CVE-2018-1000001 is non-exploitable due to disabled unprivileged user namespaces)
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T184599 (duration: 00m 55s)
  • 08:36 marostegui: Reboot db1087 to pick up new kernel
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092, depool db1087 - T184599 (duration: 00m 55s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318, depool db1092 - T184599 (duration: 00m 55s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318, depool db1099:3318 - T184599 (duration: 00m 55s)
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104, depool db1101:3318 - T184599 (duration: 00m 55s)
  • 08:01 hashar: Upgrading CI Jenkins plugins
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1109, depool db1104 - T184599 (duration: 00m 55s)
  • 07:46 moritzm: installing exim security updates on remaining hosts
  • 07:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 - T184599 (duration: 00m 55s)
  • 06:53 marostegui: Reboot db1109 to pick up new kernel
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T184599 (duration: 00m 56s)
  • 06:40 marostegui: Drop dewiki database from s8 servers - T184599
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 11m 40s)

2018-02-11

  • 14:06 moritzm: installing exim4 security updates on MXs

2018-02-10

  • 16:51 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/specials/SpecialLog.php: SpecialLog: Fix results when no offender is specified - T186950 (duration: 00m 57s)
  • 01:10 demon@tin: Finished deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend (duration: 00m 10s)
  • 01:10 demon@tin: Started deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend

2018-02-09

  • 23:28 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:26 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:01 jynus: restart haproxy on dbproxy1005
  • 22:47 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again) (duration: 00m 03s)
  • 22:47 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again)
  • 22:45 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:43 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:42 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again) (duration: 00m 40s)
  • 22:42 tgr@tin: Synchronized php-1.31.0-wmf.20/includes/parser/ParserOutput.php: emergency fix for T186927 (duration: 00m 57s)
  • 22:42 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again)
  • 22:36 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (duration: 09m 59s)
  • 22:26 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again (duration: 00m 03s)
  • 22:10 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again
  • 22:08 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches (duration: 00m 17s)
  • 22:08 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches
  • 21:40 andrew@tin: Finished deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try (duration: 00m 14s)
  • 21:40 andrew@tin: Started deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try
  • 21:28 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/includes/api/ApiQueryAbuseLog.php: T186914 (duration: 00m 54s)
  • 21:20 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Block/TopicList.php: T186911 (duration: 00m 55s)
  • 21:10 ejegg@tin: Synchronized php-1.31.0-wmf.20/extensions/CentralNotice/CentralNoticePageLogPager.php: Sync CentralNotice for banner content log fix (duration: 00m 56s)
  • 20:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/user/User.php: Avoid pointless DB_MASTER connections in User::clearSharedCache() (duration: 00m 55s)
  • 20:08 demon@tin: Synchronized php-1.31.0-wmf.20/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 55s)
  • 20:07 demon@tin: Synchronized php-1.31.0-wmf.20/includes/MediaWiki.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 57s)
  • 19:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Scribunto/common/Hooks.php: silence divide by zero / no such index 0 errors (duration: 00m 56s)
  • 18:31 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.20
  • 18:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/filerepo/file/LocalFile.php: Fix CommentStore->createComment() call in LocalFile.php (duration: 01m 12s)
  • 18:08 bblack: cp4023: after a brief period of levelling off a bit: sharp, steep recovery of mbox lag ramp back to ~6K. not sure if this is a new floor or will drop further, but seems pretty ok.
  • 18:03 bblack: cp4023: now seems to be leveling off on lag and decreasing objhdr locks. either expiry thread prio helped (which argues for our prio-related patches) or it was naturally going to end?
  • 17:44 bblack: cp4023: experimental, "renice -19 39007" (backend cache-timeout aka expiry thread), to see if mbox lag resolves on its own quicker
  • 17:19 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 16:53 andrew@tin: Finished deploy [horizon/deploy@de72527]: Rolling out pyldap wheel (duration: 02m 26s)
  • 16:51 andrew@tin: Started deploy [horizon/deploy@de72527]: Rolling out pyldap wheel
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:29 demon@tin: Finished deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change (duration: 00m 10s)
  • 16:29 demon@tin: Started deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change
  • 15:49 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1 T186866
  • 15:47 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1
  • 15:47 akosiaris: upload etherpad-lite 1.6.3-1 to apt.wikimedia.org/jessie-wikimedia/main T186866
  • 15:00 herron: upgraded mailman on fermium for security updates
  • 14:24 demon@tin: Synchronized php-1.31.0-wmf.20/tests/phpunit/includes/db/LBFactoryTest.php: no-op to prior (duration: 01m 12s)
  • 13:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 13:33 demon@tin: Finished deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin (duration: 00m 10s)
  • 13:33 demon@tin: Started deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin
  • 10:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 10:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 for data checksumming - T162807 (duration: 01m 11s)
  • 10:36 moritzm: uploaded php-luasandbox 2.0.14~stretch2 for stretch-wikimedia to apt.wikimedia.org (this removes the php-luasandbox binary from our internal luasandbox build in favour of the php-luasandbox package maintained by legoktm from stretch-backports). As such, the php-luasandbox source package we build internally now only provides the HHVM extension (and we can retire it entirely when migrating to PHP7)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1080 - T162807 (duration: 01m 11s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 12s)
  • 09:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 11s)
  • 09:06 marostegui: Fix data drifts on db1067 - T162807
  • 08:45 demon@tin: Synchronized wmf-config/: rm cleanchanges (duration: 01m 14s)
  • 08:44 demon@tin: Synchronized multiversion/submodules.json: rm CleanChanges (duration: 01m 13s)
  • 07:57 marostegui: Stop replication on labsdb1004 to fix replication issues
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 - T162807 (duration: 01m 11s)
  • 07:39 elukey: forced remount of /mnt/hdfs on stat1005
  • 06:52 marostegui: Fix replication on labsdb1010 - T186579
  • 06:47 marostegui: Fix data drifts, upgrade kernel, mariadb and socket path on db1080 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T162807 (duration: 01m 12s)
  • 02:41 andrew@tin: Finished deploy [horizon/deploy@60cac8e]: updating with designate dashboard (duration: 02m 42s)
  • 02:38 andrew@tin: Started deploy [horizon/deploy@60cac8e]: updating with designate dashboard
  • 00:18 demon@tin: rebuilt and synchronized wikiversions files: surprise, it broke. revert group1 back to wmf.20
  • 00:16 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20 *duck and cover*

2018-02-08

  • 23:49 ppchelko@tin: Finished deploy [restbase/deploy@c0f0dcd]: Fix a type that prevented the mobile partial content to have an etag (duration: 15m 44s)
  • 23:33 ppchelko@tin: Started deploy [restbase/deploy@c0f0dcd]: Fix a type that prevented the mobile partial content to have an etag
  • 22:37 bsitzmann@tin: Finished deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95 (duration: 05m 07s)
  • 22:32 bsitzmann@tin: Started deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95
  • 22:17 demon@tin: rebuilt and synchronized wikiversions files: mw.org back to wmf.20
  • 22:08 XioNoX: rebooting cr1-eqsin
  • 21:59 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess? (duration: 00m 03s)
  • 21:58 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess?
  • 21:53 ottomata: finished upgrade of scb to librdkafka 0.11 and node-rdkafka 2
  • 21:49 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 49s)
  • 21:49 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 35s)
  • 21:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 46s)
  • 21:47 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:40 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 15s)
  • 21:40 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:40 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 04s)
  • 21:40 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:39 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 47s)
  • 21:38 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 24s)
  • 21:38 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:38 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:32 herron: restarted rsyslogd services on lithium and wezen to clear rsyslog tls listener on port 6514 icinga alerts
  • 21:23 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 54s)
  • 21:23 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 01m 03s)
  • 21:23 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:22 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 25s)
  • 21:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:22 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:13 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 21:12 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:11 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 45s)
  • 21:10 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:09 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 21s)
  • 21:09 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five (duration: 01m 25s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five
  • 20:52 andrew@tin: Finished deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four (duration: 01m 36s)
  • 20:50 andrew@tin: Started deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four
  • 20:34 ppchelko@tin: Started restart [changeprop/deploy@5fdc03a]: Restart CP to force rule rebalance
  • 20:27 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 46s)
  • 20:26 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 20:26 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 20:26 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 20:24 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 22s)
  • 20:24 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 20:20 ottomata: starting deploy process to update scb cluster to librdkafka 0.11 and node-rdkafka 2. we will depool, stop puppet, deploy, test, start puppet on each node
  • 20:03 no_justification: gerrit: killed about 12 parallel clones of mediawiki/extensions/Math that had been running between 2-3 days (wtf?)
  • 19:24 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/AbstractRevision.php: T186077 (duration: 01m 11s)
  • 19:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on svwiki (T176082) (duration: 01m 11s)
  • 19:17 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Campaigns/CampaignsSecondaryAuthenticationProvider.php: T185870 (duration: 01m 13s)
  • 19:02 bsitzmann@tin: Finished deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94 (duration: 08m 21s)
  • 19:00 bblack: lvs@ulsfo - all back to normal
  • 18:55 bblack: lvs@ulsfo - puppet disabled, trying tagged vlan deploy
  • 18:54 bsitzmann@tin: Started deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94
  • 18:38 arlolra: Updated Parsoid to 961a5cf (T186630)
  • 18:27 arlolra@tin: Finished deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf (duration: 08m 11s)
  • 18:26 andrew@tin: Finished deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three (duration: 01m 16s)
  • 18:25 andrew@tin: Started deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three
  • 18:19 arlolra@tin: Started deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf
  • 18:10 ema: upgrade cp2026 to varnish 5
  • 17:55 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns400[12].wikimedia.org
  • 17:21 akosiaris: repool sca1004 (zotero) for T181121
  • 17:21 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 17:16 ema: upgrade cp2024 to varnish 5
  • 16:58 ema: upgrade cp2022 to varnish 5
  • 16:39 moritzm: installing PHP7 security updates
  • 16:32 moritzm: installing mysql security updates on auth*
  • 16:31 ema: upgrade cp2020 to varnish 5
  • 16:30 bblack: puppet disabled on all ntp servers for initial ulsfo recdns/ntp config process
  • 16:25 bblack: puppet disabled on lvs400[67] for initial ulsfo recdns config process
  • 16:23 elukey: stop archiva on meitnerium to swap /var/lib/archiva from the root partition to a new separate one - T186020
  • 16:20 akosiaris: reboot ganeti1005 T181121
  • 16:18 akosiaris: depool sca1004 (zotero) for T181121
  • 16:17 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 16:13 bblack: rebooting dns400[12] (downtimed, currently spare::system)
  • 16:13 ema: upgrade cp2017 to varnish 5
  • 16:11 andrew@tin: Finished deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two (duration: 01m 24s)
  • 16:10 andrew@tin: Started deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two
  • 16:05 bblack: ntp servers back to normal
  • 16:04 andrew@tin: Finished deploy [horizon/deploy@2f176e2]: updating with designate dashboard (duration: 01m 11s)
  • 16:03 andrew@tin: Started deploy [horizon/deploy@2f176e2]: updating with designate dashboard
  • 15:57 ema: upgrade cp2014 to varnish 5
  • 15:48 moritzm: installing libio-socket-ssl-perl update from jessie point release
  • 15:47 bblack: disabling puppet on all global dns recursors for controlled config deploy
  • 15:35 ema: upgrade cp2011 to varnish 5
  • 15:18 ema: upgrade cp2008 to varnish 5
  • 15:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1073 - T162807 (duration: 01m 12s)
  • 14:59 moritzm: installing icu security updates from jessie/stretch point releases
  • 14:56 ema: upgrade cp2005 to varnish 5
  • 14:49 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 14:47 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on meta and mediawiki.org (duration: 01m 12s)
  • 14:43 zeljkof: EU SWAT finished
  • 14:31 moritzm: upgrading deployment-mediawiki04 to HHVM linked against ICU 57
  • 14:23 ema: upgrade cp2002 to varnish 5
  • 13:54 marostegui: Rename dewiki tables on s8 slaves - T184599
  • 13:53 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file, T185454 (duration: 00m 02s)
  • 13:53 ariel@tin: Started deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file, T185454
  • 13:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with low weight - T162807 (duration: 01m 11s)
  • 13:41 marostegui: Drop dewiki already renamed tables and database on s8 master (db1071) - T184599
  • 13:22 marostegui: Fixing data drifts on db1073, also upgrade kernel, socket location and mysql - T162807
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T162807 (duration: 01m 12s)
  • 13:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T184599 (duration: 01m 12s)
  • 13:09 moritzm: upgrade deployment servers and script runners to HHVM 3.18.7
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T184599 (duration: 01m 11s)
  • 13:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 - T184599 (duration: 01m 11s)
  • 13:02 moritzm: upgrade mwdebug servers to HHVM 3.18.7
  • 12:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T184599 (duration: 01m 11s)
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T184599 (duration: 01m 11s)
  • 12:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T184599 (duration: 01m 11s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 - T184599 (duration: 01m 11s)
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T184599 (duration: 01m 11s)
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:37 marostegui: Fix replication on labsdb1010 - T186579
  • 11:33 akosiaris: reboot ganeti1005 T181121
  • 11:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 (duration: 01m 11s)
  • 11:12 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 12s)
  • 11:00 marostegui: Drop wikidata renamed tables and database from s5 eqiad hosts - T184599
  • 10:07 marostegui: Drop deleted databases from sanitarium and labsdb hosts - T186685
  • 10:07 moritzm: upgrading remaining nginx-full packages on mw* in eqiad to 1.13.6-2+wmf1~jessie1
  • 08:07 moritzm: upgrade remaining app servers to HHVM 3.18.7
  • 07:27 _joe_: depooled mw1256 from traffic, scap (faulty disk, T186535); now powering it off
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 02:20 eileen: Update CiviCRM civicrm revision changed from 71b1e35b99 to 61acc9175e (deploy citibank, benevity import updates)
  • 01:30 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 01:30 andrew@tin: Finished deploy [horizon/deploy@9223ba7]: Now with static content, I hope (duration: 01m 15s)
  • 01:29 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 00:35 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/: Revert "Use wgEditSubmitButtonLabelPublish from upstream", Assume wpTextbox1 has an API registered already (duration: 01m 12s)
  • 00:33 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/CirrusSearch/: T186765: Add special handling for profiles into config dump (duration: 01m 27s)

2018-02-07

  • 23:59 mutante: restarted icinga-wm, too quiet
  • 21:53 ebernhardson: mwdebug1001 back to standard deployed versions
  • 21:51 bsitzmann@tin: Finished deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643) (duration: 06m 41s)
  • 21:44 bsitzmann@tin: Started deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643)
  • 21:40 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:40 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:39 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:39 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:33 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png (duration: 03m 55s)
  • 21:29 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png
  • 21:27 ebernhardson: deploying wmf.20 to en* (except enwiki) on mwdebug1001 to debug new cirrus errors in wmf.20/wmf.19 mixed sister search
  • 21:13 andrew@tin: Finished deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times (duration: 01m 24s)
  • 21:12 andrew@tin: Started deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times
  • 21:07 demon@tin: rebuilt and synchronized wikiversions files: mw.org also back to wmf.17
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:04 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 02m 38s)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 44s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:00 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 05s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 20:39 demon@tin: rebuilt and synchronized wikiversions files: revert, huge spike in db lag
  • 20:36 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 19:47 ejegg: updated SmashPig from 1f56978c0c to 1ebee97a45
  • 19:43 ejegg: updated payments-wiki from 39a7ef32e5 to fe311c2d26
  • 19:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS_MAIN to $wgNamespacesWithSubpages for cawikimedia T185436 (duration: 01m 12s)
  • 19:11 chasemp: after conversation with andrew we moved labweb to public for T186729
  • 19:09 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Rename Project NS on Wikimedia Canada Chapter wiki T185661 (duration: 01m 11s)
  • 18:55 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove old "accountcreator" rules now handled by default T185417 T186462 (duration: 01m 12s)
  • 18:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tidy: Re-do this as a sorted negative list that gets shorter over time (duration: 01m 13s)
  • 18:07 jynus: fixing ferm breakage by restarting the service on db1051
  • 17:38 awight: ORES celery workers restarted on scb100[1-4]
  • 16:53 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options - https://gerrit.wikimedia.org/r/408718 (Unbreak ExtensionDistributor) (duration: 01m 12s)
  • 16:47 gehel: upgrade of tilerator / kartotherian on maps eqiad completed, sorry for the noise...
  • 16:46 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 21s)
  • 16:46 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:44 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:44 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:43 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:42 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:39 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:39 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:38 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:38 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:37 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 17s)
  • 16:37 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:31 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 20s)
  • 16:31 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:30 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 17s)
  • 16:28 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:28 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 16:27 gehel: upgrading tilerator / kartotherian on maps eqiad
  • 16:00 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1271.eqiad.wmnet
  • 14:37 moritzm: installing poppler security updates
  • 14:33 zeljkof: EU SWAT finished
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Updates to enable transliteration for crhwiki (T23582) (duration: 01m 11s)
  • 14:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "Portal" namespace on it.wikiquote (T185232) (duration: 01m 13s)
  • 14:05 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 47s)
  • 14:03 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:58 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: (no justification provided) (duration: 03m 02s)
  • 13:55 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:38 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 02m 45s)
  • 13:36 moritzm: installing p7zip security updates
  • 13:35 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:35 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 21s)
  • 13:34 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:33 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 06s)
  • 13:32 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:20 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 55s)
  • 13:19 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:18 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 22s)
  • 13:17 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:16 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:16 marostegui: Drop wikidata tables and database from s5 codfw hosts - T184599
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 11s)
  • 12:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight (duration: 01m 11s)
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low weight (duration: 01m 40s)
  • 11:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186321 (duration: 01m 11s)
  • 11:09 elukey: install libc6-dbg on phab1001 to get a more precise gdb stack trace - T182832
  • 11:04 marostegui: Stop MySQL on db1069 for MySQL upgrade, kernel upgrade and change binlog format to statement - T186321
  • 10:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186321 (duration: 01m 09s)
  • 09:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1051 after the BBU change - T186049 (duration: 01m 14s)
  • 09:41 kartik@tin: Finished deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901) (duration: 03m 44s)
  • 09:38 marostegui: Failover back labsdb1010 to labsdb1009 - T174569
  • 09:37 kartik@tin: Started deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901)
  • 09:18 marostegui: Failover labsdb1009 to labsdb1010 - T174569
  • 09:16 marostegui: Failover back labsdb1010 to labsdb1011 - T174569
  • 09:05 marostegui: Failover labsdb1011 to labsdb1010 - T174569
  • 08:43 marostegui: Change triggers for s3 on db1095 - T174569
  • 08:21 marostegui: Change triggers for s1 on db1095 - T174569
  • 08:11 marostegui: Change triggers for s5 on db1095 - T174569
  • 07:53 marostegui: Change triggers for s8 on db1095 - T174569
  • 07:17 marostegui: Change triggers for s7 on db1102 - T174569
  • 07:05 marostegui: Change triggers for s6 on db1102 - T174569
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Start repooling db1051 after the BBU change - T186049 (duration: 01m 15s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 34s)
  • 01:14 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking, another batch. (T186645) (duration: 01m 11s)
  • 01:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable AICaptcha data collection everywhere (T186244) (duration: 01m 11s)
  • 00:45 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Support fallback values for referrer policy (T180921) (duration: 01m 12s)
  • 00:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options (duration: 01m 11s)
  • 00:28 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on wikis with < 10 errors in all high-priority categories (T184656) (duration: 01m 09s)

2018-02-06

  • 23:02 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 04s)
  • 23:02 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 23:00 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 03s)
  • 23:00 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:56 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 02m 45s)
  • 22:53 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:42 ejegg: updated SmashPig standalone from 778e8f87b4 to 1f56978c0c
  • 22:23 hashar: Zuul/CI seems to work all fine now
  • 21:49 hashar: Flushing Zuul queue and upgrading to zuul_2.5.1-wmf2 | T186381
  • 21:49 hashar: Flushing Zuul queue and upgrading
  • 21:41 hashar: Going to shutdown Zuul in a few for an emergency hotfix | T186381
  • 21:35 andrew@tin: Finished deploy [horizon/deploy@a316e45]: (no justification provided) (duration: 01m 00s)
  • 21:34 andrew@tin: Started deploy [horizon/deploy@a316e45]: (no justification provided)
  • 21:14 legoktm: restarted zuul due to patch being stuck (T186381)
  • 20:25 andrew@tin: Finished deploy [horizon/deploy@fbf761e]: (no justification provided) (duration: 01m 21s)
  • 20:23 andrew@tin: Started deploy [horizon/deploy@fbf761e]: (no justification provided)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.20
  • 20:11 demon@tin: Synchronized php: symlink swap (duration: 01m 17s)
  • 19:25 hashar: Restarted Zuul due to T186381
  • 18:55 demon@tin: Finished scap: bootstrap wmf.20 @ testwiki (duration: 26m 09s)
  • 18:55 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 00m 15s)
  • 18:55 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:47 arlolra: Updated Parsoid to 8a0ff6c (T183515, T129372, T181408)
  • 18:46 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 06m 23s)
  • 18:40 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:39 arlolra@tin: Finished deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c (duration: 03m 47s)
  • 18:35 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 18:29 demon@tin: Started scap: bootstrap wmf.20 @ testwiki
  • 18:22 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 (duration: 07m 29s)
  • 18:15 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 16:56 elukey: restart httpd on phab1001
  • 16:50 gehel: upgrading kartotherian / tilerator on maps codfw completed
  • afk: restarting jenkins for updates
  • 16:41 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 16:41 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:40 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:40 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 36s)
  • 16:39 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:38 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 16:37 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 16:36 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 30s)
  • 16:36 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:35 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 01s)
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2003.codfw.wmnet
  • 16:30 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:30 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:29 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 02m 34s)
  • 16:29 mutante: mw1262 started hhvm, it had Unhandled server exception: Class undefined: Psr\Log\LogLevel
  • 16:27 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:26 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:24 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 34s)
  • 16:24 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:15 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:14 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 22s)
  • 16:14 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:10 gehel: upgrading kartotherian / tilerator on maps codfw
  • 15:36 elukey: drain + shutdown of analytics1038 to replace faulty BBU - T185409
  • 15:02 zeljkof: EU SWAT finished
  • 15:01 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow eliminators to undelete at urwiki (T185829) (duration: 00m 55s)
  • 14:53 marostegui: Poweroff db1051 for BBU replacement - T186049
  • 14:50 akosiaris: upgrade service-checker to 0.1.4 on scb1001
  • 14:45 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Typo, it's 2018 not 2017 (T185794) (duration: 00m 55s)
  • 14:39 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T186530) (duration: 00m 55s)
  • 14:35 chasemp: disable puppet on labs things for a cautious change rollout
  • 14:33 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on test wikis (duration: 00m 56s)
  • 14:28 marostegui: Changing triggers on s2 - T174569
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on fiwiki, hewiki, ruwiki, svwiki (T185945) (duration: 00m 55s)
  • 14:14 mlitn@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionsDetailsWidget.js: T184380 (duration: 00m 55s)
  • 14:10 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Add entityUsageModifierLimits config for Wikibase (T185693) (duration: 00m 55s)
  • 14:07 urandom: re-enable smartpath on restbase1010 (revert experiment) - T178177
  • 13:35 gehel: upgrading prometheus-elasticsearch-exporter across all elasticsearch nodes
  • 12:32 marostegui: Power cycled dbstore1001 after it crashed - T186596
  • 11:54 marostegui: Sanitize s4 - T174569
  • 11:11 _joe_: forcing a resync of /dev/md1 on conf2001 to verify if the higher timeouts avoid consensus loss in etcd
  • 11:02 ema: restart pybal on codfw primary LVSs to make them reconnect to etcd
  • 11:01 ema: restart pybal on codfw secondary LVSs to make them reconnect to etcd
  • 10:57 ema: restart pybal on eqiad primary LVSs to make them reconnect to etcd
  • 10:55 ema: restart eqiad secondary LVSs to make them reconnect to etcd
  • 10:47 _joe_: rolling restart of the eqiad etcd cluster
  • 10:39 _joe_: rolling restart of the codfw cluster to pick up the config changes
  • 09:38 marostegui: Sanitizing s2 - T174569
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1077 (duration: 00m 55s)
  • 08:21 elukey: rollback apache/httpd changes on phab1001 (restart required)
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low weight (duration: 00m 53s)
  • 07:06 marostegui: Stop MySQL on db1077 for a full upgrade
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for MariaDB and kernel upgrade (duration: 00m 56s)
  • 06:49 marostegui: Fix replication on labsdb1010 - T186579
  • 03:32 demon@tin: Finished deploy [gerrit/gerrit@f25f017]: adding gitiles plugin (duration: 00m 10s)
  • 03:32 demon@tin: Started deploy [gerrit/gerrit@f25f017]: adding gitiles plugin
  • 03:17 foks: reset email for User:Andrewman327
  • 02:32 demon@tin: Synchronized tests/Defines.php: no op (duration: 00m 55s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 15s)
  • 01:47 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AICaptcha data collection on group0/group1 T186244 (duration: 00m 56s)
  • 00:25 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ps.svg: SWAT: Update the ps mobile wordmark T184442 (duration: 00m 55s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure settings feedback link T182217 (duration: 00m 56s)

2018-02-05

  • 23:21 mutante: nihal - restarted puppetdb service
  • 23:07 mobrovac@tin: Finished deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395 (duration: 03m 29s)
  • 23:04 mobrovac@tin: Started deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395
  • 22:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 55s)
  • 22:45 mobrovac@tin: Synchronized wmf-config/jobqueue.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 56s)
  • 22:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate (duration: 00m 54s)
  • 22:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate
  • 21:47 mholloway-shell@tin: Finished deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a (duration: 06m 38s)
  • 21:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023 (duration: 02m 27s)
  • 21:45 chasemp: asw-b-codfw# rollback 0 pending questions on T183167
  • 21:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023
  • 21:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a
  • 21:07 tgr@tin: Finished scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki (duration: 18m 24s)
  • 20:48 tgr@tin: Started scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki
  • 19:44 demon@tin: Synchronized wmf-config/InitialiseSettings.php: collation for abwiki (duration: 00m 55s)
  • 19:32 demon@tin: Finished scap: adding collation for Abkhaz (duration: 05m 12s)
  • 19:27 demon@tin: Started scap: adding collation for Abkhaz
  • 19:26 demon@tin: Synchronized multiversion/MWWikiversions.php: drop php5.3 support (duration: 00m 56s)
  • 19:22 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder extension for urwiki (duration: 00m 56s)
  • 19:05 elukey: executed 'echo '/srv/apache2_dump/core.%h.%e.%p.%t' > /proc/sys/kernel/core_pattern' on phab1001 - T182832
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps (duration: 14m 42s)
  • 18:42 ppchelko@tin: Started deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps
  • 18:37 mutante: added bstorm to acl*operations-team (project 29) on Phabricator (T185493)
  • 18:35 elukey: add 'ulimit -c unlimited' to /etc/default/apache2 to see if httpd's CoreDumpDirectory works properly on phab1001
  • 18:35 mutante: welcome new root shell user bstorm
  • 18:31 mutante: added bstorm to the 'wmf' and 'ops' LDAP groups (modify-ldap-groups on terbium) (T185493)
  • 18:30 ppchelko@tin: Finished deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content (duration: 12m 04s)
  • 18:18 ppchelko@tin: Started deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content
  • 18:07 gehel@tin: Finished deploy [wdqs/wdqs@d7eb899]: wdqs blazegraph + gui + updater upgrade (duration: 02m 36s)
  • 18:04 gehel@tin: Started deploy [wdqs/wdqs@d7eb899]: wdqs blazegraph + gui + updater upgrade
  • 17:52 ejegg: updated payments-wiki from 341cb573a1 to 39a7ef32e5
  • 17:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@d970b61]: Update mobileapps to 7a9fab3 (duration: 05m 45s)
  • 17:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@d970b61]: Update mobileapps to 7a9fab3
  • 16:10 marostegui: Renaming wikidata tables on s5 on eqiad - T184599
  • 16:03 marostegui: Renaming wikidata tables on s5 on codfw - T184599
  • 15:54 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c9c774e]: Update mobileapps to 1411ccb (duration: 06m 06s)
  • 15:48 mholloway-shell@tin: Started deploy [mobileapps/deploy@c9c774e]: Update mobileapps to 1411ccb
  • 15:26 elukey: temporarily setting CoreDumpDirectory /srv/apache2_dump for httpd on phab1001 (+ httpd reload) to investigate core dumps for T182832
  • 14:46 hashar: European SWAT completed. I have not deployed matmarex patches to change Abkhaz collation ( https://gerrit.wikimedia.org/r/#/c/406185/ https://gerrit.wikimedia.org/r/#/c/406187/ )
  • 14:41 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder for Estonian Wikipedia (etwiki) - T186107 (duration: 00m 55s)
  • 14:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgNamespaceRobotPolicies on ptwiki's NS_USER to noindex - T185660 (duration: 00m 55s)
  • 14:31 hashar@tin: Synchronized wmf-config/throttle.php: Add throttle rules - T185794 T185811 (duration: 00m 55s)
  • 14:24 hashar@tin: Synchronized php-1.31.0-wmf.17/extensions/Flow/includes/Import/OptInController.php: OptInController catch both errors and exception - T184670 (duration: 00m 55s)
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Fix typo in arwikibooks rollbacker group - T185720 (duration: 00m 56s)
  • 14:14 hashar@tin: Synchronized wmf-config/throttle.php: Add throttle rule for an event - T185930 (duration: 00m 55s)
  • 14:11 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add 'rollbacker' group at arwikibooks - T185720 (duration: 00m 56s)
  • 13:20 marostegui: Rename dewiki tables on s8 master (db1071 - with no replication) before dropping them - T184599
  • 12:20 marostegui: Drop empty wikidata database from s5 master (db1070) - T184599
  • 12:17 marostegui: Drop old and renamed wikidata tables from s5 master (db1070) - T184599
  • 11:30 godog: expand smart metrics checking rollout with https://gerrit.wikimedia.org/r/#/c/403621/
  • 11:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify db1078 comment as it is the new candidate master for s3 (duration: 00m 55s)
  • 11:04 hashar: Upgraded jenkins-debian-glue to 0.18.4-wmf1 | T186494
  • 11:03 elukey: restart eventlogging/forwarder legacy-zmq on eventlog1001 due to slow memory leak over time (cached memory down to zero)
  • 10:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1078 (duration: 00m 55s)
  • 09:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1078 traffic (duration: 00m 55s)
  • 09:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1078 traffic - T186321 (duration: 00m 55s)
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 with low traffic - T186321 (duration: 00m 53s)
  • 08:44 marostegui: Stop MySQL on db1078, upgrade mariadb, kernel and socket location - T186321
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T186321 (duration: 00m 55s)
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 56s)
  • 07:45 marostegui: Deploy schema change on s8 primary master (db1071) - T174569
  • 07:43 elukey: install libjson-c2-dbg on phab1001 to allow better debugging of httpd/mod-php stuck process - T182832
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 03s)
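
The 19:05 entry above sets the kernel core_pattern to '/srv/apache2_dump/core.%h.%e.%p.%t'. As a rough illustration of the file names that pattern yields, the sketch below expands the four specifiers used there (%h hostname, %e executable name, %p PID, %t dump time in epoch seconds). The executable name, PID and timestamp are hypothetical values chosen for the example, and unlike the kernel this sketch leaves unknown specifiers untouched.

```python
import re


def expand_core_pattern(pattern, hostname, executable, pid, epoch_seconds):
    """Expand the subset of core_pattern specifiers used in the log entry."""
    values = {
        "h": hostname,            # %h: hostname
        "e": executable,          # %e: executable file name
        "p": str(pid),            # %p: PID of the dumped process
        "t": str(epoch_seconds),  # %t: time of dump, seconds since the epoch
        "%": "%",                 # %%: a literal percent sign
    }
    # Replace each %X specifier; unknown ones are kept as-is in this sketch.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), pattern)


# Hypothetical process details, for illustration only.
example = expand_core_pattern(
    "/srv/apache2_dump/core.%h.%e.%p.%t",
    hostname="phab1001",
    executable="apache2",
    pid=12345,
    epoch_seconds=1517857500,
)
print(example)
```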

2018-02-04

  • 22:40 elukey: restart aphlict.service on phab1001 to force it to pick up the new logfile (/var/log/aphlict/aphlict.log rather than the .log.1)
  • 06:18 _joe_: reduced raid resync speed on conf2* to 5000 KB/s
  • 04:33 _joe_: restarted etcdmirror on conf2002, failure caused by raid resyncs in codfw

2018-02-03

  • 03:55 legoktm: restarting zuul to drop 407165,3 from the queue

2018-02-02

  • 23:40 no_justification: gerrit: one last restart to try and force gerrit/phab session restart
  • 22:42 jynus: reloading m2 dbproxy
  • 22:08 no_justification: cobalt/gerrit2001: purged libbcprov-java libbcpkix-java, cleaned up old symlinks
  • 21:45 demon@tin: Finished deploy [gerrit/gerrit@98f5d9a]: Gerrit 2.14.6 (duration: 00m 14s)
  • 21:45 demon@tin: Started deploy [gerrit/gerrit@98f5d9a]: Gerrit 2.14.6
  • 21:42 no_justification: cobalt: disabling puppet so it doesn't restart gerrit
  • 21:41 no_justification: bringing down gerrit for upgrade
  • 20:54 demon@tin: Synchronized docroot/wikipedia.org/spec.yaml: expose swagger spec (duration: 00m 56s)
  • 20:47 elukey: truncated /var/log/aphlict/aphlict.log to 1G (was 26G) to avoid overhead for the upcoming first logrotate on phab1001
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original traffic for db1100 (duration: 00m 54s)
  • 16:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 (duration: 00m 55s)
  • 16:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 (duration: 00m 54s)
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1100 - T186321 (duration: 00m 54s)
  • 15:50 marostegui: Restart MySQL on db1100 - T186321
  • 15:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T186321 (duration: 00m 55s)
  • 15:34 moritzm: uploaded HHVM 3.18.5+dfsg+wmf5+icu57 to jessie-wikimedia/component/icu57 (HHVM 3.18.8 linked against an ICU 57 backport from stretch)
  • 15:25 mutante: ganeti1004 - stopped and started VM ununpentium
  • 14:53 akosiaris: reboot ganeti1005 after emptying it. T181121
  • 13:59 elukey: reboot meitnerium via gnt-instance reboot on ganeti1005 to pick up new disk config - T186020
  • 13:16 moritzm: installing w3m security updates on trusty
  • 12:57 moritzm: installing updated kernels on remaining jessie DB servers
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 11:57 godog: roll-restart nginx on thumbor and swift-proxy on ms-fe to apply https://gerrit.wikimedia.org/r/407411
  • 11:39 moritzm: uploaded php-wikidiff2 1.5.1+deb9u2 to apt.wikimedia.org (despite the source package name, this package only builds hhvm-wikidiff2 now as php-wikidiff2 is instead updated via stretch-backports, the old internal package will eventually be phased out when we move to PHP7)
  • 11:12 ema: cache_upload: repool cp4026 (varnish 5)
  • 11:07 ema: cache_upload: upgrade cp4026 to varnish 5
  • 10:43 ema: cache_upload: repool cp4025 (varnish 5)
  • 10:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 55s)
  • 10:39 ema: cache_upload: upgrade cp4025 to varnish 5
  • 10:24 ema: cache_upload: repool cp4024 (varnish 5)
  • 10:20 ema: cache_upload: upgrade cp4024 to varnish 5
  • 10:18 moritzm: installing ruby security updates on trusty
  • 09:57 godog: roll-upgrade thumbor to 1.11 - T178072 T185478 T185483 T185485 T183907 T179954
  • 09:46 gilles: Add thumborUrl to Swift config in PrivateSettings.php
  • 09:13 ema: cache_upload: repool cp4023 (varnish 5)
  • 09:08 ema: cache_upload: upgrade cp4023 to varnish 5
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 08:37 elukey: apt-get install php5-dbg on phab1001 as an attempt to get better gdb output for T182832
  • 08:35 ema: cache_upload: repool cp4022 (varnish 5)
  • 08:29 ema: cache_upload: upgrade cp4022 to varnish 5
  • 08:23 marostegui: Stop replication in sync db1089 - db1065 - T162807
  • 08:23 moritzm: installing curl security updates on trusty (Debian already updated)
  • 08:21 ema: cache_upload: repool cp4021 (varnish 5)
  • 08:14 ema: cache_upload: upgrade cp4021 to varnish 5
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 07:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 07:10 marostegui: Fixing data drifts on db1065 - T162807
  • 05:37 elukey: truncate /var/log/aphlict/aphlict.log to 25G as temp measure to keep phab1001's root partition from filling up

2018-02-01

  • 23:37 mutante: creating new 100GB virtual disk for ganeti VM meitnerium (T186020)
  • 23:12 eileen: update civicrm revision changed from 849bba4186 to 71b1e35b99 (deploy civitoken)
  • 22:37 ejegg: updated payments-wiki from 40145892e7 to 341cb573a1
  • 21:56 raita: Removed 2FA from User:Jehochman
  • 21:52 raita: Removed 2FA from User:Superzerocool (on Mon, Jan 29): https://phabricator.wikimedia.org/T185731
  • 20:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool of db1083 (duration: 00m 55s)
  • 20:41 jynus: deployed modified query killer to enwiki replicas
  • 20:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: emergency depool of db1083 (duration: 00m 55s)
  • 19:19 chasemp: labservices1001:~# logrotate --force /etc/logrotate.conf
  • 19:17 chasemp: labservices1002:~# logrotate --force /etc/logrotate.conf
  • 19:04 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 [keeping static files] (duration: 01m 16s)
  • 19:02 demon@tin: Pruned MediaWiki: 1.31.0-wmf.15 (duration: 14m 55s)
  • 16:26 andrewbogott: apt-get install 'designate' on labservices1001 and 1002 - routine upgrade
  • 15:39 moritzm: upgrading nginx on mw1266-mw1299 (for T164456)
  • 15:27 moritzm: restarting apache/HHVM on deployment servers to pick up libxml2/curl security updates
  • 15:14 moritzm: installing curl security updates
  • 14:48 moritzm: installing tiff security updates
  • 14:44 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 (duration: 00m 57s)
  • 14:40 moritzm: restarting nginx on sodium to pick up libxml2 security update
  • 14:34 moritzm: restarting apache on rutherfordium to pick up libxml2 security update
  • 14:01 moritzm: restarting nginx on puppetdb hosts to pick up libxml2 security update
  • 13:56 moritzm: restarting nginx on meitnerium/archiva to pick up libxml2 security update
  • 13:42 gehel: restarting nginx on wdqs* for upgrade
  • 13:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1051 reason for depooling (duration: 00m 56s)
  • 13:23 akosiaris: force puppet run on all postgres servers for https://gerrit.wikimedia.org/r/407424
  • 13:20 jynus: stop and reimage es2019
  • 13:13 moritzm: restarting apache on krypton to pick up libxml2 security update
  • 13:13 gehel: restarting postgresql and nodejs services on maps*
  • 13:09 gehel: upgrade nginx on elastic*
  • 12:54 moritzm: restarting nginx on debug proxies to pick up libxml2 security update
  • 12:53 moritzm: restarting apache on hafnium to pick up libxml2 security update
  • 12:06 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2018, depool es2019 (duration: 00m 57s)
  • 12:03 moritzm: restarting squid on URL downloaders to pick up libxml2 security update
  • 11:53 moritzm: installing libxml2 security updates
  • 10:33 godog: roll restart thumbor to lower subprocess timeout - T185479
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 53s)

2018-01-31

  • 23:56 mutante: restarting apache on phabricator server, same pattern as described in T182832
  • 23:06 bblack: re-pooling ulsfo in DNS - T185228
  • 23:04 bblack: re-pooling ulsfo in DNS
  • 23:00 bblack: restarting ulsfo varnish-fe processes
  • 22:55 bblack: un-downtiming various ulsfo things
  • 22:28 mepps: updated civicrm from c70f01cd83 to 849bba4186
  • 22:04 mepps: updated civicrm from c70f01cd83 to 63c918837c
  • 21:57 mholloway-shell@tin: Finished deploy [mobileapps/deploy@18d263a]: Update mobileapps to 3d717fa (duration: 06m 11s)
  • 21:51 mholloway-shell@tin: Started deploy [mobileapps/deploy@18d263a]: Update mobileapps to 3d717fa
  • 21:35 mutante: fixed icinga config for cp4024 parents
  • 20:29 demon@tin: Synchronized .gitmodules: consistency (duration: 00m 54s)
  • 20:25 demon@tin: Synchronized docroot/wikimedia.org/: bye bye firefox os. you will (not) be missed (duration: 00m 58s)
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4032.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4031.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4030.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4029.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4027.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4026.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4025.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4024.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4023.ulsfo.wmnet
  • 18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4022.ulsfo.wmnet
  • 18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4021.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4032.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4030.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4029.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4027.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4026.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4025.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4024.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4023.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4022.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4021.ulsfo.wmnet
  • 18:40 robh: putting all ulsfo servers into maint mode
  • 18:10 XioNoX: deactivating bgp session from ulsfo to office
  • 16:17 marostegui: Optimize wbc_entity_usage on s6 on db1102
  • 16:15 robh: depooling ulsfo for https://phabricator.wikimedia.org/T185228
  • 15:44 akosiaris: reimage ores100{1..9} T171851
  • 14:37 godog: bump prometheus global instance retention to 15 months - T160677
  • 12:25 marostegui: Fix replication on labsdb1004
  • 09:32 moritzm: rolling restart of thumbor/nginx to pick up libxml security update
  • 08:21 moritzm: installing clamav security update on fermium
  • 07:55 moritzm: installing libxml security updates
  • 07:48 marostegui: Stop MySQL on db1030 - T184397
  • 07:47 marostegui: Remove db1030 from tendril - T184397
  • 07:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 56s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 57s)
  • 07:08 marostegui: Force BBU relearn on db1051 - T186049
  • 06:19 elukey: restart varnish backend on cp4024 - failed fetches / 503s
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 46s)
  • 01:51 mutante: catchpoint: recycled gwicke's user and turned it into a user for volans, upgraded him to admin (T162857)
  • 00:55 krinkle@tin: Synchronized wmf-config: no-op, adding files for beta cluster (duration: 00m 59s)
  • 00:51 krinkle@tin: Synchronized wmf-config/profiler.php: no-op (comment-only) (duration: 00m 58s)
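
The conftool entries above first set pooled=no and then pooled=yes for cp4021 through cp4032. A minimal sketch of how the final state per host could be recovered from such entries follows; it assumes only the line format visible in this log, reads entries newest first as the log lists them, and is not part of conftool itself. The function and variable names are hypothetical.

```python
import re

# Matches the action and host name as they appear in the log entries.
ENTRY = re.compile(r"set/pooled=(yes|no); selector: name=(\S+)")


def final_pool_state(entries_newest_first):
    """Replay entries oldest to newest and return the last state per host."""
    state = {}
    for line in reversed(entries_newest_first):
        match = ENTRY.search(line)
        if match:
            state[match.group(2)] = match.group(1)
    return state


# A few lines copied from the log above (newest first).
sample = [
    "18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4032.ulsfo.wmnet",
    "18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4021.ulsfo.wmnet",
    "18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4032.ulsfo.wmnet",
    "18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4021.ulsfo.wmnet",
]
state = final_pool_state(sample)
print(state)
```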

2018-01-30

  • 21:39 demon@tin: Synchronized docroot/noc/conf/open.dblist: (no justification provided) (duration: 00m 57s)
  • 21:38 demon@tin: Synchronized dblists/open.dblist: Adding open.dblist (duration: 00m 57s)
  • 19:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 (duration: 00m 57s)
  • 18:32 mutante: powercycling amslvs4, to be reinstalled as bast3003
  • 18:08 moritzm: installing PHP security updates
  • 15:52 moritzm: installing libxml2 security updates
  • 15:35 moritzm: installing libxcursor security updates
  • 15:30 jynus: stop and reimage es2018
  • 14:42 moritzm: installing curl security updates on app server canaries along with HHVM restart
  • 13:15 moritzm: installing rsync security updates on trusty
  • 12:15 moritzm: installing libxtst updates
  • 10:57 moritzm: installing ffmpeg security updates
  • 09:34 moritzm: installing wireshark security updates
  • 08:35 moritzm: installing libxml2 security updates
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 00:31 demon@tin: rebuilt and synchronized wikiversions files: not changing versions, testing something

2018-01-29

2018-01-28

  • 18:39 bblack: testme

2018-01-26

  • 16:32 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: BETA: Enable FileImporter on testwiki with open config PT 2/2 (duration: 00m 56s)
  • 16:30 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: BETA: Enable FileImporter on testwiki with open config PT 1/2 (duration: 00m 58s)
  • 06:40 niharika29@tin: Finished deploy [scholarships/scholarships@5d2fca4]: Update deadline for scholarships application (duration: 00m 03s)
  • 06:39 niharika29@tin: Started deploy [scholarships/scholarships@5d2fca4]: Update deadline for scholarships application
  • 04:31 urandom: bootstrapping restbase2009-c - T184100
  • 02:32 urandom: bootstrapping restbase2009-b - T184100

2018-01-25

  • 22:24 mutante: restarting gerrit service to apply a few small config changes https://gerrit.wikimedia.org/r/#/q/topic:gerrit-trivial-tweaks+(status:open+OR+status:merged)
  • 22:07 mutante: restarting apache on phabricator server
  • 22:06 urandom: bootstrapping restbase2009-a - T184100
  • 18:13 _joe_: restart hhvm on a few api appservers, high cpu load
  • 14:52 urandom: bootstrapping restbase2008-c - T184100
  • 07:44 urandom: bootstrapping restbase2008-b - T184100
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 32s)
  • 01:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es2011 (duration: 00m 56s)
  • 01:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2011 (duration: 00m 57s)
  • 00:24 urandom: bootstrapping restbase2008-a - T184100

2018-01-24

  • 23:15 ema: cp4025: restart varnish backend due to mbox lag
  • 19:57 jynus: starting es2011 reimage
  • 19:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2011 (duration: 00m 57s)
  • 18:35 no_justification: gerrit: restarting services, will be back momentarily
  • 18:32 urandom: bootstrapping restbase2007-c - T184100
  • 08:16 ema: cp4024: restart varnish-be due to 503s
  • 06:26 urandom: bootstrapping restbase2007-b - T184100
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 33s)
  • 01:14 matt_flaschen: SWAT complete
  • 01:10 matt_flaschen: Deployed 'T185304: NWE: Don't attempt to set selection on unattached textarea' in extensions/VisualEditor
  • 01:02 mattflaschen@tin: Synchronized php-1.31.0-wmf.17/extensions/VisualEditor/: (no justification provided) (duration: 00m 58s)
  • 00:40 mobrovac@tin: Finished deploy [zotero/translators@8f53531]: Update translators to 528296d (duration: 00m 08s)
  • 00:39 mobrovac@tin: Started deploy [zotero/translators@8f53531]: Update translators to 528296d
  • 00:08 urandom: bootstrapping restbase2007-a - T184100

2018-01-23

  • 17:37 robh: mc2036 offline until mainboard fix
  • 14:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Unify comments about sanitarium masters (duration: 00m 56s)
  • 14:36 zeljkof: EU SWAT finished
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the project namespace in Nepali Wikipedia (T184865) (duration: 00m 56s)
  • 14:17 zeljkof: continuing EU SWAT
  • 14:14 zeljkof: EU SWAT finished
  • 14:13 zfilipin@tin: Synchronized php-1.31.0-wmf.17/extensions/WikibaseQualityConstraints/: SWAT: Add missing DISTINCT to SPARQL query (T184705) (duration: 01m 02s)
  • 13:03 moritzm: installing libxtst, libxfixes, libxrandr, libxi security updates
  • 10:56 moritzm: installing libx11 security updates
  • 10:43 moritzm: installing sudo security updates
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 00m 56s)
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1089 (duration: 00m 56s)
  • 08:24 moritzm: installing gdk-pixbuf security updates
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 56s)
  • 06:50 elukey: restart varnish backend on cp4021, 503s and mailbox lag
  • 06:47 marostegui: Stop replication in sync on db2048 and db1089 - T162807
  • 06:23 marostegui: Stop replication in sync db1089 and db1105:3311 - T162807
  • 06:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 00m 57s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 31s)

2018-01-22

  • 22:38 mutante: rebooting the-server-formerly-known-as-amslvs4 to PXE to reinstall it as bast3003. doesn't work
  • 21:02 ottomata: restarting archiva
  • 19:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add Draft namespace to newiki (T184157) (duration: 00m 56s)
  • 19:34 catrope@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionDetailsWidget.js: T184380 (duration: 00m 56s)
  • 19:31 catrope@tin: Synchronized php-1.31.0-wmf.17/extensions/InputBox/InputBox.hooks.php: T185367 (duration: 00m 58s)
  • 18:16 gehel@tin: Finished deploy [wdqs/wdqs@f59ed29]: (no justification provided) (duration: 02m 12s)
  • 18:15 gehel: updating wdqs GUI
  • 18:14 gehel@tin: Started deploy [wdqs/wdqs@f59ed29]: (no justification provided)
  • 17:11 joal@tin: Finished deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands) (duration: 10m 14s)
  • 17:01 joal@tin: Started deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands)
  • 16:51 volans: upgraded debdeploy and cumin to latest released on neodymium/sarin - T182575
  • 15:49 moritzm: upgrade image scalers in eqiad to HHVM 3.18.7
  • afk: restarting jenkins
  • 14:59 moritzm: upgrade mw1221-mw1235 to HHVM 3.18.7
  • 14:43 zeljkof: EU SWAT finished
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update officewiki logo, add HD logo for officewiki (T184575) (duration: 00m 56s)
  • 14:39 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update officewiki logo, add HD logo for officewiki (T184575) (duration: 00m 56s)
  • 14:36 elukey: truncate (again) /var/log/upstart/neutron-server.log on labtestnet2001
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow bureaucrats@mr.wiki to grant&revoke accountcreator (T184553) (duration: 00m 56s)
  • 14:26 moritzm: uploaded debdeploy 0.0.99.2 for jessie-wikimedia, stretch-wikimedia, trusty-wikimedia to apt.wikimedia.org
  • 14:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains (T184853) (duration: 00m 56s)
  • 14:12 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Remove $wgWBQualityConstraintsIncludeDetailInApi setting (T180614) (duration: 00m 56s)
  • 14:11 gehel: cleanup leftover logrotate configuration on wdqs*
  • 14:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable fine grained lua tracking for arwiki, fawiki, viwiki (T185032) (duration: 00m 57s)
  • 13:46 marostegui: Force BBU relearn on db1016 - T166344
  • 12:38 volans: upgraded cumin on labpuppetmasters hosts to 2.0.0
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1063 as vslow - T184397 (duration: 00m 56s)
  • 12:22 moritzm: upgrade mw1238-mw1258 to HHVM 3.18.7
  • 12:01 marostegui: Change x1 codfw topology: db2034 to replicate from eqiad T184888
  • 11:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2036 (duration: 00m 57s)
  • 11:38 volans: uploaded cumin_2.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 09:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:50 jynus: running heavy reads on db2043, db2036 to try to reproduce s3 codfw crash
  • 09:25 marostegui: Stop replication in sync db1099:3311 and db1089 - T162807
  • 09:21 marostegui: Stop MySQL on db1030 to clone db1063 - T184397
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1030 - T184397 (duration: 00m 56s)
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 00m 56s)
  • 08:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1067 weight (duration: 00m 56s)
  • 08:31 moritzm: upgrading video scalers to HHVM 3.18.7
  • 07:51 marostegui: Stop replication in sync db1089 and db1066 - T162807
  • 07:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067, depool db1066 - T162807 (duration: 00m 56s)
  • 07:11 marostegui: Stop replication in sync db1089 and db1067 - T162807
  • 07:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 56s)
  • 07:04 elukey: truncated /var/log/upstart/neutron-server.log on labtestnet2001 - / disk space exhausted
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1063 from s8 to s6 - T184397 (duration: 00m 58s)
  • 06:18 marostegui: Compress ruwiki on db1102 - T182450
  • 02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 07m 24s)
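
Many of the sync and deploy entries above end with a "(duration: MMm SSs)" suffix. The sketch below shows one way to turn that suffix into seconds, for example to compare runs; it assumes only the suffix format visible in this log, and the function name is hypothetical.

```python
import re

# Matches the trailing "(duration: 07m 24s)" suffix used by sync/deploy entries.
DURATION = re.compile(r"\(duration: (\d+)m (\d+)s\)")


def duration_seconds(entry):
    """Return the logged duration in seconds, or None if the entry has none."""
    match = DURATION.search(entry)
    if match is None:
        return None
    return int(match.group(1)) * 60 + int(match.group(2))


# Entries copied from the log above.
l10n = duration_seconds(
    "02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 07m 24s)"
)
refinery = duration_seconds(
    "17:11 joal@tin: Finished deploy [analytics/refinery@5b8edb8]: "
    "Regular weekly deploy (before freeze for all-hands) (duration: 10m 14s)"
)
manual = duration_seconds("21:02 ottomata: restarting archiva")
print(l10n, refinery, manual)
```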

2018-01-21

  • 17:21 marostegui: Compress frwiki and jawiki on db1102 - T182450
  • 12:03 marostegui: Defragment s2 on db1102 - T182450
  • 02:35 urandom: bootstrapping restbase2012-b - T184100

2018-01-20

  • 23:20 urandom: bootstrapping restbase2012-a - T184100
  • 17:36 elukey: forced bbu learn cycle on analytics1038 (cache policy flapping from WriteBack to WriteThrough)
  • 16:57 urandom: bootstrapping restbase2011-c - T184100
  • 12:53 urandom: bootstrapping restbase2011-b - T184100
  • 03:32 urandom: bootstrapping restbase2011-a - T184100

2018-01-19

  • 22:53 matt_flaschen: Ran (time foreachwikiindblist flow.dblist extensions/Flow/maintenance/FlowFixInconsistentBoards.php --force) 2>&1|tee --append ~/FlowFixInconsistentBoards_all_2018-01-19_actual_force.txt
  • 21:28 urandom: bootstrapping restbase2010-c - T184100
  • 19:43 mutante: ms-be3003 - power up via mgmt to check if still connected and usable as temp bastion (T184936)
  • 18:58 urandom: bootstrapping restbase2010-b - T184100
  • 17:58 chasemp: labcontrol1002:~# ip addr del 208.80.154.12/32 dev eth0
  • 17:58 chasemp: labcontrol1002:~# ip addr del 208.80.154.102/32 dev eth0
  • 17:55 chasemp: labcontrol1001:~# ip addr del 208.80.154.94/32 dev eth0
  • 17:50 reedy@tin: Synchronized dblists/s3.dblist: alphasort and remove dupes (duration: 01m 01s)
  • 17:11 jynus: stopping mariadb on db2016,17,18,19,23,28&29 T184090
  • 16:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 and depool db1067 (duration: 00m 56s)
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 - T162807 (duration: 00m 56s)
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 - T174569 (duration: 00m 56s)
  • 15:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 - T162807 (duration: 00m 56s)
  • 15:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Tune s1 and s3 database weights (duration: 00m 57s)
  • 15:38 anomie: Running migrateArchiveText.php on all wikis that need it (T184629)
  • 15:24 anomie: Running migrateArchiveText.php on metawiki (T184629)
  • 15:23 godog: bootstrap cassandra-a on restbase2010 - T184100
  • 14:48 anomie: Running migrateArchiveText.php on testwiki (T184629)
  • 14:31 moritzm: installing krb5 updates from jessie point release
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 56s)
  • 12:03 moritzm: installing imagemagick security updates
  • 11:46 moritzm: installing sensible-utils security update
  • 11:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: Decommission old codfw masters (duration: 00m 55s)
  • 11:20 moritzm: upgrading tor on radium to 0.3.2.9
  • 11:18 jynus@tin: Synchronized wmf-config/db-codfw.php: Decommission old codfw masters (duration: 00m 56s)
  • 11:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 56s)
  • 10:55 jynus: restarting es2002
  • 10:53 moritzm: updated tor packages on apt.wikimedia.org to 0.3.2.9-1~d80
  • 10:19 jynus: stop mariadb at db2018 to clone it away
  • 10:02 jynus: restarting es2001
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 54s)
  • 09:56 ema: cp4026 restart varnish-be because of mbox lag
  • 09:10 marostegui: Stop replication in sync db1089 and db1105:3311 - T162807
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 00m 57s)
  • 09:08 godog: start cassandra-a on restbase1015 - T184100
  • 07:11 marostegui: Stop x1 on dbstore2002 to copy its content to db2034 - T184888
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 06:31 marostegui: Stop replication in sync db1089 and db1099:3311 - T162807
  • 06:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 06:22 marostegui: Deploy schema change on db1109 - T174569
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T174569 (duration: 00m 57s)
  • 03:21 TimStarling: on bast1001: restarting bacula-fd with master key decryption enabled, restarting restore job
  • 01:20 TimStarling: attempting to restore home_pmtpa from bacula to bast1001
  • 00:19 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T185246: Removing unused citizendium from $wgRelatedSitesPrefixes (duration: 00m 56s)
  • 00:11 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: T185250 Switch wiktionary sister search on enwiki to title only (step 2) (duration: 00m 56s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T185250 Switch wiktionary sister search on enwiki to title only (step 1) (duration: 00m 57s)

2018-01-18

  • 23:11 urandom: bootstrapping restbase1015-b -- T184100
  • 22:36 herron: added ruby-rgen-0.7.0-1 (backported package from jessie) to trusty-wikimedia apt repo (T182894)
  • 21:03 arlolra@tin: Finished deploy [parsoid/deploy@a95fede]: Update Parsoid config, again (duration: 09m 39s)
  • 20:53 arlolra@tin: Started deploy [parsoid/deploy@a95fede]: Update Parsoid config, again
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.17
  • 20:09 mutante: releases1001 - /srv/patches got created, initial manual rsync using /usr/local/sbin/sync-srv-patches created by rsync::quickdatacopy, mw patches exists on nightlies server now
  • 20:09 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/Score/includes/Score.php: SWAT: Always pass FileBackend instance to `new FileRepo()` T185204 (duration: 01m 12s)
  • 20:01 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: (no justification provided) (duration: 01m 09s)
  • 20:00 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: (no justification provided)
  • 20:00 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: (no justification provided) (duration: 00m 44s)
  • 19:59 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: (no justification provided)
  • 19:56 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: Updating Parsoid config (duration: 02m 01s)
  • 19:54 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: Updating Parsoid config
  • 19:53 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/VisualEditor/modules/ve-mw/ui/pages/ve.ui.MWTemplatePlaceholderPage.js: SWAT: Update TitleInput getTitle to getMWTitle (duration: 01m 09s)
  • 19:24 arlolra: Updated Parsoid to af06386 (T45094)
  • 19:20 ema: cache_upload: upgrade cp3049 to varnish 5
  • 19:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update linter stats for commonswiki less frequently T184280 (duration: 01m 13s)
  • 19:17 arlolra@tin: Finished deploy [parsoid/deploy@fcc2b63]: Updating Parsoid to af06386 (duration: 09m 32s)
  • 19:08 arlolra@tin: Started deploy [parsoid/deploy@fcc2b63]: Updating Parsoid to af06386
  • 19:00 ema: cache_upload: repool cp3046 (varnish 5)
  • 18:58 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: T184670: Hide Flow beta feature everywhere but testwiki (duration: 01m 10s)
  • 18:54 ema: cache_upload: upgrade cp3046 to varnish 5
  • 18:47 bsitzmann@tin: Finished deploy [mobileapps/deploy@669fb5b]: Update mobileapps to 2690899 (T184328 T184557 T177007 T184669 T177430 T185050) (duration: 07m 03s)
  • 18:40 bsitzmann@tin: Started deploy [mobileapps/deploy@669fb5b]: Update mobileapps to 2690899 (T184328 T184557 T177007 T184669 T177430 T185050)
  • 18:39 ema: cache_upload: repool cp3045 (varnish 5)
  • 18:33 ema: cache_upload: upgrade cp3045 to varnish 5
  • 18:23 mlitn@tin: Finished deploy [3d2png/deploy@74b1ed7]: Updating 3d2png repo (duration: 00m 50s)
  • 18:22 mlitn@tin: Started deploy [3d2png/deploy@74b1ed7]: Updating 3d2png repo
  • 18:00 ema: cache_upload: repool cp3044 (varnish 5)
  • 17:55 ema: cache_upload: upgrade cp3044 to varnish 5
  • 17:39 moritzm: rebooting sodium (and temporarily disable icinga-wm due to some expected spam due to clients failing to run apt-get update)
  • 17:33 jynus: starting compare.py on s3 codfw (it triggered db2036 crash before)
  • 17:31 ema: cache_upload: repool cp3039 (varnish 5)
  • 17:26 ema: cache_upload: upgrade cp3039 to varnish 5
  • 17:02 ema: cache_upload: repool cp3036 (varnish 5)
  • 16:55 ema: cache_upload: upgrade cp3036 to varnish 5
  • 15:54 ema: cache_upload: repool cp3048 (varnish 5)
  • 15:49 ema: cache_upload: upgrade cp3048 to varnish 5
  • 15:40 moritzm: rebooting labsdb1004 for kernel security update
  • 15:40 ema: cache_upload: repool cp3047 (varnish 5)
  • 15:34 ema: cache_upload: upgrade cp3047 to varnish 5
  • 15:33 moritzm: reboot labsdb1006 (OSM slave) for kernel security update
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 01m 12s)
  • 15:15 mforns@tin: Finished deploy [analytics/refinery@78f98d9]: deploying refinery to add ISO codes to pageviews by country (duration: 04m 12s)
  • 15:11 mforns@tin: Started deploy [analytics/refinery@78f98d9]: deploying refinery to add ISO codes to pageviews by country
  • 15:01 marostegui: Stop replication in sync db1089 and db1066 - T162807
  • 15:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 01m 11s)
  • 14:58 moritzm: installing bind security updates (we only use the client-side tools)
  • 14:58 volans: reprepro includedeb jessie-wikimedia python-requests-mock_1.3.0-3_all.deb
  • 14:45 ema: cache_upload: repool cp3038 (varnish 5)
  • 14:44 herron: disabling puppet agents during deploy of 404587, 404689
  • 14:39 ema: cache_upload: upgrade cp3038 to varnish 5
  • 14:39 godog: restart hhvm on mw1233
  • 14:31 _joe_: restarting hhvm on a few API appservers
  • 14:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T174569 (duration: 01m 12s)
  • 14:28 ema: cache_upload: repool cp3035 (varnish 5)
  • 14:25 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2043 to s3 master after db2036 crash (duration: 01m 12s)
  • 14:25 godog: restart hhvm on mw1227
  • 14:23 ema: cache_upload: upgrade cp3035 to varnish 5
  • 14:19 jynus: starting mysql on db2043
  • 14:17 jynus: stopping mysql on db2043
  • 14:10 zeljkof: EU SWAT finished
  • 14:10 ema: cache_upload: repool cp3037 (varnish 5)
  • 14:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikibooks (T185182) (duration: 01m 13s)
  • 13:54 ema: cache_upload: upgrade cp3037 to varnish 5
  • 13:49 moritzm: upgrade mw* servers in eqiad running 3.18.5+dfsg-1+wmf3 (recent installations) to 3.18.5+dfsg-1+wmf4
  • 13:19 jynus: changing topology of codfw s3 databases
  • 13:05 akosiaris: reboot poolcounter2001 for PCID/INVPCID CPU feature enabling
  • 13:03 akosiaris: reboot webperf1001 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:57 akosiaris: enable puppet across the fleet after nitrogen (puppetdb) reboot
  • 12:56 akosiaris: reboot nitrogen for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:52 jgleeson: turned on donations queue consumer process-control job (actual time of change 17/01/18 ~16:20)
  • 12:45 akosiaris: reboot seaborgium for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:43 elukey: bohrium rebooted for kernel upgrades
  • 12:43 akosiaris: disable puppet across the fleet for nitrogen (puppetdb) reboot
  • 12:40 elukey: set piwik in readonly mode and stopped mysql on bohrium (prep step for reboot)
  • 12:36 akosiaris: reboot chlorine.eqiad.wmnet etcd1003.eqiad.wmnet etcd1005.eqiad.wmnet fermium.wikimedia.org install1002.wikimedia.org krypton.eqiad.wmnet kubestagetcd1003.eqiad.wmnet logstash1009.eqiad.wmnet mwdebug1001.eqiad.wmnet sca1004.eqiad.wmnet for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 11:34 akosiaris: reboot logstash1008 etcd1002 kubestagetcd1002.eqiad.wmnet for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 11:12 ema: cp3046: restart varnish-be due to mbox lag
  • 11:06 volans: disabled puppet on tegmen to test impact on puppetdb - T170740
  • 10:57 akosiaris: reboot actinium.wikimedia.org aluminium.wikimedia.org argon.eqiad.wmnet boron.eqiad.wmnet bromine.eqiad.wmnet darmstadtium.eqiad.wmnet dbmonitor1001.wikimedia.org dubnium.wikimedia.org dysprosium.wikimedia.org etcd1001.eqiad.wmnet etcd1004.eqiad.wmnet fermium.wikimedia.org hassium.eqiad.wmnet kubestagetcd1001.eqiad.wmnet logstash1007.eqiad.wmnet meitnerium.wikimedia.org mendelevium.eqiad.wmnet mwdebug1002.eqiad.wmnet m
  • 10:45 ema: cp3034: restart varnishxcps and varnishmedia, they were both using 100% of a cpu core
  • 10:35 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.31.0-wmf.17$ mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintStatements.php --wiki wikidatawiki (T184720)
  • 10:30 akosiaris: reboot etherpad1001 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2034 from s1 as it will be in x1 - T184888 (duration: 01m 12s)
  • 10:25 mobrovac@tin: Finished deploy [restbase/deploy@5c353f7]: Use stable package names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541 (duration: 12m 18s)
  • 10:12 mobrovac@tin: Started deploy [restbase/deploy@5c353f7]: Use stable package names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541
  • 10:10 mobrovac@tin: Finished deploy [restbase/deploy@04e7cdb]: Use stable package names, normalise cache-control headers, update top definition - T184199 T184833 T184541 (duration: 02m 29s)
  • 10:07 mobrovac@tin: Started deploy [restbase/deploy@04e7cdb]: Use stable package names, normalise cache-control headers, update top definition - T184199 T184833 T184541
  • 10:07 moritzm: rebooting rdb1002/rdb1004/rdb1006/rdb1008 for kernel security update
  • 09:58 akosiaris: reboot etcd1006 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 09:43 ema: cache_upload: repooled cp3034 running varnish 5
  • 09:38 elukey: reboot thorium (analytics webserver) for security upgrade - This maintenance will cause temporary unavailability of the Analytics websites
  • 09:27 marostegui: Stop replication in sync db1089 and db2048 (codfw master) - T162807
  • 09:26 jynus: reimage es2003 to stretch
  • 09:21 elukey: reboot druid1001 for kernel upgrades
  • 09:20 akosiaris: reboot oresrdb2001 for PCID/INVPCID CPU feature enabling
  • 09:10 akosiaris: reboot alcyone pollux sca2004 poolcounter2002 serpens for PCID/INVPCID CPU feature enabling
  • 09:07 marostegui: Stop replication in sync db1089 db1067 - T162807
  • 08:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 13s)
  • 08:37 godog: bootstrap cassandra-c on restbase1013
  • 08:30 moritzm: reboot iron for kernel security update
  • 06:27 marostegui: Deploy schema change on s8 db1087 (sanitarium master) with replication (this will generate lag on labs servers) - T174569
  • 06:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T174569 (duration: 01m 12s)
  • 06:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 - T174569 (duration: 01m 13s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.16) (duration: 07m 18s)
  • 01:08 twentyafterfour: phabricator deployment finished without incident.
  • 01:01 twentyafterfour: Evening SWAT completed. Starting phabricator deployment of #phabricator-2018-07-17 [release/2017-01-17/1]
  • 01:00 twentyafterfour@tin: Finished scap: Evening SWAT (duration: 24m 29s)
  • 00:35 twentyafterfour@tin: Started scap: Evening SWAT

2018-01-17

  • 23:38 mutante: [terbium:~] $ echo 'https://annual.wikimedia.org' | mwscript purgeList.php
  • 22:54 urandom: bootstrapping restbase1013-b - T184100
  • 22:00 andrewbogott: rebooting californium, silver, labcontrol1001, labservices1001
  • 21:03 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.17 (duration: 01m 11s)
  • 20:57 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.17
  • 20:45 thcipriani@tin: Synchronized php-1.31.0-wmf.17/vendor/wikibase/data-model-services: Add missing files from wikibase/data-model-services 3.9.0 (duration: 01m 15s)
  • 20:41 thcipriani@tin: Synchronized php-1.31.0-wmf.17/includes/ServiceWiring.php: [MCR] RevisionStore::getTitle final logged fallback to master PART II (duration: 01m 12s)
  • 20:40 thcipriani@tin: Synchronized php-1.31.0-wmf.17/includes/Storage/RevisionStore.php: [MCR] RevisionStore::getTitle final logged fallback to master PART I (duration: 01m 04s)
  • 20:35 pnorman@tin: Finished deploy [kartotherian/deploy@ecdda41]: (no justification provided) (duration: 05m 44s)
  • 20:30 pnorman@tin: Started deploy [kartotherian/deploy@ecdda41]: (no justification provided)
  • 20:05 andrewbogott: rebooted labservices1002, labcontrol1002, labnet1002
  • 19:56 andrewbogott: rebooting labpuppetmaster1001
  • 19:46 andrewbogott: rebooting labpuppetmaster1002
  • 19:45 papaul: Powering down mw2140 for main board replacement
  • 18:20 niharika29@tin: Synchronized php-1.31.0-wmf.17/includes/EditPage.php: Update Save/Publish button flag from 'constructive' to 'progressive' https://gerrit.wikimedia.org/r/#/c/404733/ (duration: 01m 14s)
  • 18:09 moritzm: uploading HHVM 3.18.5+wmf4 for stretch-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out)
  • 18:08 ejegg: turned off main silverpop recipient data fetch job
  • 17:55 mutante: gerrit login page design changed (https://gerrit.wikimedia.org/r/402665) in case you were worried it was a fake page trying to steal your login, heh
  • 17:44 moritzm: resetting RAC on labsdb1004 (serial console inaccessible)
  • 17:17 chasemp: reboot labstore2003
  • 17:12 madhuvishy: Rebooting labstore2004
  • 17:08 godog: bootstrap cassandra-a on restbase1013
  • 17:06 ema: upgrade pybal on primary LVSs to 1.14.3 T184715, T184721
  • 16:52 ema: upgrade secondary LVSs to pybal 1.13.4 T184715, T184721
  • 16:33 XioNoX: routing ns2 to radon
  • 16:26 ema: reboot baham (codfw authdns) for kernel upgrade
  • 16:24 XioNoX: routing ns1 to eqiad
  • 16:17 chasemp: labmon1001:~# service grafana-server
  • 16:17 ema: reboot radon (eqiad authdns) for kernel upgrade
  • 16:13 jgleeson: updated civicrm from 354f32fe8a to c70f01cd83
  • 16:12 chasemp: labmon1001:~# /sbin/reboot
  • 16:09 XioNoX: routing ns0 to codfw (baham)
  • 16:07 moritzm: upgrading HHVM in codfw to 3.18.7 (wmf4)
  • 16:06 moritzm: upgrading nginx on mwdebug servers to 1.13.6-2+wmf1~jessie1
  • 16:05 jgleeson: turned off donations queue consumer process-control job
  • 16:00 ema: pybal 1.14.3 uploaded to apt.w.o
  • 15:51 chasemp: labstore1002:~# /sbin/reboot
  • 15:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 after fixing data drifts - T162807 (duration: 01m 12s)
  • 15:41 _joe_: dropping ruwiki htmlCacheUpdate records stuck in the old jobqueue
  • 15:36 moritzm: upgrading nginx on mw servers in codfw to 1.13.6-2+wmf1~jessie1
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1104 (duration: 01m 12s)
  • 14:57 moritzm: resetting RAC on labsdb1007 (serial console inaccessible)
  • 14:53 moritzm: resetting RAC on labsdb1006 (serial console inaccessible)
  • 14:38 chasemp: labstore1001:~# /sbin/reboot
  • 14:27 zeljkof: EU SWAT finished
  • 14:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create "eliminator" user group on ur.wikipedia (T184607) (duration: 01m 12s)
  • 14:14 moritzm: repooling chromium
  • 14:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft Namespace in enwikiversity (T184957) (duration: 01m 12s)
  • 14:07 moritzm: rebooting chromium for kernel security update
  • 14:04 gehel: restart of elasticsearch / cirrus eqiad completed (cluster still recovering)
  • 14:03 moritzm: depooling chromium
  • 13:51 chasemp: reboot labstore2003
  • 13:46 akosiaris: reboot sca2003 webperf2001 planet2001 poolcounter2002 mx2001 kubetcd200{1,2,3} install2002 dbmonitor2001 alsafi acrux hassaleh diadem nihal pybal-test200{1,2,3} releases2001 tureis for PCID, INVPCID
  • 13:45 chasemp: labstore2002:~# sudo update-grub && /sbin/reboot
  • 13:40 chasemp: labstore2001:~# /sbin/reboot
  • 13:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 (duration: 01m 13s)
  • 13:31 akosiaris: reboot acrab for PCID,INVPCID enabling
  • 13:22 marostegui: Deploy schema change on db1099:3318 - https://phabricator.wikimedia.org/T174569
  • 13:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 - T174569 (duration: 01m 12s)
  • 13:17 moritzm: upgrading app server canaries to 3.18.5+wmf4
  • 13:12 marostegui: Fixing drifts on db1065 - T162807
  • 12:28 moritzm: uploading HHVM 3.18.5+wmf4 for jessie-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out)
  • 12:10 moritzm: updating HHVM in deployment-prep to 3.18.5+wmf4
  • 11:40 godog: bootstrap cassandra-b on restbase1016
  • 11:28 moritzm: rearmed keyholder on neodymium
  • 11:24 moritzm: rebooting neodymium for kernel security update
  • 11:19 _joe_: restarted nginx on mw1346, was in a bad state
  • 10:51 moritzm: reset RAC on chromium, serial console is inaccessible
  • 10:42 moritzm: repooling hydrogen
  • 10:39 moritzm: rebooting hydrogen for kernel security update
  • 10:34 moritzm: depooling hydrogen again
  • 10:22 moritzm: repooling hydrogen (and pdns-recursor restarted), experiment concluded
  • 10:14 moritzm: depooling hydrogen (and keeping pdns-recursor stopped for a few minutes to check whether problems with load-balanced recdns traffic are still an issue)
  • 10:11 moritzm: reset RAC on hydrogen, serial console was inaccessible
  • 10:01 godog: start cassandra-a on restbase1016
  • 09:52 elukey: reboot druid1002 for kernel upgrades
  • 09:46 elukey: removed upstart config for brrd on eventlog1001 (failing and spamming syslog, old leftover?)
  • 09:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Full repool db1101:3318 (duration: 01m 11s)
  • 09:30 moritzm: rebooting flerovium and furud for kernel security update
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101:3318 (duration: 01m 12s)
  • 09:14 godog: reimage restbase1016 - T184100
  • 09:06 elukey: reboot analytics1003 for kernel upgrades
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 01m 11s)
  • 08:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3318 (duration: 15m 42s)
  • 08:44 elukey: reboot stat100[456] for kernel upgrades
  • 07:48 elukey: restart varnish backend on cp4024 (ton of 503s, icinga alerting for mailbox lag)
  • 07:46 oblivian@neodymium: conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw12([0-1][0-9]|20)\.eqiad\.wmnet
  • 07:45 _joe_: depooling mw1209-1220 from the appserver cluster for decommissioning, T185004
  • 06:47 marostegui: Remove labsdb1001 and labsdb1003 from tendril - T184832
  • 06:40 marostegui: Stop MySQL on labsdb1001 (already dead) and labsdb1003 - T184832
  • 06:29 marostegui: Stop replication in sync on db1089 and s1 codfw master (db2048) - T162807
  • 06:28 marostegui: Deploy schema change on db1104 - T174569
  • 06:21 marostegui: Upgrade mariadb and kernel on db1104
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T174569 (duration: 01m 14s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.16) (duration: 07m 11s)
  • 00:28 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T182616 Remove cirrus AB test config for hewiki (duration: 01m 09s)
  • 00:26 ebernhardson@tin: Synchronized fc-list: SWAT: T184664 Updating fonts list and sorting it (duration: 01m 12s)
  • 00:21 ebernhardson@tin: Synchronized fc-list: SWAT: T184664 Updating fonts list and sorting it (duration: 01m 12s)
  • 00:10 ebernhardson@tin: Synchronized php-1.31.0-wmf.16/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T182616 Turn off cirrus AB test on hewiki (duration: 01m 12s)
  • 00:08 ebernhardson@tin: Synchronized php-1.31.0-wmf.17/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T182616 Turn off cirrus AB test on hewiki (duration: 01m 14s)

2018-01-16

  • 22:57 niharika29@tin: Finished deploy [scholarships/scholarships@728d203]: Update privacy statement and delete invalidated translation files. T184659 (duration: 00m 02s)
  • 22:57 niharika29@tin: Started deploy [scholarships/scholarships@728d203]: Update privacy statement and delete invalidated translation files. T184659
  • 22:53 thcipriani@tin: rebuilt and synchronized wikiversions files: group0 to 1.31.0-wmf.17
  • 22:40 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for htmlCacheUpdate jobs for all wikis but en, commons and wikidata - T182023 (duration: 01m 12s)
  • 22:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@19b9bdd]: Switch htmlCacheUpdates for all but en, commons, wikidata T182023 (duration: 00m 35s)
  • 22:39 ppchelko@tin: Started deploy [cpjobqueue/deploy@19b9bdd]: Switch htmlCacheUpdates for all but en, commons, wikidata T182023
  • 22:19 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/WikimediaMessages/WikimediaMessages.hooks.php: Update access to ORES isModelEnabled() (duration: 01m 13s)
  • 22:19 ottomata: apt-get install librdkafka1=0.9.4-1~jessie1 librdkafka++1=0.9.4-1~jessie1 on scb* to put librdkafka back at node-rdkafka compat version (somehow this was upgraded yesterday...very dangerous!!)
  • 22:16 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.17 and rebuild l10n cache (duration: 25m 45s)
  • 21:50 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.17 and rebuild l10n cache
  • 21:15 andrewbogott: rebooting labvirt1014 and 1015
  • 21:04 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.16
  • 20:59 andrewbogott: rebooting labvirt1013
  • 20:42 demon@tin: Finished scap: wmf.17 files, no bootstrap of i18n tho (x2) (duration: 06m 33s)
  • 20:35 demon@tin: Started scap: wmf.17 files, no bootstrap of i18n tho (x2)
  • 20:34 demon@tin: scap aborted: wmf.17 files, no bootstrap of i18n tho (duration: 08m 59s)
  • 20:34 andrewbogott: rebooting labvirt1011
  • 20:32 herron: re-enabling puppet agents
  • 20:25 demon@tin: Started scap: wmf.17 files, no bootstrap of i18n tho
  • 20:24 herron: temporarily disabling puppet agents while troubleshooting puppet crl
  • 20:21 andrewbogott: rebooting labvirt1010
  • 20:07 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.16
  • 20:03 andrewbogott: rebooting labvirt1009
  • 19:46 andrewbogott: rebooting labvirt1008
  • 19:46 thcipriani@tin: Synchronized php-1.31.0-wmf.16/includes/Storage/RevisionStore.php: RevisionStore, fix loadSlotContent with no $blobFlags T184749 (duration: 01m 13s)
  • 19:30 twentyafterfour: restarted wikibugs (several attempts, eventually it worked)
  • 18:50 chasemp: reboot labvirt1020
  • 18:44 chasemp: reboot labvirt1019
  • 18:35 andrewbogott: rebooting labvirt1007
  • 18:30 arlolra@tin: Finished deploy [parsoid/deploy@1026fd2]: Updating Parsoid to 231bfff (duration: 13m 13s)
  • 18:25 herron: removing ganeti VM puppetcompiler1001
  • 18:19 moritzm: rebooting labmon1002 for kernel security update
  • 18:17 arlolra@tin: Started deploy [parsoid/deploy@1026fd2]: Updating Parsoid to 231bfff
  • 17:59 moritzm: rebooting labnet100[34] and labcontrol100[34] for kernel security update
  • 17:52 herron: re-enabled puppet agents
  • 17:50 andrewbogott: rebooting labvirt1005
  • 17:45 herron: disabled puppet agents troubleshooting T184444
  • 17:31 andrewbogott: rebooting labvirt1004
  • 17:11 andrewbogott: upgrading and rebooting labvirt1002
  • 17:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1092 original weight (duration: 01m 12s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1092 (duration: 01m 12s)
  • 16:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1092 (duration: 01m 12s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 - T174569 (duration: 01m 08s)
  • 16:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1101:3317 (duration: 01m 12s)
  • 16:11 oblivian@neodymium: conftool action : set/pooled=no; selector: cluster=api_appserver,name=mw120[1-8]\.eqiad\.wmnet
  • 16:10 _joe_: depooling mw1201-1208 from the API cluster, T185004
  • 16:09 moritzm: rebooting praseodymium for kernel security update
  • 16:08 godog: bootstrap cassandra-c on restbase1018
  • 16:04 chasemp: add arturo to acl*operations-team
  • 16:03 moritzm: rebooting labweb* hosts for kernel security update
  • 15:57 andrewbogott: rebooting labvirt1001
  • 15:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1101:3317 after kernel upgrade (duration: 01m 12s)
  • 15:44 elukey: reboot druid1003 for kernel upgrades
  • 15:41 moritzm: rebooting achernar for kernel security update
  • 15:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3317 after kernel upgrade (duration: 01m 12s)
  • 15:31 moritzm: rebooting acamar for kernel security update
  • 15:30 marostegui: Deploy schema change on db1101:3318 - T174569
  • 15:11 marostegui: Upgrade mariadb and kernel on db1101
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 db1101:3318 for schema change, mariadb upgrade and kernel upgrade - T162807 (duration: 01m 12s)
  • 14:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 01m 09s)
  • 14:52 moritzm: rebooting graphite1001 for kernel security update
  • 14:35 moritzm: rebooting graphite1003 for kernel security update
  • 14:25 moritzm: powercycling labtestservices2003, stuck in reboot
  • 14:18 moritzm: powercycling labtestservices2001, stuck in reboot
  • 14:13 zeljkof: EU SWAT finished
  • 14:12 elukey: reboot druid100[56] for kernel upgrades
  • 14:11 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Restrict sending mails to new users" config change (T184470) (duration: 01m 13s)
  • 14:09 godog: bootstrap cassandra-b on restbase1018
  • 14:01 moritzm: rebooting labtest* hosts for kernel security update
  • 13:56 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=druid1004.*.wmnet
  • 13:54 moritzm: rebooting graphite1002 for kernel security update
  • 13:52 elukey: reboot druid1004 for kernel upgrades
  • 13:46 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=druid1004.*.wmnet
  • 13:46 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=druid1004*.wmnet
  • 13:44 moritzm: rebooting graphite2002 for kernel security update
  • 13:31 oblivian@neodymium: conftool action : set/weight=25; selector: cluster=api_appserver,name=mw134[3-8]\.eqiad\.wmnet
  • 13:28 moritzm: rebooting graphite2001 for kernel security update
  • 13:20 oblivian@neodymium: conftool action : set/pooled=yes; selector: cluster=api_appserver,name=mw134[3-7]\.eqiad\.wmnet
  • 12:56 elukey: reboot kafka100[23] for kernel upgrades
  • 11:59 ariel@tin: Finished deploy [dumps/dumps@c165ca0]: enable 7z prefetch files for page content dumps (duration: 00m 04s)
  • 11:59 ariel@tin: Started deploy [dumps/dumps@c165ca0]: enable 7z prefetch files for page content dumps
  • 11:51 moritzm: rebooting mc2* hosts for kernel security update
  • 11:27 elukey: reboot kafka1001 for kernel upgrades
  • 11:12 moritzm: reboot maerlant for kernel security update
  • 11:08 moritzm: uploaded HHVM 3.18.7 for stretch-wikimedia to apt.wikimedia.org
  • 10:59 godog: roll-restart swift object server - T167400
  • 10:57 moritzm: reboot nescio for kernel security update
  • 10:13 godog: start cassandra-a on restbase1018 - T184100
  • 09:56 moritzm: upgrading canary app servers to HHVM 3.18.7
  • 09:50 marostegui: Stop replication in sync db1089 - db1105:3311 - T162807
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 12s)
  • 09:38 _joe_: started refreshLinks additional jobs for commonswiki,ruwiki
  • 09:30 oblivian@neodymium: conftool action : set/weight=10; selector: cluster=api_appserver,name=mw13(39|4[12]).eqiad.wmnet
  • 09:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 01m 12s)
  • 09:24 oblivian@neodymium: conftool action : set/pooled=yes:weight=1; selector: cluster=api_appserver,name=mw13(39|4[12]).eqiad.wmnet
  • 09:16 moritzm: installing libxml2 security updates on mw* servers (so that it gets picked up along the HHVM 3.18.7 rollout)
  • 09:12 moritzm: installing krb5 security updates (we're just using rev deps)
  • 09:08 jynus: upgrade and reboot db1031 after switchover
  • 08:49 moritzm: rearmed key holder on sarin
  • 08:45 moritzm: rebooting sarin for kernel security update
  • 08:38 marostegui: Stop replication in sync db1089 - db1099:3311 - T162807
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 01m 12s)
  • 08:28 jynus: master x1 eqiad failover has finished
  • 08:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Promote db1055 as the new x1 master (duration: 00m 49s)
  • 08:17 jynus: setting db1031 (x1 master) as read only
  • 08:11 jynus: start x1 eqiad master failover
  • 08:02 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 and db1056 after maintenance (duration: 00m 49s)
  • 07:42 jynus: moving replication topology of x1 replicas
  • 07:34 marostegui: Deploy schema change on dbstore1001 (s8) - T174569
  • 07:30 marostegui: Stop replication in sync db1066 and db1089 - T162807
  • 07:30 jynus: upgrade and reboot db1056
  • 07:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 and db1089 - T162807 (duration: 01m 13s)
  • 07:29 oblivian@neodymium: conftool action : set/weight=25; selector: name=mw1340.eqiad.wmnet
  • 07:17 jynus: upgrade and reboot db1055
  • 07:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 and db1056 for maintenance (duration: 01m 12s)
  • 07:03 marostegui: Deploy schema change on dbstore1002 (s8) - T174569
  • 06:32 marostegui: Deploy schema change on db1092 - T174569
  • 06:24 marostegui: Upgrade kernel on db1092
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T174569 (duration: 01m 32s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 10s)

2018-01-15

  • 23:40 demon@tin: Synchronized wmf-config/InitialiseSettings.php: turn educationprogram back on for cs.wikipedia -- turns out there was no consensus and a patch should never have been written 😡 (duration: 01m 13s)
  • 18:50 _joe_: pooled mw1340 as an api appserver
  • 18:43 oblivian@puppetmaster1001: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:42 oblivian@neodymium: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:34 oblivian@neodymium: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:01 moritzm: uploading HHVM 3.18.7 (3.18.5+dfsg-1+wmf3) for jessie-wikimedia to apt.wikimedia.org
  • 17:44 moritzm: updating HHVM in deployment-prep to HHVM 3.18.7
  • 17:08 godog: bootstrap cassandra-c on restbase1017
  • 16:53 jynus: upgrade and restart db2018
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 and db1089 - T162807 (duration: 01m 12s)
  • 16:49 jynus: finished codfw s3 master switchover
  • 16:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover s3 codfw master from db2018 to db2036 (duration: 01m 12s)
  • 16:41 _joe_: restarting hhvm on mw1227, threads stuck in HPHP::jit::enterTCImpl
  • 16:31 marostegui: Force WB on db2033 - T184888
  • 16:24 jynus: restarting db2036 to set as master
  • 16:20 jynus: starting codfw s3 master switchover
  • 15:55 marostegui: Stop replication in sync db1067 and db1089 - T162807
  • 15:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 12s)
  • 15:44 jynus: upgrade and restart db2074
  • 15:33 jynus: upgrade and restart db2057
  • 15:08 jynus: upgrade and restart db2050
  • 14:58 jynus: upgrade and restart db2043
  • 14:46 jynus: upgrade and restart db2036
  • 14:41 zeljkof: EU SWAT finished
  • 14:40 zfilipin@tin: Synchronized php-1.31.0-wmf.16/extensions/ContentTranslation: SWAT: CX1: Fix translation view UI overlaps (T184662 T184130) (duration: 01m 16s)
  • 14:08 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable lua fine grained usage tracking in some wikis (T184322) (duration: 01m 14s)
  • 14:05 moritzm: reboot rdb* hosts in codfw for kernel security update
  • 13:41 gehel: starting rolling reboot of elasticsearch / cirrus eqiad for kernel upgrade
  • 13:38 elukey: reboot eventlog1001 for kernel updates
  • 13:20 elukey: reboot kafka2003 for kernel upgrades
  • 12:04 jynus: upgrade and restart db2017
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 01m 12s)
  • 11:54 moritzm: rebooting ores1* for kernel security update
  • 11:36 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw13(3[8-9]|4[0-9]).*
  • 11:21 godog: upload scap 3.7.6-1 - T127762
  • 11:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 14s)
  • 11:09 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 14s)
  • 11:08 godog: bootstrap cassandra-a on restbase1017
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 10:52 gehel: lowering disk watermark on elasticsearch eqiad to shuffle shards around
  • 10:51 jynus: s2 codfw master switchover finished
  • 10:51 hashar: Upgrading zuul to 2.5.1 on contint1001 / contint2001 | T158243
  • 10:51 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover codfw s2 master from db2017 to db2035 (duration: 01m 12s)
  • 10:50 elukey: reboot kafka2002 for kernel updates
  • 10:48 hashar: Upgrading zuul to 2.5.1 on contint1001 / contint2001
  • 10:27 jynus: upgrade and restart db2035
  • 10:22 jynus: starting codfw s2 master switchover
  • 10:16 jynus: start proxysql on terbium
  • 10:15 moritzm: reboot wasat for kernel security update
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 01m 09s)
  • 09:58 elukey: rolling reboots of aqs hosts (1005->1009) for kernel updates
  • 09:45 marostegui: Deploy schema change on s8 codfw master (db2045) with replication (this will generate lag on s8 codfw) - T174569
  • 09:32 elukey: reboot kafka2001 for kernel updates
  • 09:11 hashar: upgrading Zuul on contint2001 (zuul-merger) | https://gerrit.wikimedia.org/r/#/c/356181/
  • 09:09 hashar: upgrading Zuul on contint1001 | https://gerrit.wikimedia.org/r/#/c/356181/
  • 09:07 elukey: reboot aqs1004 for kernel updates
  • 08:44 jynus: disconnecting codfw -> eqiad replication for x1
  • 08:42 moritzm: reboot wezen for kernel security update
  • 08:22 moritzm: rebooting bast1001 for kernel security update
  • 08:15 moritzm: rebooting terbium for kernel security update
  • 08:11 ema: lvs400[56]: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 07:58 _joe_: reenabling puppet on all systems where it was previously enabled, after various testing
  • 07:50 _joe_: forcing puppet run on the puppetmasters to force pluginsync for function change
  • 07:41 _joe_: disabling puppet in all of production before merging https://gerrit.wikimedia.org/r/402345
  • 07:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Replace db1063 with db1087 as vslow in s8 (duration: 01m 12s)
  • 07:11 marostegui: Deploy schema change on silver (labswiki) and labtestweb2001 (labtestwiki) - T174569
  • 06:52 marostegui: Upgrade MariaDB on db1065
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 to fix data drifts on the archive table - T162807 (duration: 01m 13s)
  • 06:13 marostegui: Deploy schema change on db1070 (s5 master) - T174569
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 07m 50s)

2018-01-12

  • 20:07 mutante: mw1227 hhvm-restart
  • 20:07 mutante: mw1227 - high load: hhvm-dump-debug > /root/hhvm-dump-debug-2017012.log | Backtrace saved as /tmp/hhvm.2203.bt.
  • 19:19 ejegg: disabled Omnimail recipient load backfill job
  • 19:09 bblack: leftover cruft from expired digicert-2016 certs all cleaned up now :)
  • 19:08 jynus: upgrade and restart db2091
  • 18:32 jynus: upgrade and restart db2088
  • 18:28 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet
  • 17:59 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1014.eqiad.wmnet
  • 17:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet
  • 17:41 jynus: upgrade and restart db2064
  • 17:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1012.eqiad.wmnet
  • 17:33 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1010.eqiad.wmnet
  • 17:31 demon@tin: Synchronized docroot/mediawiki/: prettier keys page (duration: 01m 13s)
  • 17:28 cwd: re-enabled payments,civi,listener,p-c
  • 17:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 09s)
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1010.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1009.eqiad.wmnet
  • 17:06 cwd: disabled payments/civi/listener
  • 17:06 cwd: disabled process-control jobs
  • 17:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1089 (duration: 01m 12s)
  • 16:52 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1009.eqiad.wmnet
  • 16:52 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1008.eqiad.wmnet
  • 16:46 jynus: upgrade and restart db2063
  • 16:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1008.eqiad.wmnet
  • 16:34 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1007.eqiad.wmnet
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 (duration: 01m 12s)
  • 16:19 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1007.eqiad.wmnet
  • 16:18 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 16:15 jynus: upgrade and restart db2056
  • 15:44 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 15:44 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2005.codfw.wmnet
  • 15:39 jynus: upgrade and restart db2049
  • 14:27 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2005.codfw.wmnet
  • 13:24 jynus: upgrade and restart db2041
  • 12:55 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 12:37 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 12:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2003.codfw.wmnet
  • 12:27 jynus: stop db2035 replication for maintenance
  • 12:23 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2035 for maintenance (duration: 01m 13s)
  • 12:15 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2003.codfw.wmnet
  • 12:14 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 11:42 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 11:41 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1066 (duration: 01m 12s)
  • 11:07 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 10:50 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2007.codfw.wmnet
  • 10:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1105:3311 and slowly repool db1066 (duration: 01m 13s)
  • 10:33 twentyafterfour@tin: Finished deploy [phabricator/deployment@61f1099]: (no justification provided) (duration: 03m 49s)
  • 10:33 elukey: reboot analytics1066->69 for kernel updates
  • 10:30 twentyafterfour@tin: Started deploy [phabricator/deployment@61f1099]: (no justification provided)
  • 10:29 twentyafterfour@tin: Finished deploy [phabricator/deployment@61f1099]: (no justification provided) (duration: 00m 07s)
  • 10:29 twentyafterfour@tin: Started deploy [phabricator/deployment@61f1099]: (no justification provided)
  • 10:24 moritzm: reboot job runners in codfw for kernel security update (along with update to HHVM 3.18.6)
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight db1105:3311 (duration: 01m 13s)
  • 10:11 godog: upload scap 3.7.5-1 - T184774
  • 10:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 and db1100 (duration: 01m 22s)
  • 10:02 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw2140.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1100, db1105:3311, db1105:3312 (duration: 01m 23s)
  • 09:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1100 (duration: 01m 22s)
  • 09:11 godog: reboot ms-be2023 - sdn failed and raid controller isn't happy
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1105:3311 and db1105:3312 (duration: 01m 23s)
  • 09:07 elukey: reboot analytics1063->65 for kernel updates
  • 09:04 marostegui: Upgrade kernel on db1100
  • 09:00 elukey: forced remount of /mnt/hdfs on stat1005 after OOM
  • 08:46 moritzm: reboot remaining API servers in codfw for kernel security update (along with update to HHVM 3.18.6)
  • 08:14 moritzm: reboot video scalers in codfw for kernel security update
  • 07:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3312 with low weight (duration: 01m 22s)
  • 07:01 marostegui: Stop replication in sync db1089 db1105:3311 - T162807
  • 06:46 marostegui: Update mariadb and kernel on db1105 - T184256
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311, db1105:3312 - T162807 T184256 (duration: 01m 22s)
  • 06:24 marostegui: Deploy schema change on db1100 - T174569
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T174569 (duration: 01m 22s)
  • 00:51 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/WikibaseQualityConstraints/extension.json: SWAT: Declare dependency on jquery.makeCollapsible (duration: 01m 21s)
  • 00:43 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/WikibaseQualityConstraints/modules/ui/ConstraintReportGroup.less: SWAT: Do not hide default [Expand] link (duration: 01m 22s)
  • 00:40 thcipriani@tin: Synchronized php-1.31.0-wmf.16/extensions/WikibaseQualityConstraints/modules/ui/ConstraintReportGroup.less: SWAT: Do not hide default [Expand] link (duration: 01m 24s)
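
The rolling depool/repool pattern in the conftool entries above (pooled=no, then pooled=yes for the same host) can be summarised with a short script. This is only a rough sketch: the regular expression is inferred from the entry format seen on this page, not from any conftool specification, and the helper name depool_minutes is hypothetical. It assumes all entries fall on the same day and are listed newest-first, as they are here.

```python
import re
from datetime import datetime

# Assumed entry shape, inferred from the log lines on this page only.
ENTRY = re.compile(
    r"(?P<time>\d{2}:\d{2}) \S+: conftool action : "
    r"set/pooled=(?P<state>\w+); selector: name=(?P<host>\S+)"
)


def depool_minutes(lines):
    """Return {host: minutes between pooled=no and the following pooled=yes}."""
    depooled_at = {}
    durations = {}
    # The log is newest-first, so walk it backwards to get chronological order.
    for line in reversed(lines):
        m = ENTRY.search(line)
        if not m:
            continue  # not a conftool pooled=... entry
        t = datetime.strptime(m["time"], "%H:%M")
        host = m["host"]
        if m["state"] == "no":
            depooled_at[host] = t
        elif m["state"] == "yes" and host in depooled_at:
            delta = t - depooled_at.pop(host)
            durations[host] = int(delta.total_seconds() // 60)
    return durations


# Four entries copied from the 2018-01-12 section (bullets stripped).
sample = [
    "18:28 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet",
    "17:59 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1014.eqiad.wmnet",
    "17:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet",
    "17:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1012.eqiad.wmnet",
]
print(depool_minutes(sample))
```

Entries that set pooled=inactive or pooled=active, or whose selector lacks the name= prefix, are ignored by this sketch.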

2018-01-11

  • 22:35 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/403774
  • 22:04 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403762/
  • 20:57 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403753/
  • 20:52 andrewbogott: rebooting labvirt1003
  • 20:12 twentyafterfour@tin: rebuilt and synchronized wikiversions files: Rollback group1 to wmf.15 due to T184749 refs T180749
  • 20:12 andrewbogott: rebooting labvirt1017 for kernel upgrade
  • 20:04 catrope@tin: Finished scap: SWAT (duration: 30m 12s)
  • 19:58 gehel: elasticsearch / cirrus / codfw rolling reboot completed. Cluster still recovering
  • 19:34 catrope@tin: Started scap: SWAT
  • 19:20 catrope@tin: Synchronized php-1.31.0-wmf.16/includes/: Deprecate old interwiki search result widget (duration: 02m 17s)
  • 19:09 catrope@tin: Synchronized php-1.31.0-wmf.16/extensions/Flow/modules/styles/flow/widgets/editor/mw.flow.ui.EditorWidget.less: T184631 (duration: 01m 22s)
  • 18:06 ema: lvs4007: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 18:00 jynus: upgrade and restart db1102- it may add some minutes of lag to some wikis on wikireplicas
  • 17:32 jynus: shutting down db1059 for maintenance
  • 16:57 akosiaris: upgrade apertium on scb100* nodes done
  • 16:55 godog: start rolling restart of restbase-test / restbase-dev cluster
  • 16:54 jynus: upgrade and restart db1095- it may add some minutes of lag to some wikis on wikireplicas
  • 16:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1099:3311 (duration: 01m 22s)
  • 16:28 moritzm: rebooting mwlog2001 for kernel security update
  • 16:19 moritzm: rebooting mwlog1001 for kernel security update
  • 16:05 moritzm: rebooting notebook1001 for kernel security update
  • 16:05 akosiaris: upgrade apertium on scb200* nodes
  • 15:59 moritzm: reboot lithium for kernel security update
  • 15:51 moritzm: reboot oxygen for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight (duration: 01m 21s)
  • 15:28 moritzm: reboot ruthenium for kernel security update
  • 15:26 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 15:18 akosiaris: clear trusty-wikimedia from apertium packages. The apertium service has been on jessie for a long time now and all users should have migrated by now. If not, they should
  • 15:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight (duration: 01m 23s)
  • 15:08 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1008.eqiad.wmnet
  • 15:05 moritzm: rolling reboot of prometheus in eqiad for kernel security update
  • 15:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1007.eqiad.wmnet
  • 14:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067,db1099:3318,db1099:3311, depool db1066 (duration: 01m 19s)
  • 14:58 marostegui: Upgrade mariadb and kernel on db1066
  • 14:47 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1007.eqiad.wmnet
  • 14:47 godog: continue swift frontend eqiad roll-restart, ms-fe1007 / ms-fe1008
  • 14:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Promote db2040 as the new codfw-s7 master (duration: 01m 22s)
  • 14:40 moritzm: rolling reboot of prometheus in codfw for kernel security update
  • 14:37 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1271.eqiad.wmnet
  • 14:36 joal@tin: Finished deploy [analytics/refinery@ed8ecbc]: Patching interlanguage link and manually add a jar to our collection (duration: 04m 10s)
  • 14:36 jynus: running scap pull on mw1271
  • 14:32 joal@tin: Started deploy [analytics/refinery@ed8ecbc]: Patching interlanguage link and manually add a jar to our collection
  • 14:26 moritzm: powercycling mw1271
  • 14:25 zeljkof: EU SWAT finished
  • 14:17 jynus: upgrade and restart db2029
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create extendedconfirmed for kowiki (T184675) (duration: 01m 23s)
  • 14:14 akosiaris: set migration_downtime to 2000ms for seaborgium
  • 14:01 moritzm: reboot hafnium for kernel security update
  • 14:00 moritzm: reboot tungsten for kernel security update
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3318 weight (duration: 01m 15s)
  • 13:56 jynus: perform master switchover of s7 codfw
  • 13:42 moritzm: rebooting ores2* for kernel security update
  • 13:34 jynus: upgrade and restart db2077
  • 13:34 moritzm: rebooting bast2001 for kernel security update
  • 13:31 moritzm: migrating instances off ganeti1001 for subsequent reboot for kernel security update
  • 13:27 moritzm: failover the ganeti master in eqiad to ganeti1004
  • 12:39 volans: Icinga failover back to einsteinium completed - T170353
  • 12:38 moritzm: rearmed keyholder on naos
  • 12:36 moritzm: migrating instances off ganeti1007 for subsequent reboot for kernel security update
  • 12:34 moritzm: rebooting naos for kernel security update
  • 12:28 volans: Start Icinga failover back to einsteinium - T170353
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 with low weight (duration: 01m 44s)
  • 12:07 marostegui: Stop replication in sync db1089 db1099:3311 - T162807
  • 12:03 moritzm: migrating instances off ganeti1006 for subsequent reboot for kernel security update
  • 11:33 moritzm: migrating instances off ganeti1005 for subsequent reboot for kernel security update
  • 11:14 moritzm: migrating instances off ganeti1004 for subsequent reboot for kernel security update
  • 11:07 moritzm: reboot remaining job runners in eqiad for kernel security update (along with update to HHVM 3.18.6)
  • 11:02 akosiaris: upload cg3_1.0.0~r12254-1+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:02 moritzm: migrating instances off ganeti1003 for subsequent reboot for kernel security update
  • 10:56 akosiaris: upload apertium_3.4.2~r68466-3+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main T181464
  • 10:54 akosiaris: set kvm:migration_downtime to 30ms for both eqiad/codfw ganeti clusters. Then set migration_downtime 30000 for nitrogen/nihal
  • 10:52 moritzm: rearmed keyholder on tin
  • 10:47 moritzm: rebooting tin for kernel security update
  • 10:43 marostegui: Upgrade and restart db1099:3311 and db1099:3318
  • 10:41 jynus: upgrade and restart db2068
  • 10:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1110 original weight (duration: 01m 04s)
  • 10:27 moritzm: rolling reboot of sca/zotero clusters for kernel security update
  • 10:23 jynus: upgrade and restart db2061
  • 10:20 akosiaris: upload hfst_3.13.0~r3461-1+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main T181463
  • 10:14 moritzm: migrating instances off ganeti1002 for subsequent reboot for kernel security update
  • 10:07 jynus: upgrade and restart db2054
  • 10:06 moritzm: rebooting rhenium for kernel security update
  • 10:00 elukey: reboot analytics1059-61 for kernel updates
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 weight (duration: 01m 06s)
  • 09:41 moritzm: reboot bast4002 for kernel security update
  • 09:34 elukey: reboot analytics1055->1058 for kernel updates
  • 09:32 godog: cleanup ores metrics older than 30d - T169969
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low weight - T174569 (duration: 01m 08s)
  • 09:24 gehel: relforge reboot completed
  • 09:08 gehel: reboot of relforge* for kernel upgrade
  • 09:04 elukey: reboot analytics1051->1054 for kernel updates
  • 09:00 gehel: logstash rolling restart completed
  • 08:57 moritzm: reboot remaining mediawiki app servers in eqiad for kernel security update (along with update to HHVM 3.18.6)
  • 08:55 marostegui: Upgrade db1110 kernel - T184256
  • 08:36 moritzm: powercycling wtp2013 (apparently didn't come back up after reboot)
  • 08:27 marostegui: Fix data drifts on enwiki.archive on codfw - T162807
  • 08:21 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=logstash1007.eqiad.wmnet
  • 08:17 gehel: rolling restart of logstash for kernel upgrade
  • 07:50 marostegui: Deploy schema change on db1110 - T174569
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T174569 (duration: 01m 03s)
  • 07:47 moritzm: reboot remaining mediawiki API servers for kernel security update (along with update to HHVM 3.18.6)
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T174569 (duration: 01m 03s)
  • 07:24 marostegui: Drop external_user table from s3 - T184247
  • 07:17 foks: Removed 2FA from Amjaabc
  • 07:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 db1099:3318 - T162807 T184256 (duration: 01m 02s)
  • 06:32 marostegui: Deploy schema change on db1082.s5 with replication (this will generate lag on labs) - T174569
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T174569 (duration: 01m 02s)
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T174569 (duration: 01m 03s)
  • 06:21 marostegui: Upgrade mariadb+kernel on db1089
  • 06:17 marostegui: Force BBU relearn on db1059 - T184160
  • 02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 11m 10s)
  • 00:57 urandom: bootstrapping restbase1011-c -- T184100

2018-01-10

  • 23:50 eileen: civicrm revision changed from 429a5c5385 to 354f32fe8a, deploy contact change, contact search fixes, install cleanup
  • 21:57 twentyafterfour: group1 looks stable. This concludes the MediaWiki train for today.
  • 21:54 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.16 (duration: 01m 02s)
  • 21:53 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.16
  • 21:47 twentyafterfour@tin: Finished scap: group0 to 1.31.0-wmf.16 refs T180749 (duration: 38m 29s)
  • 21:09 twentyafterfour@tin: Started scap: group0 to 1.31.0-wmf.16 refs T180749
  • 20:49 twentyafterfour@tin: Synchronized php-1.31.0-wmf.16: Sync wmf.16 to deploy multiple patches from addshore refs T180749 (duration: 10m 23s)
  • 20:14 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: Update eventstreams with newer service-template-node: T171011 (duration: 04m 11s)
  • 20:09 otto@tin: Started deploy [eventstreams/deploy@ee854df]: Update eventstreams with newer service-template-node: T171011
  • 20:09 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: Update eventstreams deploy test to scb2002: T171011 (duration: 00m 24s)
  • 20:09 otto@tin: Started deploy [eventstreams/deploy@ee854df]: Update eventstreams deploy test to scb2002: T171011
  • 20:05 jynus: upgrade and restart dbstore2002
  • 20:00 jynus: upgrade and restart dbstore2001
  • 19:45 jynus: upgrade and restart db2047
  • 19:32 urandom: bootstrapping restbase1011-b -- T184100
  • 19:22 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for Paris University and sort other by date T184618 (duration: 01m 03s)
  • 19:00 jynus: upgrade and restart db1059
  • 18:45 chasemp: reboot labtestvirt2002.codfw.wmnet w/ new kernel
  • 18:40 andrewbogott: upgrading labvirt1018 kernel and rebooting
  • 18:23 jynus: upgrade and restart db2040
  • 17:59 jynus: upgrade and restart db2087
  • 17:48 andrewbogott: installing linux-image-generic-lts-xenial on labtestvirt2003
  • 17:44 jynus: upgrade and restart db2086
  • 16:55 elukey: reboot analytics1047->50 for kernel updates
  • 16:43 akosiaris: wtp* rolling restarts for meltdown finished
  • 16:39 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 16:38 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 16:35 godog: bounce thumbor-instances on thumbor1001
  • 16:26 anomie: Running cleanupUsersWithNoId.php on dewiki and wikidatawiki
  • 16:22 ottomata: restarting kafka jumbo brokers to apply java.security certpath restrictions
  • 16:08 godog: roll-restart swift frontend in eqiad for kernel upgrade
  • 16:06 moritzm: migrating instances off ganeti2001 for subsequent reboot for kernel security update
  • 16:05 moritzm: switched ganeti master node in codfw to ganeti2004
  • 16:03 marostegui: Deploy schema change on db1096.s5 - https://phabricator.wikimedia.org/T174569
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T174569 (duration: 01m 02s)
  • 15:59 godog: start cassandra-a on restbase1011
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T174569 (duration: 01m 03s)
  • 15:32 moritzm: rebooting yubico auth servers for kernel security update
  • 15:14 moritzm: reboot netmon1002 / netmon2001 for kernel security update
  • 14:54 ema: codfw LVSs: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:51 godog: start cassandra-a on restbase1011 - T184100
  • 14:50 zeljkof: EU SWAT finished
  • 14:50 jynus: dropping dewiki from dbstore2001:3318 T184599
  • 14:47 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: translationadmin: remove configuration equal to CommonSettings.php (T184314) (duration: 01m 02s)
  • 14:46 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: translationadmin: typo fix (duration: 01m 03s)
  • 14:42 chasemp: new meltdown images are live in cloud land
  • 14:34 jynus: dropping wikidatawiki from dbstore2001:3315 T184599
  • 14:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Lift the cap on IP address to create accounts on mrwiki (T184579) (duration: 01m 04s)
  • 14:05 moritzm: migrating instances off ganeti2002 for subsequent reboot for kernel security update
  • 13:37 moritzm: migrating instances off ganeti2003 for subsequent reboot for kernel security update
  • 13:26 _joe_: restarting pybal on lvs2003
  • 13:03 mobrovac@tin: Finished deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974 (duration: 08m 00s)
  • 12:55 mobrovac@tin: Started deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974
  • 12:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T174569 (duration: 01m 03s)
  • 12:54 marostegui: Deploy schema change on db1097:3315 - https://phabricator.wikimedia.org/T174569
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T174569 (duration: 01m 03s)
  • 12:38 moritzm: migrating instances off ganeti2004 for subsequent reboot for kernel security update
  • 12:19 moritzm: migrating instances off ganeti2005 for subsequent reboot for kernel security update
  • 12:11 moritzm: rebooting einsteinium for kernel security update
  • 11:51 moritzm: migrating instances off ganeti2006 for subsequent reboot for kernel security update
  • 11:45 godog: downtime decommissioned restbase cassandra 2 hosts
  • 11:39 moritzm: rebooting mw1201-mw1208 for kernel security update (along with update to HHVM 3.18.6)
  • 11:33 marostegui: Deploy schema change on db1106 - T174569
  • 11:26 elukey: reboot analytics1044->47 for kernel updates
  • 11:23 moritzm: migrating instances off ganeti2007 for subsequent reboot for kernel security update
  • 11:19 volans: Icinga failover to tegmen completed - T170353
  • 11:12 moritzm: migrating instances off ganeti2008 for subsequent reboot for kernel security update
  • 11:07 volans: start failover of Icinga to tegmen - T170353
  • 10:55 elukey: reboot analytics1040->43 for kernel updates
  • 10:29 godog: reimage restbase1011 to test HBA mode - T184100
  • 10:16 moritzm: rebooting bast4001 for kernel security update
  • 10:06 elukey: rebooting analytics1035 (hadoop worker node and hdfs journal node) for kernel updates
  • 10:02 moritzm: rebooting tegmen for kernel security update
  • 09:50 godog: shut cassandra 2 on restbase legacy nodes - T184100
  • 09:40 moritzm: rebooting kubernetes workers (plus staging hosts) for kernel security update
  • 09:39 ema: eqiad LVSs: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:32 marostegui: Upgrade kernel on db1067
  • 09:27 godog: stop restbase on cassandra 2 nodes - T184100
  • 09:15 marostegui: Deploy schema change on db1051 - T174569
  • 09:12 moritzm: rebooting radium (tor relay) for kernel security update
  • 08:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 05s)
  • 08:38 marostegui: Deploy schema change on s5 dbstore1001 - T174569
  • 08:33 moritzm: rebooting mw1299-mw1306 (job runners) for kernel security update (along with update to HHVM 3.18.6)
  • 08:28 hashar: contint1001: upgraded Zuul 2.5.0-8-gcbc7f62-wmf4jessie1 .. 2.5.0-8-gcbc7f62-wmf6 | T158243
  • 08:13 marostegui: Deploy schema change on s5 dbstore1002 - T174569
  • 07:44 moritzm: rebooting mw1262-mw1275 for kernel security update (along with update to HHVM 3.18.6)
  • 07:37 marostegui: Drop external_user from wikidatawiki - T184247
  • 06:17 marostegui: Deploy schema change on s5 codfw master (db2052) with replication (this will generate lag on codfw) - T174569
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 02s)
  • 01:39 mutante: mw1226 - high load - hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1739PST.log ; restart-hhvm
  • 00:43 mutante: rebooting gerrit server for kernel upgrade
  • 00:18 mutante: rebooting phabricator server for kernel upgrade

2018-01-09

  • 22:52 godog: ms-be1033 truncate unrotated and big server.log
  • 22:22 aaron@tin: Synchronized php-1.31.0-wmf.16/includes/Setup.php: 68b4bbf (duration: 01m 15s)
  • 22:20 mutante: netmon2001 - arming keyholder for rancid
  • 21:10 mepps: updated SmashPig from 45aa62650c to 778e8f87b4
  • 20:57 twentyafterfour@tin: Finished scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2) (duration: 36m 34s)
  • 20:21 twentyafterfour@tin: Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2)
  • 20:14 twentyafterfour@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="test2wiki" --outdir="/tmp/scap_l10n_3984299293" --threads=10 --lang en --quiet' returned non-zero exit status 1 (duration: 02m 44s)
  • 20:13 mutante: netmon2001 - rebooting
  • 20:12 twentyafterfour@tin: Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749
  • 20:04 mutante: gerrit2001 - rebooting
  • 20:00 mutante: phab2001 - reboot for upgrade
  • 19:20 mepps: rolled back SmashPig from 0c45b1a684 to 45aa62650c
  • 19:07 mepps: updated SmashPig from 45aa62650c to 0c45b1a684
  • 18:42 mutante: ms-fe3002,ms-fe3001 - powering down, removing from puppet and icinga, ms-be* removing from puppet/icinga (T169518)
  • 18:38 mutante: ms-fe3001 - shutting down for decom, removed from puppet
  • 18:38 mutante: mw1227 still not showing recovery, using restart-hhvm
  • 18:29 mutante: mw1227 killed it one more time and also restarted apache.. now load going down
  • 18:26 mutante: mw1227 hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1024PST.log ; then killed hhvm and restarted it with systemctl
  • 17:56 twentyafterfour: MediaWiki Train: Branching 1.31.0-wmf.16
  • 17:41 moritzm: rebooting image scalers in codfw for kernel security update (along with HHVM update)
  • 17:30 volans: re-enabled Icinga event handlers on RAID checks for lvs3001
  • 17:17 ema: failover traffic back to lvs3001, raid rebuilt
  • 17:15 godog: depool restbase cassandra 2 nodes - T184100
  • 16:35 cmjohnson1: disabling puppet for decom on mw1180-1200
  • 16:28 volans: disabled Icinga event handlers on RAID checks for lvs3001, WIP on the host
  • 16:18 gehel: starting cluster reboot for elasticsearch / cirrus codfw
  • 16:09 bd808: data-services: added s8.{analytics,web}.db.svc.eqiad.wmflabs and aliases (T181643, T184179)
  • 16:09 elukey: re-started mysql on dbstore1002 (and slave replication) after hw maintenance
  • 15:44 godog: roll-restart swift frontends in codfw and eqiad
  • 15:40 akosiaris@tin: Finished deploy [servermon/servermon@10e165e]: Testing scap check (duration: 00m 02s)
  • 15:40 akosiaris@tin: Started deploy [servermon/servermon@10e165e]: Testing scap check
  • 15:31 gehel: reboot maps-test* for kernel upgrade
  • 15:30 elukey: stop mysql on dbstore1002 as prep step for shutdown (stop all slaves, mysql stop)
  • 15:23 herron: puppet master reboots complete. re-enabling puppet agents
  • 15:18 ema: lvs3001 disk swap: failover traffic to lvs3003 T166965
  • 15:10 elukey: reboot analytics1028 (hadoop worker and hdfs journal node) for kernel updates
  • 15:07 anomie: Creating MCR tables on all wikis (T183486)
  • 15:01 herron: temporarily disabling puppet agents and rebooting puppet masters for security updates
  • 15:00 elukey: reboot kafka-jumbo1006 for kernel updates
  • 14:59 ema: lvs3001: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267, replace sdb T166965
  • 14:48 moritzm: rolling reboot of scb in eqiad for kernel security update
  • 14:41 elukey: reboot kafka-jumbo1005 for kernel updates
  • 14:36 godog: upgrade and roll-restart thumbor in codfw/eqiad - T182656 T183907 T169144
  • 14:32 elukey: reboot kafka1023 for kernel updates
  • 14:21 elukey: reboot kafka-jumbo1004 for kernel updates
  • 14:14 moritzm: rolling reboot of scb in codfw for kernel security update
  • 14:14 ema: lvs3003: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:07 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Save -> Publish on remaining Wikinewses which haven't updated - https://gerrit.wikimedia.org/r/#/c/403077/ (duration: 00m 53s)
  • 14:06 ema: lvs3002: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:04 elukey: reboot kafka1022 for kernel updates
  • 14:01 godog: copy poolcounter from jessie-wikimedia into stretch-wikimedia - T183385
  • 13:51 elukey: reboot kafka-jumbo1003 for kernel updates
  • 13:34 moritzm: rebooting remaining video scalers in eqiad for kernel security update (along with HHVM update)
  • 13:10 elukey: reboot kafka1020 for kernel updates
  • 13:07 mobrovac@tin: Finished deploy [restbase/deploy@837f5a9]: Force deploy on all targets - T184110 (duration: 07m 23s)
  • 13:00 mobrovac@tin: Started deploy [restbase/deploy@837f5a9]: Force deploy on all targets - T184110
  • 12:58 moritzm: rebooting labnodepool* for kernel security update
  • 12:55 akosiaris@tin: Finished deploy [servermon/servermon@10e165e]: Update servermon (duration: 00m 02s)
  • 12:54 akosiaris@tin: Started deploy [servermon/servermon@10e165e]: Update servermon
  • 12:23 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1011.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1010.eqiad.wmnet
  • 12:19 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1009.eqiad.wmnet
  • 12:17 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1008.eqiad.wmnet
  • 12:17 moritzm: rebooting scb2001 for kernel security update
  • 12:09 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1007.eqiad.wmnet
  • 12:07 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 12:05 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2005.codfw.wmnet
  • 12:04 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 12:03 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2003.codfw.wmnet
  • 11:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 11:56 godog: roll-restart restbase c3 nodes in codfw/eqiad
  • 11:50 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:43 moritzm: rebooting app servers mw1238-mw1258 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 11:25 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 11:17 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:03 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 11:03 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 11:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 10:59 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 10:07 ema: cp3041 soft lockup, rebooting
  • 10:03 elukey: reboot kafka-jumbo1002 for kernel updates
  • 09:59 ema: failover traffic lvs3002 -> lvs3004 (new kernel)
  • 09:51 ema: lvs3004: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:35 elukey: reboot kafka1014 for kernel updates
  • 09:32 godog: deploy restbase to cassandra 3 nodes
  • 09:11 godog: roll restart swift in eqiad for kernel upgrade
  • 08:39 moritzm: rebooting app servers in codfw for kernel security update
  • 08:15 jynus: stopping dbstore2001:s5 for cloning to s8
  • 06:32 _joe_: restarting pdfrender on scb1003
  • 06:29 marostegui@tin: Synchronized docroot/noc/conf/s8.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 48s)
  • 06:27 marostegui@tin: Synchronized dblists/s8.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 50s)
  • 06:26 marostegui@tin: Synchronized dblists/s5.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 53s)
  • 06:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove read_only from s5 and s8 T177208 T181645 (duration: 00m 27s)
  • 06:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Splitting s5 and s8 T177208 T181645 (duration: 00m 50s)
  • 06:07 jynus: stopping slave and resetting on db1071
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Set s5 on read-only to start failover T177208 T181645 (duration: 00m 50s)
  • 05:12 marostegui: Start pre-failover tasks T177208 T181645
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 05m 31s)
  • 00:43 mutante: phabricator servers: upgraded php5-*, openssh
  • 00:17 mutante: netmon1002/2001 - upgraded php7.0 related packages | krypton (webserver_misc_apps) - upgraded php5 packages
  • 00:08 mutante: contint1001/2001 - upgraded php5-related packages
  • 00:06 mutante: releases1001/2001 - upgraded kernel image, planet - upgraded openssl et al

2018-01-08

  • 23:56 mutante: rutherfordium (people.wm.org) - upgrading PHP5
  • 21:52 bsitzmann@tin: Finished deploy [mobileapps/deploy@1bfd4b0]: Update mobileapps to d20915c (T184430 T184429) (duration: 05m 33s)
  • 21:47 bsitzmann@tin: Started deploy [mobileapps/deploy@1bfd4b0]: Update mobileapps to d20915c (T184430 T184429)
  • 21:30 arlolra: Updated Parsoid to e133312 (T182349, T183893, T159985)
  • 21:22 arlolra@tin: Finished deploy [parsoid/deploy@1dac474]: Updating Parsoid to e133312 (duration: 10m 31s)
  • 21:12 arlolra@tin: Started deploy [parsoid/deploy@1dac474]: Updating Parsoid to e133312
  • 21:05 mutante: new Wikipedia language: "inh" - recreating/reloading DNS zones to add "inh" (Ingush) from langs.tmpl (T184374) https://wikitech.wikimedia.org/wiki/Add_a_wiki#DNS
  • 20:09 ejegg: rolled back smashpig payments listener from 0e703f502d to 45aa62650c
  • 19:34 ottomata: rebooting analytics1002 and then analytics1001 to apply proxyuser changes and kernel update
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove language button from Wikidata and MediaWiki T183665 (duration: 00m 51s)
  • 19:04 ejegg: updated SmashPig payments listener from 45aa62650c to 0e703f502d
  • 18:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2040 (duration: 00m 50s)
  • 18:10 gehel@tin: Finished deploy [wdqs/wdqs@c680f55]: (no justification provided) (duration: 02m 03s)
  • 18:08 gehel@tin: Started deploy [wdqs/wdqs@c680f55]: (no justification provided)
  • 16:57 milimetric@tin: Finished deploy [analytics/refinery@f99e7dd]: Update and re-run interlanguage job (duration: 11m 28s)
  • 16:45 milimetric@tin: Started deploy [analytics/refinery@f99e7dd]: Update and re-run interlanguage job
  • 16:36 jynus: stopping replication on db2040
  • 16:28 cormacparle: About to run refreshFileHeaders.php on all wikis to fix https://phabricator.wikimedia.org/T178849
  • 15:23 elukey: reboot kafka1013 for kernel updates
  • 15:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Fix db2039 comments (duration: 00m 50s)
  • 15:12 ema: cache_upload: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 15:03 hashar@tin: Synchronized dblists/group1-wikipedia.dblist: Add test2wiki as a group1 wiki - T182326 (duration: 00m 50s)
  • 14:57 gehel: rolling reboot of maps servers for kernel upgrade
  • 14:56 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking in hewiki - T172914 (duration: 00m 50s)
  • 14:51 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add Translation: namespace on Punjabi Wikisource - T179807 (duration: 00m 50s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Turn on mapframe for Arabic Wikipedia - T183764 (duration: 00m 51s)
  • 14:33 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add new namespace aliases on zhwiki - T183711 (duration: 00m 50s)
  • 14:28 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable commons import in tawikisource - T181774 (duration: 00m 48s)
  • 14:27 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Update logo for chrwiki, add the HD version T180553 (duration: 00m 50s)
  • 14:26 ema: cache_text: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:25 hashar@tin: Synchronized static/images/project-logos: Update logo for chrwiki, add the HD version T180553 (duration: 00m 51s)
  • 14:23 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Move wiktionary HD logo to wiktionaries - T183922 (duration: 00m 50s)
  • 14:21 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgKartographerStaticMapframe for lvwiki - T183981 (duration: 00m 51s)
  • 14:16 hashar@tin: Synchronized wmf-config/Wikibase-production.php: Don’t check constraints on example properties - T183267 (duration: 00m 51s)
  • 13:50 moritzm: rebooting mw image scalers in eqiad for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 13:42 gehel: rolling restart of wdqs servers for kernel upgrades
  • 13:41 elukey: reboot analytics10[36-39] for kernel updates
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1109 original status (duration: 00m 50s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up db1109 (duration: 00m 52s)
  • 13:07 joal@tin: Finished deploy [analytics/aqs/deploy@ab85797]: Add pageview top-by-country endpoint (duration: 17m 57s)
  • 13:05 moritzm: rebooting mw1259/mw1260 (video scalers) for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 12:59 elukey: reboot kafka1012 for kernel updates
  • 12:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 27s)
  • 12:49 joal@tin: Started deploy [analytics/aqs/deploy@ab85797]: Add pageview top-by-country endpoint
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 59s)
  • 12:37 fdans@tin: Finished deploy [analytics/aqs/deploy@ab85797]: (no justification provided) (duration: 00m 16s)
  • 12:37 fdans@tin: Started deploy [analytics/aqs/deploy@ab85797]: (no justification provided)
  • 12:35 moritzm: rebooting mw1209-mw1220 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 12:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: revert warm up s8 future hosts - T177208 (duration: 02m 58s)
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 52s)
  • 12:18 akosiaris@tin: Finished deploy [servermon/servermon@b9832c5]: Update servermon (duration: 00m 02s)
  • 12:18 akosiaris@tin: Started deploy [servermon/servermon@b9832c5]: Update servermon
  • 12:01 moritzm: rebooting mw1221-mw1235 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 11:38 moritzm: rebooting mwdebug* for kernel security update
  • 11:28 godog: puppet node deactivate wtp10[568] - T177374
  • 11:06 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 11:05 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 10:50 godog: roll restart swift in codfw for kernel upgrades
  • 10:40 akosiaris@tin: Finished deploy [servermon/servermon@53b81d8]: Update servermon (duration: 00m 02s)
  • 10:40 akosiaris@tin: Started deploy [servermon/servermon@53b81d8]: Update servermon
  • 10:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 and db1089 - T162807 (duration: 00m 50s)
  • 10:26 hashar: Started docker on contint1001 / contint2001 . They were missing the overlay/overlayfs kernel modules | T184410
  • 10:04 elukey: drain + reboot analytics1029,1031->1034 for kernel updates
  • 10:03 jynus: fixing wrong events on db2039, db1071, db2023, db2045, db2052, db1100
  • 09:53 godog: Flashing Smart Array P840 in Slot 3 [ 4.52 -> 6.06 ] on ms-be2037 - T184390 T141756
  • 09:46 hashar: rebooting CI
  • 09:46 godog: reboot ms-be2037 - T184390
  • 09:39 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 to mw1261,mw2251,mw1276 and all videoscalers (Recently rebooted/reimaged)
  • 09:38 hashar: upgrading contint1001 / contint1002 | T184267
  • 09:24 ema: cache_misc: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:17 _joe_: starting 3 manual loops for consuming refreshLinks jobs for ruwiki
  • 09:14 marostegui: Force BBU relearn on db1059 - T184160
  • 08:30 moritzm: installing remaining openssl updates
  • 07:24 marostegui: Stop MySQL on db1039 for decommission - T184262
  • 07:17 marostegui: Remove db1039 from tendril - T184262
  • 07:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1039 as it will be decommissioned - T184262 (duration: 00m 50s)
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1039 as it will be decommissioned - T184262 (duration: 00m 50s)
  • 06:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 00m 51s)
  • 06:32 marostegui: Disable BBU auto-learn on db1011
  • 06:17 marostegui: Deploy schema change on s7 primary master (db1062) - T174569
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 17s)

2018-01-07

  • 20:25 demon@tin: Synchronized wmf-config/interwiki.php: auto-sync with my plugin was busted 🙃 (duration: 00m 50s)
  • 19:56 demon@tin: Synchronized php-1.31.0-wmf.15/maintenance/Maintenance.php: fix stuff (duration: 00m 51s)
  • 19:32 demon@tin: Finished scap: Delete alswik(ibooks|iquote|tionary), mowik(ipedia|tionary) (duration: 21m 32s)
  • 19:10 demon@tin: Started scap: Delete alswik(ibooks|iquote|tionary), mowik(ipedia|tionary)
  • 08:52 elukey: re-enabled puppet on db110[78] - eventlogging_sync restarted on db1108 (analytics-slave) - T168414

2018-01-06

  • 08:09 elukey: re-enable eventlogging mysql consumers after database maintenance - T168414
  • 06:59 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw[1329-1333] (new appservers, was 120)
  • 06:49 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw1335 (new jobrunner, was 120)

2018-01-05

  • 22:27 tgr: T184263 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=eswiki --logwiki=metawiki "Mega849" "Mega809"
  • 20:47 demon@tin: Pruned MediaWiki: 1.31.0-wmf.12 [keeping static files] (duration: 02m 11s)
  • 18:15 otto@tin: Finished deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed) (duration: 00m 19s)
  • 18:15 otto@tin: Started deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed)
  • 18:14 otto@tin: Finished deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed) (duration: 02m 11s)
  • 18:11 otto@tin: Started deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed)
  • 16:40 jynus: upgrade and restart labsdb1010
  • 16:29 akosiaris@tin: Finished deploy [servermon/servermon@cf88f3f]: Update servermon to 3c8538a (duration: 00m 02s)
  • 16:29 akosiaris@tin: Started deploy [servermon/servermon@cf88f3f]: Update servermon to 3c8538a
  • 16:07 akosiaris@tin: Finished deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a (duration: 00m 02s)
  • 16:07 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 16:06 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 16:05 akosiaris@tin: Finished deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a (duration: 00m 23s)
  • 16:04 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 15:50 marostegui: Upgrade db2071 kernel - T184256
  • 15:48 moritzm: rebooting multatuli for kernel update
  • 15:41 marostegui: Upgrade db2072 (mariadb and kernel) - T184256
  • 14:25 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 14:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 14:17 gehel: reboot maps1002 for kernel upgrade
  • 14:03 fdans@tin: (no justification provided)
  • 13:57 elukey@tin: Finished deploy [analytics/aqs/deploy@792c95d]: Add pageviews by country endpoint (duration: 01m 12s)
  • 13:56 elukey@tin: Started deploy [analytics/aqs/deploy@792c95d]: Add pageviews by country endpoint
  • 13:54 ema: upgrade cp3046 to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 13:53 fdans@tin: Finished deploy [analytics/aqs/deploy@792c95d]: (no justification provided) (duration: 00m 18s)
  • 13:52 fdans@tin: Started deploy [analytics/aqs/deploy@792c95d]: (no justification provided)
  • 13:37 gehel: rebooting wdqs1003 for kernel upgrade
  • 13:24 fdans@tin: Finished deploy [analytics/aqs/deploy@792c95d]: (no justification provided) (duration: 01m 32s)
  • 13:22 fdans@tin: Started deploy [analytics/aqs/deploy@792c95d]: (no justification provided)
  • 13:22 gehel: rebooting elastic1017 for kernel upgrade
  • 13:19 fdans: deploying Analytics Query Service
  • 12:44 elukey: reboot kafka-jumbo1001 for kernel updates
  • 12:43 ema: upgrade cp3007 to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 12:04 jynus: upgrade and restart labsdb1011
  • 12:03 ema: reboot cp1008 into linux 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 10:15 godog: reboot restbase2004 to test kernel upgrade
  • 10:14 jynus: reboot labsdb1009
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 - T163190 (duration: 00m 27s)
  • 09:20 elukey: drain and reboot analytics1030 for kernel updates
  • 09:11 godog: reboot ms-be1014 to test update stretch kernel
  • 08:54 elukey: ran git checkout modules/role/manifests/puppetmaster/standalone.pp on labs-puppetmaster.wikimedia.org to unblock sync from prod
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T163190 (duration: 00m 28s)
  • 07:37 _joe_: rebooting mw1276 too, kernel upgrade
  • 07:25 _joe_: rebooting mw1261
  • 06:49 marostegui: Stop replication in sync on db1039 and db1098:3317 - T163190
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 - T163190 (duration: 00m 27s)
  • 06:24 marostegui: Deploy schema change on db1094 - T174569
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T163190 (duration: 00m 51s)
  • 03:54 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Undeploy EducationProgram from test2wiki (duration: 00m 48s)

2018-01-04

  • 23:30 apergos: rebooted releases1001 and 2001 (new kernel)
  • 22:09 moritzm: uploaded linux-meta 1.16 for jessie-wikimedia to apt.wikimedia.org (which installs the new KPTI-enabled kernel with the new ABI)
  • 22:03 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.15
  • 22:00 twentyafterfour: No blockers remain for T180748, proceeding to deploy wmf.15 to all wikis
  • 21:53 twentyafterfour@tin: Synchronized php-1.31.0-wmf.15/extensions/TitleBlacklist/TitleBlacklistPreAuthenticationProvider.php: Deploy 332fab0 to stop logspam and unblock the train (duration: 01m 02s)
  • 21:37 moritzm: uploaded linux-4.9.65-3+deb9u1~bpo8+2 for jessie-wikimedia to apt.wikimedia.org (provides KPTI backport)
  • 21:35 twentyafterfour@tin: Synchronized php-1.31.0-wmf.15/includes/parser/Parser.php: Deploy 601cf9d (duration: 01m 03s)
  • 21:33 twentyafterfour: deploying patches to unblock the train
  • 21:25 moritzm: reboot multatuli for kernel update
  • 20:06 twentyafterfour: There are still open blockers for wmf.15 - see T180748 .. attempting to resolve them to unblock the train.
  • 20:03 twentyafterfour: preparing to deploy the train (filling in for no_justification)
  • 19:51 joal@tin: Finished deploy [analytics/refinery@a69a2cd]: Regular analytics deploy (duration: 04m 38s)
  • 19:46 joal@tin: Started deploy [analytics/refinery@a69a2cd]: Regular analytics deploy
  • 18:58 bsitzmann@tin: Finished deploy [mobileapps/deploy@8bcffa9]: Update mobileapps to a4ba9fd (T182330 T177430 T170690 T182652 T184198) (duration: 06m 01s)
  • 18:52 bsitzmann@tin: Started deploy [mobileapps/deploy@8bcffa9]: Update mobileapps to a4ba9fd (T182330 T177430 T170690 T182652 T184198)
  • 18:27 jynus: upgrade and restart labsdb1009
  • 17:42 moritzm: upgrading HHVM on eqiad video scalers to 3.18.6
  • 17:40 demon@tin: Finished deploy [gerrit/gerrit@1e1a79d]: deploying hooks plugin (duration: 00m 10s)
  • 17:40 demon@tin: Started deploy [gerrit/gerrit@1e1a79d]: deploying hooks plugin
  • 16:38 jynus: upgrade and restart db2089 (s5/s6)
  • 16:14 jynus: upgrade and restart db2087 (s6/s7)
  • 15:44 jynus: upgrade and restart db2076
  • 15:36 jynus: upgrade and restart db2067
  • 15:31 demon@tin: Synchronized php-1.31.0-wmf.15/extensions/ActiveAbstract/: unbreak, T184177 (duration: 01m 02s)
  • 15:17 jynus: upgrade and restart db2060
  • 15:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 - T163190 (duration: 01m 02s)
  • 15:03 moritzm: upgrading HHVM on eqiad image scalers to 3.18.6
  • 14:54 jynus: restart db2046 database to move socket location
  • 14:24 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Adding Movepage-summary to wgForceUIMsgAsContentMsg T183848 (duration: 01m 02s)
  • 14:11 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Restrict sending mails to new users T182541 (duration: 01m 02s)
  • 13:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T163190 (duration: 01m 01s)
  • 13:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T163190 (duration: 01m 02s)
  • 13:35 marostegui: Stop replication in sync db1079 db1101:3317 T163190
  • 13:17 moritzm: upgrading HHVM on mw1180-mw1220 to 3.18.6
  • 12:53 moritzm: upgrading HHVM on mwdebug* to 3.18.6
  • 12:45 mobrovac@tin: Finished deploy [restbase/deploy@66b7efe]: Switch Mathoid to Cassandra 3 and drop Cassandra 2 references - T179419 (duration: 04m 05s)
  • 12:41 mobrovac@tin: Started deploy [restbase/deploy@66b7efe]: Switch Mathoid to Cassandra 3 and drop Cassandra 2 references - T179419
  • 12:07 mobrovac@tin: Finished deploy [mathoid/deploy@c9957ce]: Mathoid v0.7.1 - T172767 (duration: 05m 05s)
  • 12:02 mobrovac@tin: Started deploy [mathoid/deploy@c9957ce]: Mathoid v0.7.1 - T172767
  • 12:00 moritzm: upgrading HHVM on API canaries (mw1276-mw1279) to HHVM 3.18.6
  • 10:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T163190 (duration: 01m 01s)
  • 10:39 marostegui: Stop replication in sync on db1079 and db1101:3317 - T163190
  • 10:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T163190 (duration: 01m 02s)
  • 10:16 mobrovac@tin: Finished deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557 (duration: 02m 38s)
  • 10:14 mobrovac@tin: Started deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557
  • 09:58 jynus: restart and upgrade db2053
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 - T163190 (duration: 03m 09s)
  • 09:38 moritzm: rebooting mw1307 and wtp1025 for kernel update
  • 09:13 moritzm: rebooting kubernetes1001 for kernel update
  • 08:57 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw133[67] (new jobrunners)
  • 08:53 marostegui: Fixing inconsistencies on s7 - T163190
  • 08:48 marostegui: Deploy schema change on db1069 (s7) - T174569
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T174569 (duration: 01m 02s)
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T174569 (duration: 01m 02s)
  • 06:48 marostegui: Deploy schema change on db1079 (s7) with replication enabled - this will generate lag on labs replicas - T174569
  • 06:27 marostegui: Deploy schema change on db1068 (s4) master - T174569
  • 06:23 marostegui: Issue a BBU re-learn cycle on db1059 - T184160
  • 02:49 legoktm@tin: Synchronized php-1.31.0-wmf.15/extensions/Flow/Hooks.php: Fix CheckUser type check thingy - T182834 (duration: 01m 01s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 07m 50s)
  • 01:50 ladsgroup@tin: Synchronized dblists/group0.dblist: SWAT: Move testwiki2 from group0 to group1 (T182326) (duration: 01m 02s)

2018-01-03

  • 23:02 twentyafterfour: restarted apache on phab1001 to clear hung workers (refs T182832)
  • 22:31 bd808@tin: Finished deploy [striker/deploy@69f1b15]: Enhance membership request workflow and fix Diffusion repo creation (T168027, T182142) (duration: 00m 31s)
  • 22:31 bd808@tin: Started deploy [striker/deploy@69f1b15]: Enhance membership request workflow and fix Diffusion repo creation (T168027, T182142)
  • 21:41 ejegg: re-enabled ingenico audit
  • 21:27 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.15 (duration: 01m 01s)
  • 21:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.15
  • 21:26 twentyafterfour: deploying 1.31.0-wmf.15 to "Group 1" wikis
  • 21:01 ottomata: deleting stale topics from main kafka clusters: T149594
  • 20:56 mutante: uranium - revoked puppet cert, node deactivate, removing from DNS (T183209)
  • 20:50 mutante: uranium (ex-ganglia-web) is going into eternal downtime on Icinga.. shutdown -h RIP (T183209)
  • 20:23 thcipriani: updateCollation for eswiki running in screen as thcipriani on terbium
  • 20:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Do not enable lua fine grained tracking for any wiki T172914 (duration: 01m 02s)
  • 20:16 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/VisualEditor/lib/ve: SWAT: Update VE core submodule to master T182907 T183590 (duration: 01m 06s)
  • 20:06 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Close wikimania2017.wikimedia.org PART II T182493 (duration: 01m 04s)
  • 20:04 thcipriani@tin: Synchronized dblists/closed.dblist: SWAT: Close wikimania2017.wikimedia.org PART I T182493 (duration: 01m 02s)
  • 19:53 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Extension:Translate default permissions for Wikimedia wikis T178793 (duration: 01m 02s)
  • 19:42 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set category collation to uca-es-u-kn for eswiki T183802 (duration: 01m 02s)
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Setup some namespace aliases for eswiki T183612 (duration: 01m 02s)
  • 19:14 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration deboosting scientific articles on Wikidata T183510 (duration: 01m 02s)
  • 18:53 volans: restarted ircecho on einsteinium
  • 18:37 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf2+deb9u1 for stretch-wikimedia to apt.wikimedia.org
  • 18:25 ottomata: deploying change to produce statsv metrics to main kafka clusters from varnishkafka. statsv on hafnium will be restarted to consume from main. might cause a short blip in statsv metrics.
  • 18:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Decom db2028 (duration: 01m 01s)
  • 18:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Decom db2028, repool pc2005 (duration: 01m 01s)
  • 17:47 otto@tin: Finished deploy [statsv/statsv@362d1a9]: statsv (duration: 00m 02s)
  • 17:47 otto@tin: Started deploy [statsv/statsv@362d1a9]: statsv
  • 17:35 godog: upload prometheus-jmx-exporter 0.10-3 to jessie/stretch
  • 17:35 demon@tin: Synchronized php-1.31.0-wmf.15/extensions/Wikibase: I9da46c36 (duration: 02m 00s)
  • 17:35 jynus: restart and upgrade db2046
  • 17:07 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2251.*.wmnet
  • 17:02 jynus: performing schema change on db2039 (s6)
  • 16:51 papaul: powering down pc2005 for maintenance
  • 16:18 otto@tin: Finished deploy [statsv/statsv@0a86be8]: revert (duration: 00m 02s)
  • 16:18 otto@tin: Started deploy [statsv/statsv@0a86be8]: revert
  • 16:18 otto@tin: Finished deploy [statsv/statsv@c390cdf]: no-op deployment of statsv with support for multiple topics (duration: 00m 03s)
  • 16:18 otto@tin: Started deploy [statsv/statsv@c390cdf]: no-op deployment of statsv with support for multiple topics
  • 16:14 papaul: powering down mw2251 for memory replacement and firmware upgrade
  • 16:02 urandom: drop unused keyspaces in legacy restbase cluster - T183745
  • 15:51 niharika29@tin: Finished deploy [scholarships/scholarships@ec05ae7]: Remove outdated i18n files (duration: 00m 02s)
  • 15:51 niharika29@tin: Started deploy [scholarships/scholarships@ec05ae7]: Remove outdated i18n files
  • 15:48 jynus: stop pc2005's database for maintenance T183750
  • 15:46 jynus@tin: Synchronized wmf-config/db-codfw.php: "Depool" pc2005 (duration: 01m 02s)
  • 15:38 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2251.*.wmnet
  • 15:28 niharika29@tin: Finished deploy [scholarships/scholarships@ec05ae7]: Update i18n files (duration: 00m 02s)
  • 15:28 niharika29@tin: Started deploy [scholarships/scholarships@ec05ae7]: Update i18n files
  • 15:14 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover s6-master db2028 to db2039 (duration: 01m 01s)
  • 15:08 jynus: stopping db2028's mysql to apply new config
  • 15:01 godog: roll-restart thumbor in eqiad after upgrade - T183907
  • 15:00 ottomata: restarting kafka-jumbo brokers to enable tls version and cipher suite restrictions
  • 14:55 jynus: switchover db2028 to db2039 as codfw-s6-master
  • 14:39 godog: rollout python-thumbor-wikimedia 1.8 - T183907
  • 14:30 zeljkof: EU SWAT finished
  • 14:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Translation NS for kowikisource (T183836) (duration: 01m 00s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add patrol to Image-reviewer on Commons (T183835) (duration: 01m 02s)
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T174569 (duration: 01m 02s)
  • 13:17 moritzm: upgrading mw1261-mw1265 to HHVM 3.18.5+dfsg-1+wmf2
  • 13:07 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf2 (including the fixes from 3.18.6) for jessie-wikimedia to apt.wikimedia.org
  • 12:53 moritzm: importing linux 4.9.65-3+deb9u1~bpo8+1 for jessie-wikimedia to apt.wikimedia.org
  • 12:14 mobrovac@tin: Finished deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557 (duration: 02m 10s)
  • 12:12 mobrovac@tin: Started deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557
  • 11:57 moritzm: upgrading app servers in deployment-prep to hhvm 3.18.5+dfsg-1+wmf2 (which contains the patches from 3.18.6)
  • 11:52 jynus: upgrade and restart db2039
  • 11:49 jynus: disabling puppet on db2039 and db2028 in preparation for gerrit:401706 deployment
  • 11:47 akosiaris: boot ganeti1006. It exhibited page allocation stalls on Jan 1. T181121
  • 11:39 marostegui: Deploy schema change on db1086 - T174569
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T174569 (duration: 01m 01s)
  • 11:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 - T174569 (duration: 01m 02s)
  • 11:32 mobrovac@tin: Finished deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557 (duration: 02m 15s)
  • 11:29 mobrovac@tin: Started deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557
  • 11:28 mobrovac@tin: Finished deploy [mathoid/deploy@91648aa]: (no justification provided) (duration: 00m 40s)
  • 11:28 mobrovac@tin: Started deploy [mathoid/deploy@91648aa]: (no justification provided)
  • 11:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1336.*.eqiad.wmnet
  • 11:00 mobrovac@tin: Started restart [changeprop/deploy@3c4f51d]: Pick up the new RESTBase DNS
  • 10:48 mobrovac@tin: Started restart [mobileapps/deploy@bf85a55]: Pick up the new RESTBase DNS
  • 10:45 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase,name=codfw
  • 09:57 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1337.*.eqiad.wmnet
  • 08:59 elukey: stop eventlogging mysql insertion on eventlog1001 to allow db1107 maintenance - T168414
  • 06:57 marostegui: Deploy schema change on db1101:3317 - T174569
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 - T174569 (duration: 01m 01s)
  • 06:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 - T174569 (duration: 01m 10s)
  • 06:47 kartik@tin: Finished deploy [cxserver/deploy@66e384e]: Update cxserver to cc01477 (duration: 04m 49s)
  • 06:43 kartik@tin: Started deploy [cxserver/deploy@66e384e]: Update cxserver to cc01477
  • 06:37 marostegui: Deploy schema change on s1 master db1052 - T174569
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 06m 56s)
  • 01:26 eileen: civicrm updated civicrm revision changed from ffa9d7fc7a to 429a5c5385, config revision is a7b9b58595
  • 01:18 eileen: update process-control to use different reference to civicrm_root (symlinks) process-control config revision is a7b9b58595
  • 01:01 reedy@tin: Synchronized php-1.31.0-wmf.15/resources/src/mediawiki/mediawiki.editfont.css: T182320 (duration: 01m 01s)
  • 00:59 reedy@tin: Synchronized php-1.31.0-wmf.15/extensions/Flow: T182320 (duration: 01m 18s)
  • 00:58 reedy@tin: Synchronized php-1.31.0-wmf.15/extensions/CodeMirror: T182320 (duration: 00m 59s)
  • 00:51 eileen: rollback smashPig SmashPig revision changed from ab7802d5b3 to 45aa62650c (locked), config revision is 4a4c61ae1b
  • 00:38 reedy@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters/dm/: RCFilters (duration: 01m 02s)
  • 00:36 reedy@tin: Synchronized php-1.31.0-wmf.15/resources/src/mediawiki.rcfilters/dm/: RCFilters (duration: 01m 02s)
  • 00:09 reedy@tin: Synchronized wmf-config/CirrusSearch-common.php: Lower ElasticSearch index refresh interval for Wikidata to 5s (duration: 01m 02s)
  • 00:06 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Add wmgCirrusSearchRefreshInterval (duration: 01m 02s)

2018-01-02

  • 22:04 herron: upgrading trusty puppet agents to puppet 4
  • 21:00 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.15
  • 20:59 demon@tin: Synchronized php-1.31.0-wmf.15/includes/Setup.php: Aaron made me do it (duration: 01m 04s)
  • 20:48 ottomata: restarting kafka-jumbo brokers for version 1.0 upgrade
  • 19:15 demon@tin: Finished scap: wmf.15 bootstrap (duration: 34m 55s)
  • 18:46 subbu: started linter-reparse script on terbium to reprocess itwiki pages (safe to kill -9 the script at any point)
  • 18:40 demon@tin: Started scap: wmf.15 bootstrap
  • 18:37 ebernhardson: T183053 update index.refresh_interval for wikidatawiki_{content,general} on eqiad to 5s
  • 18:30 jgleeson: Updating Smashpig from 45aa62650c to ab7802d5b3
  • 18:21 arlolra@tin: Finished deploy [parsoid/deploy@4d55952]: Updating Parsoid to 28d7734 (duration: 11m 57s)
  • 18:20 awight@tin: Finished deploy [ores/deploy@eb0f776]: Update ORES service to eb0f776: T182614 (duration: 19m 55s)
  • 18:10 moritzm: rebooting multatuli for kernel test
  • 18:09 arlolra@tin: Started deploy [parsoid/deploy@4d55952]: Updating Parsoid to 28d7734
  • 18:00 awight@tin: Started deploy [ores/deploy@eb0f776]: Update ORES service to eb0f776: T182614
  • 17:28 demon@tin: Pruned MediaWiki: 1.31.0-wmf.11 (duration: 01m 24s)
  • 17:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.10 (duration: 01m 29s)
  • 17:05 ejegg: updated payments-wiki from e91db27108 to 40145892e7
  • 17:01 jynus: add missing mysql grants to db1103:s4
  • 16:53 jynus: add missing mysql grants to db1097:s4
  • 16:51 herron: restarted exim and spamd services on fermium, mx1001 and mx2001 for openssl update
  • 16:48 elukey@puppetmaster1001: conftool action : set/weight=30; selector: name=mw13(29|3[0-3]).*.eqiad.wmnet
  • 16:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1029 (duration: 00m 51s)
  • 15:55 moritzm: installing openssl updates on restbase* hosts
  • 15:53 elukey@puppetmaster1001: conftool action : set/weight=20; selector: name=mw13(29|3[0-3]).*.eqiad.wmnet
  • 15:33 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1335.*.eqiad.wmnet
  • 15:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1055 & db1056 x1 weight (duration: 00m 50s)
  • 15:16 akosiaris: boot ganeti1008 with older 4.4 kernel and migrate multiple VMs to it. T181121
  • 15:05 zeljkof: EU SWAT finished
  • 15:04 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Set watchcreations preference to true by default on Commons (T178750) (duration: 00m 51s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set watchcreations preference to true by default on Commons (T178750) (duration: 00m 51s)
  • 14:54 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mapframe on lvwiki (T183661) (duration: 00m 51s)
  • 14:42 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Revert "Switch Wikipedias from $wgLogoHD to direct using of a SVG" (T178942) (duration: 00m 51s)
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Switch Wikipedias from $wgLogoHD to direct using of a SVG" (T178942) (duration: 00m 51s)
  • 14:33 zfilipin@tin: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:32 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Switch Wikipedias from $wgLogoHD to direct using of a SVG (T178942) (duration: 01m 59s)
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoreview/editor at ruwikt (T183719) (duration: 00m 51s)
  • 14:12 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create rollbacker user group for ruwiktionary (T183655) (duration: 00m 52s)
  • 14:07 foks: removed 2FA for Martin_Urbanec
  • 14:00 moritzm: installing further openssl updates
  • 13:49 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1333.*.eqiad.wmnet
  • 13:48 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1332.*.eqiad.wmnet
  • 13:47 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1331.*.eqiad.wmnet
  • 13:45 marostegui: Deploy alter table db1098:3317 - T174569
  • 13:45 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1330.*.eqiad.wmnet
  • 13:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 - T174569 (duration: 00m 51s)
  • 13:42 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1329.*.eqiad.wmnet
  • 13:41 elukey: enable live traffic for new appservers mw1329->mw1333 (T165519)
  • 13:00 moritzm: installing openssl updates on remaining mw* hosts in eqiad
  • 12:25 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 51s)
  • 12:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 & db1056 as x1 replicas (2nd try) (duration: 00m 51s)
  • 12:21 akosiaris: empty ganeti1008 for kernel downgrade. T181121
  • 12:11 jynus: add missing mysql grants to db1055 and db1056
  • 11:42 moritzm: installing ncurses security updates
  • 11:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Revert: Repool db1055 & db1056 as x1 replicas (duration: 00m 51s)
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 50s)
  • 11:31 mobrovac@tin: Finished deploy [citoid/deploy@ee0bdf4]: Update to service template v0.5.4 - T151394 (duration: 04m 19s)
  • 11:30 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 50s)
  • 11:27 mobrovac@tin: Started deploy [citoid/deploy@ee0bdf4]: Update to service template v0.5.4 - T151394
  • 09:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=api_appserver,name=mw1(1.*|200).eqiad.wmnet
  • 09:44 _joe_: setting api_appservers in the mw1180-1200 range to pooled=inactive, T183895
  • 09:43 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw1(1.*|200).eqiad.wmnet
  • 09:37 _joe_: setting appservers in the mw1180-1200 range to pooled=inactive, T183895
  • 09:28 godog: reboot ms-be1033 - T183724
  • 08:52 _joe_: restarting also mw1226-8, mw1223, mw1201, mw1203, mw1205-7
  • 08:36 _joe_: likewise for mw1285, mw1235, mw1232
  • 08:29 _joe_: restarting hhvm on mw1280,1282 for the same reasons
  • 08:26 _joe_: restarting hhvm on mw1317, multiple threads stuck in HPHP::jit::enterTCImpl
  • 08:23 elukey: restart druid coordinators on druid* to pick up new jvm settings
  • 08:19 _joe_: restarting hhvm on mw1313, concurrency HPHP::VariableUnserializer::unserializeVariant
  • 08:06 marostegui: Deploy alter table on db1039 (already depooled) - T174569
  • 07:56 marostegui: Deploy schema change on dbstore1001.s7 - T174569
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 52s)
  • 06:42 marostegui: Stop db1110 and dbstore1002.s5 replication in sync
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 to reimport dewiki.langlinks on dbstore1002 (duration: 00m 50s)

2018-01-01

2017-12-30

  • 18:22 ejegg: disabled failing silverpop data fetch jobs

2017-12-29

  • 18:31 mutante: gerrit: restart service to apply change from Dec 20th, avoid logspam due to unapplied config change
  • 15:38 ariel@tin: Finished deploy [dumps/dumps@51d4fd6]: enable pageslogging dump in parallel jobs (duration: 00m 02s)
  • 15:38 ariel@tin: Started deploy [dumps/dumps@51d4fd6]: enable pageslogging dump in parallel jobs
  • 14:46 apergos: restarted puppetdb on nitrogen, puppetdb refusing to hand out facts again
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 52s)
  • 08:19 marostegui: Stop replication in sync on dbstore1002.s5 and db1110
  • 08:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 to reimport dewiki.imagelinks on dbstore1002 (duration: 00m 52s)
  • 07:43 marostegui: Fixing dbstore1002 dewiki.imagelinks

2017-12-28

  • 14:18 marostegui: Power cycle pc2005 as it is down

2017-12-27

  • 19:54 godog: power reset ms-be1033
  • 19:47 godog: reboot ms-be1033 - T183724
  • 17:00 twentyafterfour: restarting apache on phab1001
  • 15:56 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Bounce Electron, stuck - T174916

2017-12-26

  • 16:33 mobrovac@tin: Finished deploy [mathoid/deploy@63b2ddc]: Bring back Mathoid in codfw to v0.6.5 in sync with eqiad - T179419 T172767 (duration: 01m 44s)
  • 16:31 mobrovac@tin: Started deploy [mathoid/deploy@63b2ddc]: Bring back Mathoid in codfw to v0.6.5 in sync with eqiad - T179419 T172767
  • 16:27 mutante: re-generated DNS zones to add new language lfn (Lingua Franca Nova) (T183561)
  • 12:00 Amir1: ladsgroup@terbium:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki hifwiktionary (T180785)
  • 11:59 Amir1: ladsgroup@terbium:~$ mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki aawiktionary --site-group wiktionary (T180785)
  • 11:05 ariel@tin: Finished deploy [dumps/dumps@8c44d0a]: re-enable file size reporting for dump files still being written (duration: 00m 02s)
  • 11:05 ariel@tin: Started deploy [dumps/dumps@8c44d0a]: re-enable file size reporting for dump files still being written
  • 08:34 akosiaris: migrate logstash1007 off ganeti1006 T181121
  • 08:21 akosiaris: migrate logstash1007 off ganeti1006
  • 03:52 madhuvishy: restart pdfrender on scb1002
  • 03:50 madhuvishy: restart pdfrender on scb1004
  • 03:32 madhuvishy: restart pdfrender on scb1003 following icinga page
  • 03:28 madhuvishy: restart pdfrender on scb1001 following icinga page

2017-12-24

  • 16:10 _joe_: restarted pdfrenderer on scb1002

2017-12-23

2017-12-22

  • 18:31 mobrovac@tin: Finished deploy [mathoid/deploy@7c5f8e2]: Better handling of chem rules (codfw only) - T183557 (duration: 02m 45s)
  • 18:28 mobrovac@tin: Started deploy [mathoid/deploy@7c5f8e2]: Better handling of chem rules (codfw only) - T183557
  • 16:42 demon@tin: Pruned MediaWiki: 1.31.0-wmf.11 [keeping static files] (duration: 01m 16s)
  • 16:05 demon@tin: Synchronized README: forcing co-master sync (duration: 00m 51s)
  • 15:56 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 51s)
  • 15:52 demon@tin: Pruned MediaWiki: 1.31.0-wmf.10 [keeping static files] (duration: 01m 18s)
  • 15:50 demon@tin: Pruned MediaWiki: 1.31.0-wmf.10 [keeping static files] (duration: 01m 47s)
  • 14:48 bblack: esams TLS cert switch from digicert-2016 to digicert-2017
  • 14:18 mobrovac@tin: Finished deploy [mathoid/deploy@6c29c09]: Update Mathoid to v0.7.0 in CODFW only to prefill storage, take 2 - T179419 T172767 (duration: 04m 23s)
  • 14:13 mobrovac@tin: Started deploy [mathoid/deploy@6c29c09]: Update Mathoid to v0.7.0 in CODFW only to prefill storage, take 2 - T179419 T172767
  • 14:05 mobrovac@tin: Finished deploy [mathoid/deploy@7f39b71]: Update Mathoid to v0.7.0 in CODFW only to prefill storage - T179419 T172767 (duration: 02m 57s)
  • 14:02 mobrovac@tin: Started deploy [mathoid/deploy@7f39b71]: Update Mathoid to v0.7.0 in CODFW only to prefill storage - T179419 T172767
  • 13:37 mobrovac: restbase depool restbase2008 for T179419
  • 13:17 bblack: repooling cp4032 - T183176
  • 06:44 jynus: start manual backup of db1029 onto db1056

2017-12-21

  • 19:25 robh: cp4032 going offline for memory swap
  • 17:24 jynus: starting manual backup of x1-master onto db1055
  • 17:01 volans: debugging Icinga notes_url (no side effects expected, but logging it in case there are any) T170353
  • 16:56 volans: restarted ircecho to pick up gerrit/399658
  • 15:43 joal@tin: Finished deploy [analytics/refinery@92f9318]: Deploying for new pageview_top_bycountry job (duration: 04m 40s)
  • 15:39 joal@tin: Started deploy [analytics/refinery@92f9318]: Deploying for new pageview_top_bycountry job
  • 12:39 elukey: restart druid historical/broker to apply new jvm settings to druid public workers (druid100[456] - https://gerrit.wikimedia.org/r/399617)
  • 12:07 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1333.*.eqiad.wmnet
  • 11:43 _joe_: refreshing puppet facts on the puppet compilers
  • 10:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1055 & db1056 (duration: 00m 51s)
  • 10:52 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1055 & db1056 (duration: 00m 51s)
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1082 and db1100 original weight - T161294 (duration: 00m 51s)
  • 10:15 elukey: rolling restart of eventbus on kafka* for openssl security updates
  • 10:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 and start restoring original weight for db1082 - T161294 (duration: 00m 48s)
  • 09:50 elukey: restart eventbus on kafka2001 for openssl updates
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 - T161294 (duration: 00m 51s)
  • 09:30 elukey: restart zookeeper on conf100[2,3] for jvm updates - T179943
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 - T161294 (duration: 00m 51s)
  • 09:20 elukey: restart zookeeper on conf1001 for jvm updates - T179943
  • 08:51 elukey: run kafka preferred-replica-election after maintenance of kafka1023 (fully bootstrapped now)
  • 08:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T161294 (duration: 00m 51s)
  • 08:34 marostegui: Stop replication in sync on db1100 and dbstore1002 - T161294
  • 08:23 elukey: repool mw1277 after investigation
  • 07:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1100 - T161294 (duration: 00m 52s)
  • 07:21 marostegui: Upgrade MariaDB on db1100
  • 07:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 and db1096:3315 - T161294 (duration: 00m 51s)
  • 06:55 _joe_: restarting hhvm on mw1231 and mw1208
  • 06:50 marostegui: Stop replication in sync on dbstore1002 - db1100 - T161294
  • 06:42 marostegui: Stop replication in sync on db1100 - db2052 - T161294
  • 06:36 marostegui: Remove some old files in dbstore1001:/srv/tmp to address the WARNING alert
  • 06:35 marostegui: Stop replication in sync db1100 and db1071 - T161294
  • 06:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T161294 (duration: 00m 51s)
  • 03:59 mutante: mw1315 kill and restart hhvm | mw1312 stop and start hhvm
  • 03:55 mutante: mw1290 kill and restart hhvm | mw1230 stop and start hhvm
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 05m 31s)
  • 01:16 eileen: update CiviCRM civicrm revision changed from 9d6ba74f57 to ffa9d7fc7a, config revision is da6fa4cce9

2017-12-20

  • 23:26 twentyafterfour: restarting apache on phab1001 to deploy a hotfix for T144184
  • 21:42 mepps: updated dash from 114131713e to b01458b260
  • 21:20 otto@tin: Finished deploy [analytics/refinery@548dad7]: deploying refinery v0.0.56 with JsonRefine fixes to allow Popups schema to be refined. This is a no-op for everything else (duration: 04m 51s)
  • 21:15 otto@tin: Started deploy [analytics/refinery@548dad7]: deploying refinery v0.0.56 with JsonRefine fixes to allow Popups schema to be refined. This is a no-op for everything else
  • 20:31 ebernhardson: T183053 update elasticsearch settings for wikidatawiki_content on codfw to use: index.refresh_interval=5s
  • 20:18 XioNoX: remove local-as from cr2-esams IX4
  • 19:46 XioNoX: remove local-as from cr2-esams IX6
  • 17:57 _joe_: depooling mw1277 for further investigation
  • 17:51 demon@tin: Synchronized wmf-config/InitialiseSettings.php: T172875 (second try, forgot to pull first) (duration: 00m 51s)
  • 17:48 elukey: new mw jobrunner in production (mw1334) - T165519
  • 17:47 demon@tin: Synchronized wmf-config/InitialiseSettings.php: T172875 (duration: 00m 51s)
  • 17:46 demon@tin: Synchronized wmf-config/Wikibase-production.php: T180614 (duration: 00m 51s)
  • 17:43 elukey: restart zookeeper on conf2003 for jvm updates - T179943
  • 17:42 elukey: restart hhvm on mw1316
  • 17:38 demon@tin: Synchronized wmf-config/InitialiseSettings.php: sandbox link on Atikamekw 'pedia (duration: 00m 52s)
  • 17:38 elukey: restart hhvm on mw1234
  • 17:32 elukey: restart zookeeper on conf2002 for jvm updates - T179943
  • 17:20 demon@tin: Synchronized wmf-config/CommonSettings.php: more n0-0ps (duration: 00m 52s)
  • 17:16 demon@tin: Synchronized multiversion/submodules.json: no-op (duration: 00m 51s)
  • 17:09 demon@tin: Synchronized README: noop (duration: 00m 51s)
  • 16:27 ema: lvs1003: stop pybal, clean ipvs services, start pybal
  • 16:25 ema: lvs1006: stop pybal, clean ipvs services, start pybal
  • 15:54 moritzm: rolling restart of scb* hosts in eqiad to pick up openssl update
  • 15:47 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1334.eqiad.wmnet
  • 15:38 moritzm: rolling restart of remaining scb* hosts in codfw to pick up openssl update
  • 14:26 ema: start pybal on lvs2003
  • 14:24 ema: stop pybal on lvs2003, clean IPVS table after traffic failover to get rid of trendingedits `TCP 10.2.1.9:6699 wrr`
  • 14:14 moritzm: upgrading openssl on wdqs (along with system service restarts)
  • 14:13 ema: bounce pybal on lvs2006 and clean IPVS table
  • 14:06 elukey: temporarily shutdown kafka on kafka1023 to move some topic partitions on different disk partition (disk space usage alerts)
  • 13:49 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1276.eqiad.wmnet
  • 13:38 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1276.eqiad.wmnet
  • 13:38 moritzm: upgrading openssl on wdqs (along with system service restarts)
  • 13:35 moritzm: rolling restart of aqs to pick up openssl security update
  • 13:20 moritzm: upgrading openssl on aqs/druid clusters (along with system service restarts)
  • 13:11 jynus: restart dbproxy1008 to test workaround is working on cold restart
  • 13:01 moritzm: upgrading openssl on hadoop cluster (along with system service restarts)
  • 12:34 marostegui: Enable notifications for db1100 - T161294
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 and db1109 - T161294 (duration: 00m 51s)
  • 12:18 moritzm: installing iproute2 bugfix update from stretch point release
  • 11:50 godog: create k8s-staging LVs in prometheus/eqiad - T163692
  • 11:37 moritzm: upgrading openssl on elastic* (along with system service restarts)
  • 11:27 akosiaris: repool ganeti1006, rebalance row_A ganeti nodegroup. T181121
  • 11:18 jynus: restart dbproxy1005 to test cold service start
  • 11:11 marostegui: Stop replication in sync on db1100 and db1071 - T161294
  • 10:57 arturo: remove old kernel packages from silver.wikimedia.org to free space
  • 09:56 jynus: restart dbproxy1001 to test cold service start
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T174569 (duration: 00m 51s)
  • 09:24 jynus: disable puppet on dbproxies for gerrit:399359 deployment
  • 08:44 moritzm: powercycling ganeti1005
  • 08:30 moritzm: installing rsync security updates
  • 08:24 jynus: starting reimage of dbproxy1008
  • 08:11 marostegui: Stop replication in sync on db1100 and dbstore1002 - T161294
  • 07:56 jynus: starting reimage of dbproxy1007
  • 07:35 jynus: starting reimage of dbproxy1005
  • 07:24 jynus: upgrading and restarting dbproxy1004
  • 07:09 marostegui: Stop replication in sync on db1096:3315 and db1100 - T161294
  • 07:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T161294 (duration: 00m 51s)
  • 06:53 marostegui: Stop replication in sync on db1101:3318 and db1109 - T161294
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 depool db1101:3318 - T161294 (duration: 00m 51s)
  • 06:19 marostegui: Deploy schema change on db1105:3311 - T174569
  • 06:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T174569 (duration: 00m 52s)
  • 06:06 Jamesofur: reset mailman password for tawikisource T183329
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 05m 45s)

2017-12-19

  • 21:51 XioNoX: removing local-as AS43821 from ams transits - T167840
  • 19:49 mutante: deleted ganglia.wikimedia.org from DNS - webserver was already down since yesterday - not used anymore (T177225)
  • 19:23 mutante: webperf1001/webperf2001 - rebooting for kernel upgrades (not used yet)
  • 19:18 mutante: gerrit2001 - reboot for kernel upgrade
  • 18:14 moritzm: installing zsh update from stretch point release
  • 17:11 jynus: purging ferm from dbproxy1002, 3, 6, 9, 10 and 11
  • 17:00 moritzm: installing libxv security updates on jessie
  • 16:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T161294 (duration: 00m 51s)
  • 16:45 marostegui: Defragment s7 databases on db1102 - https://phabricator.wikimedia.org/T172169
  • 16:38 awight@tin: Synchronized wmf-config/CirrusSearch-labs.php: wmf-config/CommonSettings-labs.php wmf-config/db-labs.php wmf-config/InitialiseSettings-labs.php wmf-config/interwiki-labs.php wmf-config/jobqueue-labs.php wmf-config/mc-labs.php wmf-config/mobile-labs.php wmf-config/Wikibase-labs.php Sync out labs config changes (duration: 00m 51s)
  • 16:34 elukey: manually started eventlogging cleaner on db1107 to purge/sanitize data up to 90 days ago (tmux is running for user eventlogcleaner) - T108850
  • 16:22 moritzm: installing ncurses updates from jessie point release
  • 15:26 jgleeson: switched back on donations queue consumer and thank you mailer service
  • 15:25 chasemp: labvirt10[19|20] aptitude install linux-image-4.4.0-81-generic linux-image-extra-4.4.0-81-generic; sudo update-grub; /sbin/reboot T172538
  • 15:10 marostegui: Stop replication in sync on db1109 and db1099:3318 - https://phabricator.wikimedia.org/T161294
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 and db1109 - T161294 (duration: 00m 51s)
  • 15:09 jgleeson: updated civicrm from e0ee2d189c to 9d6ba74f57
  • 15:04 jgleeson: switched off donations queue consumer and thank you mailer service in preparation for new release
  • 15:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T161294 (duration: 00m 52s)
  • 14:49 moritzm: restarting hhvm on canary app servers to pick up security updates for openssl, icu and libx11
  • 14:44 moritzm: installing request-tracker4 update from jessie point release on ununpentium
  • 14:28 jynus: disabling puppet on dbproxies for 399164 deploy
  • 13:57 moritzm: upgrading pdns-recursor on achernar/acamar to 4.0.4+deb9u3~bpo8+1 (security fix)
  • 13:13 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1317.eqiad.wmnet
  • 13:05 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1318.eqiad.wmnet
  • 13:05 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw133[0-1].eqiad.wmnet
  • 12:12 hashar: CI: switching mwgate-composer-php70 job from Nodepool to Docker | https://gerrit.wikimedia.org/r/#/c/398921/
  • 11:59 hashar: CI: switching composer-php55 / composer-package-php55 jobs from Nodepool to Docker | https://gerrit.wikimedia.org/r/#/c/398920/
  • 11:29 moritzm: upgrading pdns-recursor on maerlant to 4.0.4+deb9u3~bpo8+1 (security fix)
  • 11:28 hashar@tin: Finished deploy [docker-pkg/deploy@09087ad]: Bumping Jinja2 2.9.6..2.10 (duration: 00m 30s)
  • 11:28 hashar@tin: Started deploy [docker-pkg/deploy@09087ad]: Bumping Jinja2 2.9.6..2.10
  • 11:16 moritzm: uploaded pdns-recursor 4.0.4+deb9u3~bpo8+1 to apt.wikimedia.org
  • 10:54 ariel@tin: Finished deploy [dumps/dumps@2bafffe]: allow dump runs in specified wiki list order, rather than by longest to wait (duration: 00m 02s)
  • 10:54 ariel@tin: Started deploy [dumps/dumps@2bafffe]: allow dump runs in specified wiki list order, rather than by longest to wait
  • 10:52 moritzm: upgrading pdns-recursor on nescio to 4.0.4+deb9u3~bpo8+1 (security fix)
  • 10:47 elukey: restart zookeeper on conf2001 for jvm updates - T179943
  • 10:45 jynus: disabling puppet on dbproxies for 398450 deploy
  • 10:38 godog: rollout updated version of prometheus-nutcracker-exporter
  • 09:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T161294 (duration: 00m 51s)
  • 08:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T161294 (duration: 00m 51s)
  • 08:35 moritzm: reimaging mw1317 (video scaler) to stretch
  • 08:28 marostegui: Stop replication in sync on db2045 and db1109 - T161294
  • 08:21 moritzm: installing openssl security updates
  • 08:05 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2246.codfw.wmnet
  • 08:05 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2119.codfw.wmnet
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T161294 (duration: 00m 51s)
  • 06:53 mobrovac@tin: Finished deploy [restbase/deploy@2b75a64]: Bug fix: Add the time_to_live config option to the Parsoid module (duration: 04m 26s)
  • 06:51 marostegui: Stop replication in sync on db1106 and db2052 - T161294
  • 06:49 mobrovac@tin: Started deploy [restbase/deploy@2b75a64]: Bug fix: Add the time_to_live config option to the Parsoid module
  • 06:40 marostegui: Stop replication in sync on db1106 and dbstore1002 s5 - T161294
  • 06:29 marostegui: Stop replication in sync on db1100 and db1106 - T161294
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T161294 (duration: 00m 53s)
  • 06:09 marostegui: Deploy schema change on db1065 (s1 sanitarium master) with replication, so some lag will be generated on labs - T174569
  • 05:18 andrewbogott: restarting slapd on seaborgium (in response to ldap complaints on the grid master)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 05m 22s)
  • 00:44 mutante: einsteinium: sudo systemctl restart ircecho (alias kick-icinga-wm)

2017-12-18

  • 22:34 ejegg: updated payments-wiki from f594dfa763 to e91db27108
  • 21:00 mutante: uranium - apt-get remove ganglia-webfrontend, apache2
  • 20:53 mutante: ganglia.wikimedia.org shut down just now after a deprecation period - service is out of commission - T177225
  • 20:53 chasemp: reboot labtestvirt2003
  • 20:49 mutante: install1002/2002 - killing all ganglia processes, decoming aggregators
  • 20:48 bawolff@tin: Synchronized php-1.31.0-wmf.12/extensions/TemplateData/TemplateDataBlob.php: T118682 (duration: 00m 52s)
  • 19:49 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4032.ulsfo.wmnet
  • 18:42 moritzm: installing xml2 updates from stretch point release
  • 18:28 moritzm: installing libxkbcommon updates from stretch point release
  • 18:11 moritzm: installing python updates from stretch point release
  • 17:51 elukey: run kafka preferred-replica-election on the analytics cluster to allow kafka1023 (new node) to become a partition leader
  • 16:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.11 [keeping static files] (duration: 01m 18s)
  • 16:17 demon@tin: Pruned MediaWiki: 1.31.0-wmf.8 (duration: 04m 58s)
  • 16:15 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw133[0-7].eqiad.wmnet
  • 16:09 thcipriani@tin: Synchronized README: noop sync to test scap 3.7.4-3 (duration: 03m 02s)
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T174569 (duration: 03m 03s)
  • 15:47 marostegui: Stop MySQL on db1111 to copy its content to db1112 - T180788
  • 15:45 jynus: stop and upgrade db1107 T183123
  • 15:37 marostegui: Stop db1100 and dbstore1002 in sync - T161294
  • 15:28 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1329.eqiad.wmnet
  • 15:23 moritzm: uploaded prometheus-blazegraph-exporter, prometheus-wdqs-updater-exporter and prometheus-pdns-exporter to apt.wikimedia.org
  • 15:16 chasemp: reboot labtestvirt2003
  • 14:41 ema: upgrade pinkunicorn to latest jessie point release (8.10) T182656
  • 14:13 elukey: temporarily stopped mysql consumers on eventlog1001 to ease a mysql backup on db1107 - T183123
  • 13:58 jynus: starting one-time backup of eventlogging database on db1107:/srv/backups T183123
  • 13:29 marostegui: Stop replication in sync on db1109 and db2045 - T161294
  • 13:25 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1307.eqiad.wmnet
  • 13:25 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1313.eqiad.wmnet
  • 13:21 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: mw1313.eqiad.wmnet
  • 13:20 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1260.eqiad.wmnet
  • 13:20 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: mw1260.eqiad.wmnet
  • 13:18 marostegui: Stop replication in sync on db1100 and db2052 - T161294
  • 13:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T161294 (duration: 03m 06s)
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T161294 (duration: 03m 03s)
  • 12:58 marostegui: Stop replication in sync on db1106 and db1100 - T161294
  • 12:35 marostegui: Deploy schema change on db1099:3311 and db1067 - T174569
  • 12:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T174569 (duration: 03m 05s)
  • 12:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T174569 (duration: 03m 06s)
  • 12:20 akosiaris: build scap 3.7.4-3 and upload to jessie-wikimedia, stretch-wikimedia, trusty-wikimedia. T183046, T182347
  • 11:37 hashar: restarting Jenkins CI to upgrade the monitoring plugin
  • 11:24 _joe_: ran cleanup script on scb* T180384
  • 11:20 mobrovac: stopping the trending edits service - T180384
  • 11:18 moritzm: reimaging mw1307 (video scaler) to stretch
  • 11:08 _joe_: rolling restart of pybal on the low-traffic balancers
  • 11:04 _joe_: disabled notifications for trendingedits.svc T180384
  • 10:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 - T161294 (duration: 00m 56s)
  • 10:30 marostegui: Stop replication on db1109 and db2045 in sync - T161294
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T161294 (duration: 00m 56s)
  • 09:47 marostegui: Deploy schema change on db1066 - T174569
  • 09:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T174569 (duration: 00m 56s)
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T174569 (duration: 00m 57s)
  • 09:40 moritzm: installing openssl security updates
  • 09:39 marostegui: Deploy schema change on db1055 (already depooled) - T174569
  • 09:28 gehel: removing initial import datafiles from maps[12]001
  • 08:58 marostegui: Stop replication in sync on db1100 and db2052 - T161294
  • 08:57 elukey: rolling restart of the Yarn nodemanagers (hadoop) on analytics10[456]* to pick up new settings - T182276
  • 08:30 Jamesofur: insert decryption key for 2017 Arb elections
  • 08:08 volans: powercycling ganeti1005
  • 06:57 _joe_: re-enabling puppet across servers with scap
  • 06:44 marostegui: Defragment s2 databases on db1102 - T172169
  • 06:43 _joe_: restarted hhvm on mw1283, still the same kind of lockups
  • 06:33 marostegui: Deploy schema change on db1073 - T174569
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T174569 (duration: 00m 57s)
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T161294 (duration: 00m 56s)
  • 06:17 marostegui: Stop replication in sync on db1106 and db1100 - T161294
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T161294 (duration: 00m 57s)
  • 03:02 eileen: re-enable multiqueue_consumer process-control config revision is 1ae3778278
  • 03:00 eileen: civicrm revision changed from 43d3f4d739 to e0ee2d189c
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 05m 57s)
  • 01:17 cwd: disabled fredge multiqueue consumer
  • 00:49 eileen: civicrm revision changed from 798e24671b to 43d3f4d739

2017-12-16

  • 17:27 no_justification: gerrit: Back, might see a few transient puppet failures if git pulls happened during the d/t, but should all recover
  • 17:21 no_justification: gerrit: halting service momentarily for account reindexing
  • 15:00 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 00m 58s)
  • 12:29 ariel@tin: Finished deploy [dumps/dumps@95dbfe6]: revert previous deploy (duration: 00m 02s)
  • 12:29 ariel@tin: Started deploy [dumps/dumps@95dbfe6]: revert previous deploy
  • 12:16 ariel@tin: Finished deploy [dumps/dumps@faf7de8]: use cat to recombine gzipped files (duration: 00m 02s)
  • 12:16 ariel@tin: Started deploy [dumps/dumps@faf7de8]: use cat to recombine gzipped files
  • 02:04 mutante: labweb1002 - manually downgrade to scap 3.7.4-1 (disabled puppet)
  • 01:49 mutante: reimported scap 3.7.4-1 into APT (jessie-wikimedia) after fixing md5/sha sums in .dsc and .changes files to match orig.tar.gz | copied it from jessie-wikimedia to trusty and stretch-wikimedia. all distributions downgraded to 3.7.4-1 (T183046)
  • 01:42 mutante: reimported scap 3.7.4-1 into APT (jessie-wikimedia) after fixing md5/sha sums in .dsc and .changes files to match orig.tar.gz
  • 01:15 mutante: reprepro removing scap 3.7.4-2 package, attempting to reimport 3.7.4-1 package
  • 01:11 mutante: reprepro remove trusty-wikimedia scap
  • 00:44 demon@tin: Synchronized php-1.31.0-wmf.12/extensions/LoginNotify/includes/LoginNotify.php: T182867 (duration: 00m 57s)
  • 00:40 demon@tin: Synchronized README: Testing again, this time with feeling (duration: 00m 56s)
  • 00:28 mutante: re-disabled puppet on labweb1001/labweb1002 (as it was before)
  • 00:26 mutante: re-enabling puppet on scap hosts
  • 00:25 demon@tin: Synchronized README: Testing (duration: 00m 57s)
  • 00:10 mutante: no more scap 3.7.4-2 found across 'R:Package = scap' (T183046)

2017-12-15

  • 23:55 mutante: downgrading scap from 3.7.4-2 to 3.7.4-1 where it is installed - cumin -b 10 -s 5 'R:Package = scap' 'if dpkg -l scap | grep "3.7.4.2" && file /var/cache/apt/archives/scap_3.7.4-1_all.deb; then puppet agent --disable; apt-get remove --yes -q scap ; dpkg -i /var/cache/apt/archives/scap_3.7.4-1_all.deb ; fi' targeting 478 hosts (T183046)
  • 23:37 mutante: aqs1004, analytics1003, downgraded scap to 3.7.4-1
  • 22:55 mutante: tin - apt-get remove scap ; dpkg -i /var/cache/apt/archives/scap_3.7.4-1_all.deb
  • 19:39 chasemp: reboot labtestvirt2003
  • 18:15 jgleeson: rolled civicrm back to 798e2467 to investigate prometheus bug
  • 17:32 jynus: stop, upgrade and reboot labsdb1009
  • 17:18 jynus: reloading dbproxy1010
  • 16:44 chasemp: labtestvirt2003:~# /sbin/reboot to pick up new kernel
  • 16:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1084 original weight (duration: 00m 57s)
  • 16:10 elukey: re-enable piwik on bohrium after mysql backup restore
  • 15:11 chasemp: reimage labtestvirt2003.codfw.wmnet
  • 13:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and restore original weight for db1081 (duration: 00m 56s)
  • 13:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1081 (duration: 00m 56s)
  • 13:20 chasemp: disable puppet across eqiad lab* things to land a bit of code gracefully
  • 13:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 56s)
  • 12:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 with low weight - T180788 (duration: 00m 57s)
  • 10:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T174569 (duration: 00m 56s)
  • 10:59 jynus: reloading dbproxy1011
  • 10:50 marostegui: Stop MySQL on db1084 to clone db1111 - T180788
  • 10:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 to clone db1111 - T180788 (duration: 00m 56s)
  • 10:31 elukey: rolling restart of yarn nodemanagers on an103* to apply new config - T182276
  • 10:14 moritzm: uploaded prometheus-dns-rec-exporter 0.3 to apt.wikimedia.org
  • 09:50 elukey: restore piwik database on bohrium after mysql corruption - piwik disabled
  • 09:26 marostegui: Deploy schema change on db1080 - T174569
  • 09:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T174569 (duration: 00m 56s)
  • 09:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 - T174569 (duration: 00m 56s)
  • 08:50 moritzm: powercycling ganeti1005
  • 08:38 marostegui: Stop MySQL on db1034 to decommission it - T182556
  • 08:26 marostegui: Remove db1034 from tendril - T182556
  • 06:50 marostegui: Deploy schema change on db1083 - T174569
  • 06:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T174569 (duration: 00m 56s)
  • 06:43 marostegui: Deploy schema change on dbstore1001 (s1) - T174569
  • 06:29 marostegui: Fix dbstore1001 s5 replication
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 and db1109 - T161294 (duration: 01m 16s)
  • 03:01 mutante: db2016 thru db2019 - had to manually kill gmond process to decom ganglia, other db codfw hosts: didn't need it | running puppet on db205* and others in codfw to remove all ganglia (T177225)
  • 01:23 catrope@tin: Synchronized php-1.31.0-wmf.12/extensions/ORES: Fix broken join conditions (T182936) (duration: 00m 57s)
  • 01:22 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 01:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Don't discount file searches on commonswiki (duration: 00m 57s)
  • 01:11 catrope@tin: Synchronized php-1.31.0-wmf.11: Pulling in today's cherry-picks into wmf.11 too (duration: 10m 49s)
  • 00:44 catrope@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters/ui/mw.rcfilters.ui.MenuSelectWidget.js: T182711 (duration: 00m 56s)
  • 00:33 catrope@tin: Synchronized php-1.31.0-wmf.12/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTargetEvents.js: Track editor mode on save events (T182610) (duration: 00m 56s)
  • 00:25 catrope@tin: Synchronized php-1.31.0-wmf.12/resources/lib/oojs-ui/oojs-ui-core.js: Backport OOjs UI fix for T182359, T182395 (duration: 00m 57s)
  • 00:14 RoanKattouw: updateCollation.php finished on sewiki (T181503)
  • 00:13 RoanKattouw: Running updateCollation.php on sewiki
  • 00:13 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Set category collation for sewiki to uppercase-se (duration: 00m 57s)

2017-12-14

  • 22:58 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.12 (#2)
  • 22:47 demon@tin: Synchronized php-1.31.0-wmf.12/extensions/ORES/includes/Hooks/ApiHooksHandler.php: fix undefined property (duration: 01m 08s)
  • 22:17 demon@tin: rebuilt and synchronized wikiversions files: rollback, ORES breaking stuff on enwiki
  • 22:08 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.12
  • 21:53 ejegg: updated CiviCRM from 1086291bb3 to 41a23f9fc3
  • 20:33 awight: stress testing ores*
  • 20:30 demon@tin: Pruned MediaWiki: 1.31.0-wmf.10 [keeping static files] (duration: 03m 32s)
  • 20:28 jgleeson: turned back on donations queue consume service and thank you mailer service
  • 20:21 awight@tin: Started restart [ores/deploy@b67bba7]: (non-production) Restart ORES services on ores*
  • 20:19 thcipriani@tin: Finished scap: SWAT: Update en/i18n message for multiple-unclosed-formatting-tags (duration: 30m 55s)
  • 20:02 jgleeson: updated civicrm from 798e24671b to 1086291bb3
  • 20:02 urandom: lowering cassandra compaction throughput to 5MB/s, restbase101{2,4}-{a,b,c}
  • 19:58 jgleeson: Temporarily disabled donations queue consume service and thank you mailer service
  • 19:48 thcipriani@tin: Started scap: SWAT: Update en/i18n message for multiple-unclosed-formatting-tags
  • 19:43 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Define new throttle rule and cleaning expired rules T182889 (duration: 01m 08s)
  • 19:29 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgNamespaceRobotPolicies for wikidata T181525 (duration: 01m 04s)
  • 19:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove single editor tab for plwiki T181045 (duration: 01m 09s)
  • 19:02 awight@tin: Finished deploy [ores/deploy@b67bba7]: Redeploy ORES to scb1001 (duration: 01m 04s)
  • 19:01 awight@tin: Started deploy [ores/deploy@b67bba7]: Redeploy ORES to scb1001
  • 19:01 bsitzmann@tin: Finished deploy [mobileapps/deploy@bf85a55]: Update mobileapps to ff74bb1 (T182868 T182774) (duration: 12m 26s)
  • 18:54 awight@tin: Started restart [ores/deploy@b67bba7]: Restart ORES services
  • 18:51 arlolra: Updated Parsoid to ca20680 (T182793, T182774)
  • 18:48 bsitzmann@tin: Started deploy [mobileapps/deploy@bf85a55]: Update mobileapps to ff74bb1 (T182868 T182774)
  • 18:41 arlolra@tin: Finished deploy [parsoid/deploy@13b5cb5]: (no justification provided) (duration: 09m 19s)
  • 18:32 arlolra@tin: Started deploy [parsoid/deploy@13b5cb5]: (no justification provided)
  • 18:24 elukey: replace kafka1018 with kafka1023 (Analytics Kafka cluster)
  • 18:11 awight@tin: Started deploy [ores/deploy@b67bba7]: Update ORES service to b67bba77acb
  • 16:34 bd808: Scholarships: Set application start and close dates via web UI (T181072)
  • 16:25 niharika29@tin: Finished deploy [scholarships/scholarships@872381d]: Deploy wikimania scholarships app for 2018 T181072 (duration: 00m 02s)
  • 16:25 niharika29@tin: Started deploy [scholarships/scholarships@872381d]: Deploy wikimania scholarships app for 2018 T181072
  • 16:07 bd808: Scholarships: updated database schema with 20171212-add-scholarship-orgs-field.sql (T181072)
  • 15:48 marostegui: Deploy schema change on dbstore1002 (s1) - T174569
  • 15:25 jynus: stop, upgrade and restart labsdb1011
  • 15:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T174569 (duration: 01m 08s)
  • 14:21 hashar@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters: Swat for RCFilters https://gerrit.wikimedia.org/r/#/c/398242/ https://gerrit.wikimedia.org/r/#/c/398239/ (duration: 01m 09s)
  • 13:47 chasemp: I'm reverting https://gerrit.wikimedia.org/r/#/c/394541/ as it broke the database puppet for labservices (used for powerdns backend)
  • 13:42 godog: upgrade grafana to 4.6.3 - T182294
  • 13:41 elukey: update facts for puppet compiler to pick up new hosts
  • 13:32 mobrovac@tin: Finished deploy [restbase/deploy@187d8ba]: Remove Trending Edits end point and stop storing feed results in Cassandra - T180384 T179412 (duration: 05m 37s)
  • 13:26 mobrovac@tin: Started deploy [restbase/deploy@187d8ba]: Remove Trending Edits end point and stop storing feed results in Cassandra - T180384 T179412
  • 13:10 marostegui: Deploy schema change on db1089 (s1) - T174569
  • 13:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T174569 (duration: 01m 07s)
  • 12:56 awight: stress testing on ores1*.eqiad.wmnet cluster, T182249
  • 12:42 awight@tin: Finished deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster (duration: 00m 59s)
  • 12:41 awight@tin: Started deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster
  • 12:40 awight@tin: Finished deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster (duration: 00m 45s)
  • 12:39 awight@tin: Started deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster
  • 12:18 gehel: re-initialize cassandra on maps-test2001 - T182583
  • 12:11 jynus: disable puppet on all databases to deploy safely https://gerrit.wikimedia.org/r/398246
  • 12:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 - T174569 (duration: 01m 08s)
  • 11:27 awight@tin: Finished deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster (duration: 00m 20s)
  • 11:27 awight@tin: Started deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster
  • 11:24 awight@tin: Finished deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster (duration: 00m 05s)
  • 11:23 awight@tin: Started deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster
  • 11:23 awight@tin: Finished deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster (duration: 00m 11s)
  • 11:23 awight@tin: Started deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster
  • 11:16 akosiaris@tin: Finished deploy [ores/deploy@b67bba7]: T181661 (duration: 00m 03s)
  • 11:16 akosiaris@tin: Started deploy [ores/deploy@b67bba7]: T181661
  • 11:14 akosiaris@tin: Finished deploy [ores/deploy@b67bba7]: T181661 (duration: 00m 03s)
  • 11:14 akosiaris@tin: Started deploy [ores/deploy@b67bba7]: T181661
  • 11:13 akosiaris@tin: Finished deploy [ores/deploy@b67bba7]: T181661 (duration: 00m 55s)
  • 11:12 akosiaris@tin: Started deploy [ores/deploy@b67bba7]: T181661
  • 09:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 (duration: 01m 08s)
  • 08:39 marostegui: Reload dbproxy1011 config
  • 07:18 marostegui: Stop replication and set read-only on labsdb1003 - T142807
  • 07:08 marostegui: Stop replication in sync on db1109 and db1101:3318 - T161294
  • 07:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 and db1109 - T161294 (duration: 01m 07s)
  • 07:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 and db1109 - T161294 (duration: 01m 08s)
  • 06:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 and db1109 - T161294 (duration: 01m 09s)
  • 06:35 marostegui: Deploy schema change on db1091 (s4) - T174569
  • 06:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T174569 (duration: 01m 08s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.11) (duration: 08m 05s)
  • 01:46 twentyafterfour: no phabricator deployment tonight, not enough time to prepare and test the update due to a short outage earlier this evening.
  • 00:12 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T182534 (duration: 01m 08s)

2017-12-13

  • 23:54 twentyafterfour: restarted phd on phab1001 (for good measure)
  • 23:23 reedy@tin: Synchronized php-1.31.0-wmf.12/extensions/Flow/Hooks.php: unbreak CheckUser (duration: 01m 08s)
  • 23:22 foks: deleted 22 illegal images from server
  • 22:51 volans: restarting apache2 on phab1001, phabricator timing out
  • 22:23 ppchelko@tin: Finished deploy [cpjobqueue/deploy@044cd23]: Fix sha1-based deduplication (duration: 00m 34s)
  • 22:23 ppchelko@tin: Started deploy [cpjobqueue/deploy@044cd23]: Fix sha1-based deduplication
  • 22:03 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e62d8e3]: Update mobileapps to ddddebb (duration: 05m 24s)
  • 21:58 mholloway-shell@tin: Started deploy [mobileapps/deploy@e62d8e3]: Update mobileapps to ddddebb
  • 21:54 ejegg: updated civicrm from 85a8526eb8 to 798e24671b
  • 21:37 mholloway-shell@tin: Finished deploy [mobileapps/deploy@b8082da]: Update mobileapps to bf67c97 (duration: 04m 19s)
  • 21:33 mholloway-shell@tin: Started deploy [mobileapps/deploy@b8082da]: Update mobileapps to bf67c97
  • 21:24 ppchelko@tin: Finished deploy [restbase/deploy@a993556]: Do not fallback if the revision is not specified T182770 (duration: 04m 04s)
  • 21:20 ppchelko@tin: Started deploy [restbase/deploy@a993556]: Do not fallback if the revision is not specified T182770
  • 20:27 demon@tin: Synchronized php-1.31.0-wmf.12/extensions/WikimediaEvents/extension.json: James_F made me do it (duration: 01m 08s)
  • 20:14 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.12
  • 20:10 demon@tin: Synchronized php: symlink bump for wmf.12 (duration: 01m 07s)
  • 19:37 ebernhardson@tin: Synchronized php-1.31.0-wmf.12/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: SWAT: VE trackSubscriber: Add timing data for 'loaded' state (duration: 01m 07s)
  • 19:35 ppchelko@tin: Finished deploy [restbase/deploy@3f4bedc]: Remove references to Cassandra 2 from Parsoid storage T179417 (duration: 04m 43s)
  • 19:34 ebernhardson@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters/mw.rcfilters.UriProcessor.js: SWAT: T182734: RCFilters: support target page with a subpage (duration: 01m 07s)
  • 19:30 ppchelko@tin: Started deploy [restbase/deploy@3f4bedc]: Remove references to Cassandra 2 from Parsoid storage T179417
  • 19:25 ebernhardson@tin: Synchronized php-1.31.0-wmf.12/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: SWAT: VE trackSubscriber: data isn't required (duration: 01m 08s)
  • 19:14 ebernhardson@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters/: SWAT: T182788: RCFilters: Fix live update (duration: 01m 08s)
  • 19:07 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Turn on cirrus MLR for 4 more wikis (duration: 01m 09s)
  • 19:05 mutante: phab[12]001,mx[12]001,mendelevium,fermium: rm /usr/local/bin/exim-to-gmetric and remove root's crontab lines to follow-up gerrit:382916
  • 18:31 mutante: releases2001: /srv/mediawiki# rm -rf extensions/ skins/ vendor/ | clean up removed repos, let puppet clone, to match releases1001 and fix puppet run
  • 17:33 awight@tin: Finished deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster (duration: 00m 59s)
  • 17:32 awight@tin: Started deploy [ores/deploy@b67bba7]: (non-production) Update ORES on new cluster
  • 17:30 moritzm: installing wireshark security updates
  • 16:30 marostegui: Deploy schema change on s4 on dbstore1001 - T174569
  • 16:19 godog: try again deleting obsolete cassandra metrics from graphite2002 - T181964
  • 15:50 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: (no justification provided) (duration: 02m 22s)
  • 15:48 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: (no justification provided)
  • 15:47 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: (no justification provided) (duration: 01m 30s)
  • 15:46 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: (no justification provided)
  • 15:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1034 from config - T182556 (duration: 01m 07s)
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1034 from config - T182556 (duration: 01m 09s)
  • 15:39 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: (no justification provided) (duration: 04m 46s)
  • 15:34 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: (no justification provided)
  • 15:21 chasemp: remove and purge vblade-persist and runit from labstore1004 T182781
  • 15:02 jynus: upgrade and restart db1067
  • 15:02 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 (duration: 01m 07s)
  • 14:43 zeljkof: EU SWAT finished
  • 14:38 zfilipin@tin: Synchronized portals: SWAT: Bumping portals to master (T128546) (duration: 01m 09s)
  • 14:37 zfilipin@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T128546) (duration: 01m 08s)
  • 14:29 dcausse@tin: Synchronized php-1.31.0-wmf.12/extensions/Wikibase: T182293 Extract names of search fields as constants (duration: 02m 05s)
  • 14:22 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T182293 [cirrus] tune wikidata similarity configuration 2/2 (duration: 01m 07s)
  • 14:20 dcausse@tin: Synchronized wmf-config/Wikibase.php: T182293 [cirrus] tune wikidata similarity configuration 1/2 (duration: 01m 12s)
  • 14:03 akosiaris: gnt-node remove ganeti1006 T181121
  • 14:01 elukey: restart Yarn nodemanagers on analytics102[8,9] to apply new settings - T182276
  • 13:21 moritzm: uploaded prometheus-pdns-rec-exporter 0.2-1 to apt.wikimedia.org
  • 13:10 marostegui: Deploy schema change on s4 - dbstore1002 - T174569
  • 12:53 marostegui: Deploy alter table on s4 db1064 (sanitarium master) with replication, this will generate lag on labs replicas - T174569
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 - T174569 (duration: 01m 36s)
  • 12:19 moritzm: uploaded prometheus-ircd-exporter and prometheus-pdns-rec-exporter to apt.wikimedia.org
  • 11:59 elukey: forced remount of /mnt/hdfs after OOM event on stat1005
  • 11:41 akosiaris: empty ganeti1006 for reimage after ltpstress and bios upgrades T181121
  • 11:16 mobrovac@tin: Finished deploy [graphoid/deploy@7979a40]: Update to service-template-node v0.5.4 (duration: 03m 56s)
  • 11:12 mobrovac@tin: Started deploy [graphoid/deploy@7979a40]: Update to service-template-node v0.5.4
  • 10:19 mobrovac@tin: Finished deploy [recommendation-api/deploy@ac66089]: Update to service-template-node v0.5.4 (duration: 02m 20s)
  • 10:17 mobrovac@tin: Started deploy [recommendation-api/deploy@ac66089]: Update to service-template-node v0.5.4
  • 09:41 godog: upload prometheus-elasticsearch-exporter to jessie-wikimedia - T181627
  • 08:38 jynus: migrate away from ganeti1008
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 - T174569 (duration: 01m 11s)
  • 06:25 marostegui: Deploy schema change on db1103:3314 - T174569
  • 06:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 - T174569 (duration: 01m 08s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.11) (duration: 07m 02s)
  • 00:49 ebernhardson@tin: Synchronized php-1.31.0-wmf.11/extensions/WikimediaEvents/: SWAT: T182616: turn on second mlr ab test for hewiki (duration: 01m 08s)
  • 00:44 ebernhardson@tin: Synchronized php-1.31.0-wmf.12/extensions/WikimediaEvents/: SWAT: T182616: turn on second mlr ab test for hewiki (duration: 01m 08s)
  • 00:40 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Turn on cirrus MLR for most wikis with >1% of search traffic (duration: 01m 08s)
  • 00:35 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5832a8c]: Update mobileapps to bfc3588 (duration: 04m 48s)
  • 00:32 brion: restarted requeueTranscodes.php on terbium for mp3 audio generation backfill (had dropped DB connection)
  • 00:31 mholloway-shell@tin: Started deploy [mobileapps/deploy@5832a8c]: Update mobileapps to bfc3588
  • 00:05 ebernhardson@tin: Synchronized wmf-config/: SWAT: T182616: Setup MLR AB test for hewiki (duration: 01m 10s)

2017-12-12

  • 22:30 mholloway-shell@tin: Finished deploy [mobileapps/deploy@ea8f05d]: Update mobileapps to 94f267b (duration: 05m 10s)
  • 22:24 mholloway-shell@tin: Started deploy [mobileapps/deploy@ea8f05d]: Update mobileapps to 94f267b
  • 22:02 urandom: setting compaction throughput to 5 MB/s, restbase1010
  • 21:31 mutante: releases1001 - rm mediawiki core repo and let puppet try to recreate it (follow-up issue after gerrit:397891)
  • 21:28 mobrovac@tin: Finished deploy [restbase/deploy@dceab2e]: Switch to Parsoid content v1.6.0 and switch to Cassandra 3 storage - T179417 (duration: 04m 16s)
  • 21:27 demon@tin: Synchronized php-1.31.0-wmf.12/extensions/GlobalBlocking/: (no justification provided) (duration: 01m 07s)
  • 21:26 mholloway-shell@tin: Finished deploy [mobileapps/deploy@28bfda3]: Update mobileapps to d0ee651 (duration: 01m 45s)
  • 21:24 demon@tin: Synchronized php-1.31.0-wmf.11/extensions/GlobalBlocking/: (no justification provided) (duration: 01m 08s)
  • 21:24 mholloway-shell@tin: Started deploy [mobileapps/deploy@28bfda3]: Update mobileapps to d0ee651
  • 21:24 mholloway-shell@tin: Finished deploy [mobileapps/deploy@b2d5b8e]: Update mobileapps to 172abc7 (duration: 00m 30s)
  • 21:24 mholloway-shell@tin: Started deploy [mobileapps/deploy@b2d5b8e]: Update mobileapps to 172abc7
  • 21:23 mobrovac@tin: Started deploy [restbase/deploy@dceab2e]: Switch to Parsoid content v1.6.0 and switch to Cassandra 3 storage - T179417
  • 21:21 ppchelko@tin: Finished deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries, take 4 (duration: 02m 45s)
  • 21:18 ppchelko@tin: Started deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries, take 4
  • 21:15 ppchelko@tin: Finished deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries, take 3 (duration: 02m 04s)
  • 21:13 ppchelko@tin: Started deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries, take 3
  • 21:13 ppchelko@tin: Finished deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries, take 2 after failed content rerender (duration: 05m 05s)
  • 21:12 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Electron hanging - T174916
  • 21:08 ppchelko@tin: Started deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries, take 2 after failed content rerender
  • 21:08 ppchelko@tin: Finished deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries (duration: 05m 51s)
  • 21:02 ppchelko@tin: Started deploy [restbase/deploy@dceab2e]: Bump expected parsoid version, but do not switch summaries
  • 20:55 ppchelko@tin: Finished deploy [restbase/deploy@d3ca789]: Revert deployment for using MCS for summaries (duration: 01m 36s)
  • 20:53 ppchelko@tin: Started deploy [restbase/deploy@d3ca789]: Revert deployment for using MCS for summaries
  • 20:53 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.12
  • 20:51 ppchelko@tin: Finished deploy [restbase/deploy@506047c]: Update expected Parsoid version, switched summary to MCS (duration: 03m 30s)
  • 20:48 ppchelko@tin: Started deploy [restbase/deploy@506047c]: Update expected Parsoid version, switched summary to MCS
  • 20:44 mholloway-shell@tin: Finished deploy [mobileapps/deploy@b2d5b8e]: Update mobileapps to 172abc7 (duration: 04m 48s)
  • 20:39 mholloway-shell@tin: Started deploy [mobileapps/deploy@b2d5b8e]: Update mobileapps to 172abc7
  • 20:35 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0a9d635]: Update mobileapps to 035608d (duration: 02m 32s)
  • 20:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@0a9d635]: Update mobileapps to 035608d
  • 20:31 mholloway-shell@tin: Finished deploy [mobileapps/deploy@2690678]: Update mobileapps to 5b8796d (duration: 40m 04s)
  • 19:57 arlolra: Updated Parsoid to 741fc5d (T114072, T181226, T21910, T152540, T103714, T97093, T118520, T181229, T182338, T182170, T169006)
  • 19:51 mholloway-shell@tin: Started deploy [mobileapps/deploy@2690678]: Update mobileapps to 5b8796d
  • 19:43 arlolra@tin: Finished deploy [parsoid/deploy@98139cb]: (no justification provided) (duration: 14m 57s)
  • 19:35 anomie: Running cleanupUsersWithNoId.php on all wikis (this will take a while), see T181731
  • 19:12 arlolra: Parsoid deploy aborted and rolled back to 01c1fc3 while RESTBase fixes an issue
  • 19:06 arlolra@tin: Finished deploy [parsoid/deploy@98139cb]: Updating Parsoid to 741fc5d (duration: 05m 33s)
  • 19:00 arlolra@tin: Started deploy [parsoid/deploy@98139cb]: Updating Parsoid to 741fc5d
  • 18:48 aaron@tin: Synchronized php-1.31.0-wmf.12/includes/Setup.php: 058c17e702eb0 (duration: 01m 09s)
  • 18:21 moritzm: uploaded prometheus-ircd-exporter to apt.wikimedia.org
  • 17:52 demon@tin: Finished scap: bootstrap wmf.12 (duration: 29m 35s)
  • 17:22 demon@tin: Started scap: bootstrap wmf.12
  • 17:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.7 (duration: 08m 45s)
  • 17:12 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: T181661 (duration: 00m 03s)
  • 17:12 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: T181661
  • 17:11 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: T181661 (duration: 00m 36s)
  • 17:11 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: T181661
  • 17:03 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: T181661
  • 16:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1084 and db1081 original weight (duration: 00m 56s)
  • 16:40 jynus: restart and upgrade db1059 (phabricator passive db)
  • 16:35 gehel@tin: Finished deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2001 (duration: 00m 19s)
  • 16:35 gehel@tin: Started deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2001
  • 16:34 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging on maps-test2001 (duration: 00m 20s)
  • 16:34 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging on maps-test2001
  • 16:30 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging on maps-test2002 (duration: 00m 20s)
  • 16:30 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging on maps-test2002
  • 16:28 gehel@tin: Finished deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2002 (duration: 00m 18s)
  • 16:28 gehel@tin: Started deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2002
  • 16:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1084 (duration: 00m 53s)
  • 16:24 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2072 (duration: 00m 55s)
  • 16:22 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: T181661
  • 16:21 akosiaris: failover boron to ganeti1008
  • 16:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1084 (duration: 00m 56s)
  • 15:59 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: T181661 (duration: 00m 04s)
  • 15:59 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: T181661
  • 15:59 akosiaris@tin: Finished deploy [ores/deploy@b4f2b02]: T181661 (duration: 00m 09s)
  • 15:58 akosiaris@tin: Started deploy [ores/deploy@b4f2b02]: T181661
  • 15:46 jynus: stop, upgrade and reboot db2072
  • 15:34 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2072 (duration: 00m 56s)
  • 15:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 - T174569 (duration: 01m 01s)
  • 15:31 marostegui: Deploy schema change on s4 db1097:3314 - T174569
  • 15:25 gehel: powercycling maps-test2003
  • 15:24 elukey: rename notebook1002 -> kafka1023 - step 3, replace notebook1002 with kafka1023 in the puppet config
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 with low weight (duration: 01m 18s)
  • 15:06 marostegui: Upgrade MySQL on db1084
  • 15:02 elukey: clear recdns records related to notebook1002/kafka1023 (rec_control wipe-cache kafka1023.eqiad.wmnet kafka1023.mgmt.eqiad.wmnet notebook1002.eqiad.wmnet 14.5.64.10.in-addr.arpa 104.3.65.10.in-addr.arpa) - T181518
  • 14:58 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1260.eqiad.wmnet
  • 14:46 elukey: start rename notebook1002 -> kafka1023 - step 2, dns config (host already shutdown) - T181518
  • 14:39 hashar: Rebuild operations-puppet-tests-docker image based on c76d892 | T178620 and /cache being owned by root
  • 14:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2071 (duration: 00m 56s)
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1086 (duration: 00m 56s)
  • 14:20 zeljkof: EU SWAT finished
  • 14:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS aliases for zh_wikiquote (T181374) (duration: 00m 56s)
  • 14:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Lift account registration on en.wiki (T182665) (duration: 00m 56s)
  • 14:02 herron: upgrading codfw puppet agents
  • 13:59 jynus: stop, upgrade and reboot db2071
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1086 (duration: 00m 56s)
  • 13:56 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 00m 57s)
  • 13:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1086 (duration: 00m 57s)
  • 12:44 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 12:10 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: scb1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 12:10 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: scb1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 12:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 (duration: 00m 55s)
  • 11:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 with low weight - T163190 (duration: 00m 55s)
  • 11:27 jynus: stop, upgrade and reboot db1092
  • 11:26 marostegui: Upgrade MySQL and kernel on db1086
  • 11:26 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 (duration: 00m 56s)
  • 11:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 57s)
  • 11:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 - T174569 (duration: 00m 57s)
  • 11:11 marostegui: Deploy schema change on db1084 (s4) - T174569
  • 11:05 marostegui: Deploy schema change on db1056 (s4) already depooled - T174569
  • 11:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 - T174569 (duration: 00m 56s)
  • 10:46 gehel@tin: Finished deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2003 (duration: 00m 10s)
  • 10:46 gehel@tin: Started deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2003
  • 10:26 jynus: stop, upgrade and reboot db2066
  • 10:24 gehel@tin: Finished deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2004 (duration: 00m 18s)
  • 10:24 gehel@tin: Started deploy [kartotherian/deploy@6e223df]: new kartotherian packaging on maps-test2004
  • 10:23 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 56s)
  • 10:14 moritzm: reimaging mw1260 (video scaler) to stretch
  • 10:04 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2059 (duration: 00m 56s)
  • 09:51 moritzm: updated stretch installer netboot image after stretch 9.3 point release
  • 09:51 moritzm: updated stretch installer netboot image after jessie 8.10 point release
  • 09:44 marostegui: Stop db1039 and db1086 in sync - T163190
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T163190 (duration: 00m 56s)
  • 09:38 gehel: reduce replication factor for cassandra on maps-test cluster and reset cassandra on maps-test2001 to work around limited disk space - T182583
  • 09:34 marostegui: Stop replication in sync on db1034 and db1039 for data consistency check - T163190
  • 09:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 after compressing InnoDB there - T178359 (duration: 00m 56s)
  • 09:00 jynus: stop, upgrade and reboot db2059
  • 08:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2059 (duration: 00m 56s)
  • 08:47 moritzm: updated jessie installer netboot image after jessie 8.10 point release
  • 07:15 marostegui: Deploy schema change on db1081 - https://phabricator.wikimedia.org/T174569
  • 07:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T174569 (duration: 00m 56s)
  • 06:56 marostegui: stop MySQL on db1055 - T182653
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T178359 T182653 (duration: 00m 56s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.11) (duration: 06m 21s)
  • 00:53 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Remove manual firejailing of Score binaries (T181535) (duration: 00m 56s)
  • 00:47 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Have ExtensionDistributor treat REL1_30 as stable (duration: 00m 56s)
  • 00:39 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Re-enable ORES on fawiki (T182354) (duration: 00m 56s)
  • 00:25 ejegg: updated CiviCRM from 2db9c76 to 85a8526
  • 00:23 catrope@tin: Synchronized wmf-config/CommonSettings.php: Fix default for rcOresDamagingPref (duration: 00m 56s)
  • 00:22 ejegg: disabled nightly Ingenico WX audit parsing job

2017-12-11

  • 23:37 smalyshev@tin: Finished deploy [wdqs/wdqs@f6b110f]: updater fix and GUI update (duration: 06m 10s)
  • 23:31 smalyshev@tin: Started deploy [wdqs/wdqs@f6b110f]: updater fix and GUI update
  • 23:28 mholloway-shell@tin: Finished deploy [mobileapps/deploy@07293bc]: Update mobileapps to e290b17 (duration: 06m 16s)
  • 23:22 mholloway-shell@tin: Started deploy [mobileapps/deploy@07293bc]: Update mobileapps to e290b17
  • 21:11 mholloway-shell@tin: Finished deploy [mobileapps/deploy@6347d62]: Update mobileapps to 61ca333 (duration: 07m 56s)
  • 21:04 mholloway-shell@tin: Started deploy [mobileapps/deploy@6347d62]: Update mobileapps to 61ca333
  • 19:47 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T181493: Enable Page Previews EventLogging instrumentation (duration: 00m 56s)
  • 19:47 mobrovac@tin: Finished deploy [restbase/deploy@be7d72f]: Expose the Reading Lists end points, take #3 - T181107 (duration: 06m 19s)
  • 19:46 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b1beaf1]: Revert dedupe based on sha1 as well as on event ID (duration: 00m 29s)
  • 19:46 ppchelko@tin: Started deploy [cpjobqueue/deploy@b1beaf1]: Revert dedupe based on sha1 as well as on event ID
  • 19:43 urandom: lower compaction throughput to 2 MB/s, restbase1010-{a,b,c} - T178177
  • 19:40 mobrovac@tin: Started deploy [restbase/deploy@be7d72f]: Expose the Reading Lists end points, take #3 - T181107
  • 19:34 mobrovac@tin: Finished deploy [restbase/deploy@bce2885]: Expose the Reading Lists end points, take #2 - T181107 (duration: 05m 33s)
  • 19:28 mobrovac@tin: Started deploy [restbase/deploy@bce2885]: Expose the Reading Lists end points, take #2 - T181107
  • 19:28 mobrovac@tin: Finished deploy [restbase/deploy@bce2885]: Expose the Reading Lists end points - T181107 (duration: 01m 26s)
  • 19:26 mobrovac@tin: Started deploy [restbase/deploy@bce2885]: Expose the Reading Lists end points - T181107
  • 19:24 ebernhardson@tin: Synchronized wmf-config/throttle.php: SWAT: T182613 Update throttle rule for McGill University Library (duration: 00m 56s)
  • 19:21 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T131132: Switch submit button from save to publish on enwiki (duration: 02m 43s)
  • 19:19 awight@tin: Finished deploy [ores/deploy@1c0ede0]: (non-production) Testing parallel ORES deployment, T181661 (duration: 01m 12s)
  • 19:17 awight@tin: Started deploy [ores/deploy@1c0ede0]: (non-production) Testing parallel ORES deployment, T181661
  • 19:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.11/extensions/CirrusSearch/includes/Search/RescoreBuilders.php: SWAT: Add query string for running Cirrus MLR pre-deploy checks (duration: 00m 57s)
  • 18:56 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch cebwiki, ruwiki and small projects to Kafka for htmlCacheUpdate - T182023 (duration: 00m 57s)
  • 18:55 ppchelko@tin: Finished deploy [cpjobqueue/deploy@e1075af]: Enable htmlCacheUpdate for ceb and ru wiki and small projects T182023 (duration: 00m 34s)
  • 18:54 ppchelko@tin: Started deploy [cpjobqueue/deploy@e1075af]: Enable htmlCacheUpdate for ceb and ru wiki and small projects T182023
  • 18:03 gehel@tin: Finished deploy [wdqs/wdqs@353b3cb]: wdqs: GUI and updater updates (duration: 01m 14s)
  • 18:02 gehel@tin: Started deploy [wdqs/wdqs@353b3cb]: wdqs: GUI and updater updates
  • 16:42 hashar: Restarting Jenkins
  • 15:36 marostegui: Deploy schema change on s2 master (db1054) - T174569
  • 14:51 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Disable the wgTranslateNumerals at hiwikiversity T182584 (duration: 00m 56s)
  • 14:46 niharika29@tin: Synchronized php-1.31.0-wmf.11/includes/specials/SpecialUndelete.php: Revert replacing textarea in Special:Undelete with OOUI T182398 (duration: 00m 57s)
  • 14:34 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Bureaucrats to grant and remove translationadmin rights; sysops to add and remove the same from themselves - T182492 (duration: 00m 56s)
  • 14:32 niharika29@tin: Synchronized wmf-config/interwiki.php: Update the Interwiki map - T182506 (duration: 00m 56s)
  • 14:25 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Create NS_PROJECT and NS_PROJECT_TALK alias for kowikisource (T182487) (duration: 00m 56s)
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 - T174569 (duration: 04m 42s)
  • 12:39 _joe_: trying to get a full core dump from hhvm on mw1200
  • 11:12 _joe_: depooling mw1200 for investigation instead
  • 11:11 _joe_: restarting hhvm on mw1200, stuck in a kernel task
  • 11:08 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 45s)
  • 11:07 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 44s)
  • 10:49 ema: cp4021: restart varnish-be due to mbox lag
  • 10:04 godog: upgrade grafana to 4.6.2 on labmon1001 - T182294
  • 10:00 jynus: stopping dbstore2001:s5 and dbstore1002 (s5) mysql replication in sync
  • 09:28 akosiaris: upload scap_3.7.4-1 to apt.wikimedia.org/jessie-wikimedia/main
  • 09:16 gehel: cleaning old cassandra dumps on maps-test2001 servers
  • 09:15 gehel: cleaning up old postgres logs on maps-test2001
  • 09:05 elukey: set notebook1002 as role::spare as prep step to reimage it to kafka1023
  • 09:03 jynus: dropping multiple leftover files from db1102
  • 08:52 marostegui: Stop replication in sync on db1034 and db1039 - T163190
  • 08:12 elukey: powercycle ganeti1008 - all vms stuck, console com2 showed a ton of printks without a clear indicator of the root cause
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 - T182556 (duration: 00m 45s)
  • 07:44 _joe_: restarting hhvm on mw1189,mw1229,mw1235,mw1282,mw1285,mw1315,mw1316, all stuck with a kernel hang
  • 06:59 _joe_: restarted hhvm, nginx on mw1280, hanging kernel operations
  • 06:45 marostegui: Deploy schema change on s2 db1060 with replication enabled, this will generate some lag on s2 on labs - T174569
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 - T174569 (duration: 00m 44s)
  • 06:22 marostegui: Compress s6 on db1096 - T178359
  • 06:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 to compress InnoDB there - T178359 (duration: 00m 45s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.11) (duration: 09m 21s)

2017-12-10

  • 20:33 elukey: execute restart-hhvm on mw1312 - hhvm stuck multiple times queueing requests
  • 20:01 elukey: ran kafka preferred-replica-election for the kafka analytics cluster (1012->1022) to re-add kafka1012 to the kafka brokers acting as partition leaders (will spread the load in a better way)

2017-12-09

  • 17:00 apergos: restarted hhvm on mw1276, the same old hang with the same old symptoms
  • 16:10 awight@tin: Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (take 4!) (duration: 03m 01s)
  • 16:07 awight@tin: Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (take 4!)
  • 16:02 awight@tin: Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 05m 58s)
  • 15:56 awight@tin: Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity
  • 15:55 awight@tin: Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 00m 17s)
  • 15:55 awight@tin: Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity
  • 15:53 awight@tin: Finished deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity (duration: 00m 31s)
  • 15:53 awight@tin: Started deploy [ores/deploy@1c0ede0]: Reducing ORES Celery log verbosity
  • 15:53 apergos: did same on scb1002,3,4
  • 15:48 awight: Making an emergency deployment to ORES logging config to reduce verbosity.
  • 15:45 apergos: on scb1001 moved daemon.log out of the way, did "service rsyslog rotate", saved the last 5000 entries for use by ores team, removed the log
  • 11:44 apergos: that server list: mw1278, 1277, 1226, 1234, 1230
  • 11:42 apergos: restarted hhvm on api servers after lockup
  • 11:19 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Disable ORES in fawiki - T182354 (duration: 00m 45s)
  • 00:11 Jamesofur: removed 2FA from EVinente after verification T182373

2017-12-08

  • 23:23 hashar: force ran puppet on contint2001
  • 22:15 madhuvishy: Kicked off rsync of /data/xmldatadumps/public to labstore1006 & 7
  • 22:05 smalyshev@tin: Finished deploy [wdqs/wdqs@353b3cb]: temporary fix for T182464, better fix coming soon (duration: 05m 55s)
  • 21:59 smalyshev@tin: Started deploy [wdqs/wdqs@353b3cb]: temporary fix for T182464, better fix coming soon
  • 20:22 aaron@tin: Synchronized php-1.31.0-wmf.11/includes/Setup.php: a319c3e7ab61 - disable cpPosTime injection (duration: 00m 45s)
  • 18:00 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Disable GlobalBlocking on fishbowl wikis (duration: 00m 45s)
  • 16:23 urandom: starting cassandra, restbase1010 - T178177
  • 16:22 urandom: disabling smart path, restbase1010, arrays 'b'...'e' - T178177
  • 16:20 urandom: disabling smart path, restbase1010, array 'a' (canary) - T178177
  • 16:15 urandom: shutting down cassandra, restbase1010 - T178177
  • 15:35 marostegui: Fix dbstore1002 s5 replication
  • 15:28 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 (duration: 00m 03s)
  • 15:28 gehel@tin: Started deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003
  • 15:08 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 (duration: 02m 08s)
  • 15:06 gehel@tin: Started deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003
  • 15:05 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003 (duration: 00m 42s)
  • 15:05 gehel@tin: Started deploy [tilerator/deploy@29d633e]: testing new tilerator packaging on maps-test2003
  • 14:39 gehel@tin: Finished deploy [tilerator/deploy@e52ea1d]: testing new tilerator packaging on maps-test2003 (duration: 02m 34s)
  • 14:36 gehel@tin: Started deploy [tilerator/deploy@e52ea1d]: testing new tilerator packaging on maps-test2003
  • 11:45 elukey: updated prometheus-druid-exporter on druid* to 0.6
  • 11:39 elukey: upload prometheus-druid-exporter 0.6 to stretch/jessie wikimedia
  • 06:52 marostegui: Fix broken replication on labsdb1004
  • 06:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1099:3311 - T178359 (duration: 00m 55s)
  • 04:56 bd808: scholarships: updated db schema for 2018 cycle (T181072)
  • 00:04 bblack: cp4026 - restart varnish backend, mailbox lag

2017-12-07

  • 23:59 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T181107 enable ReadingLists on all wikis (duration: 00m 46s)
  • 23:54 tgr: ran mwscript extensions/ReadingLists/maintenance/populateProjectsFromSiteMatrix.php --wiki=testwiki
  • 23:47 tgr@tin: Finished scap: T181107 deploy ReadingLists to testwiki (duration: 24m 44s)
  • 23:22 tgr@tin: Started scap: T181107 deploy ReadingLists to testwiki
  • 23:04 tgr: ran mwscript ../../../home/tgr/sql.php --wiki=mediawikiwiki --cluster extension1 --wikidb wikishared /srv/mediawiki-staging/php-1.31.0-wmf.11/extensions/ReadingLists/sql/readinglists.sql
  • 22:33 tgr@tin: Synchronized php-1.31.0-wmf.11/extensions/ReadingLists/: ReadingLists/wmf.11: catching up with master (duration: 00m 45s)
  • 22:29 tgr@tin: Synchronized php-1.31.0-wmf.10/extensions/ReadingLists/: ReadingLists/wmf.10: catching up with master (duration: 00m 46s)
  • 20:39 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.11
  • 20:35 elukey: restart hhvm on mw1235 - hhvm-dump-debug hanging, no stacktrace available
  • 20:31 elukey: restart hhvm on mw1281 - hhvm stuck (hhvm-dump-debug timing out)
  • 20:22 joal@tin: Finished deploy [analytics/refinery@53bd630]: Regular analytics deploy - Long time no see, deployment :) - post-patch-2 (hopefully last for tonight) (duration: 05m 21s)
  • 20:18 herron: re-pooling eqiad puppet 4 masters via dns puppet.eqiad.wmnet puppet.wikimedia.org
  • 20:17 joal@tin: Started deploy [analytics/refinery@53bd630]: Regular analytics deploy - Long time no see, deployment :) - post-patch-2 (hopefully last for tonight)
  • 20:14 joal@tin: Finished deploy [analytics/refinery@3e52903]: Regular analytics deploy - Long time no see, deployment :) - post-patch (duration: 01m 20s)
  • 20:12 joal@tin: Started deploy [analytics/refinery@3e52903]: Regular analytics deploy - Long time no see, deployment :) - post-patch
  • 19:52 joal@tin: Finished deploy [analytics/refinery@bd9c6cc]: Regular analytics deploy - Long time no see, deployment :) (duration: 12m 15s)
  • 19:40 joal@tin: Started deploy [analytics/refinery@bd9c6cc]: Regular analytics deploy - Long time no see, deployment :)
  • 19:35 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Start description usage tracking for commonswiki T106287 (duration: 00m 48s)
  • 19:29 thcipriani@tin: Synchronized php-1.31.0-wmf.11/includes/specialpage/ChangesListSpecialPage.php: SWAT: WLFilters: Correctly check if RCFilters should be enabled on WL T182318 (duration: 00m 48s)
  • 19:24 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Remove obsolete WikibaseQualityConstraints settings (duration: 00m 48s)
  • 19:14 ppchelko@tin: Finished deploy [changeprop/deploy@3c4f51d]: Long awaited deploy: generic optimizations, gc metric, delay reporting (duration: 01m 15s)
  • 19:13 ppchelko@tin: Started deploy [changeprop/deploy@3c4f51d]: Long awaited deploy: generic optimizations, gc metric, delay reporting
  • 19:12 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Rm all past throttle overrides in throttle.php (duration: 00m 48s)
  • 18:06 mholloway-shell@tin: Finished deploy [mobileapps/deploy@2fa32ed]: Update mobileapps to 71f581c (duration: 05m 36s)
  • 18:00 mholloway-shell@tin: Started deploy [mobileapps/deploy@2fa32ed]: Update mobileapps to 71f581c
  • 18:00 herron: re-pooling eqiad puppet 4 masters as puppet.ulsfo.wmnet
  • 17:33 demon@tin: Synchronized php-1.31.0-wmf.11/extensions/VisualEditor/lib/ve: Ief480487, Deskana made me do it (duration: 00m 49s)
  • 17:25 milimetric@tin: Finished deploy [analytics/aqs/deploy@4ec13b4]: (no justification provided) (duration: 07m 28s)
  • 17:25 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1314.eqiad.wmnet
  • 17:25 herron: puppetmaster1001 upgraded to puppet 4. re-enabling puppet agents across the fleet
  • 17:18 milimetric@tin: Started deploy [analytics/aqs/deploy@4ec13b4]: (no justification provided)
  • 17:13 herron: upgrading puppetmaster1001 to puppet 4
  • 17:08 herron: temporarily disabling all puppet agents during puppetmaster1001 (puppet ca) upgrade to puppet 4
  • 16:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight - T178359 (duration: 00m 48s)
  • 16:13 herron: upgrading puppetmaster1002 to puppet 4
  • 16:03 moritzm: uploaded openssl 1.0.2n for jessie-wikimedia to apt.wikimedia.org
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 back as main traffic in s6 - T178359 (duration: 00m 48s)
  • 15:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight - T178359 (duration: 00m 48s)
  • 15:42 elukey: hhvm-dump-debug for mw1314 saved to /tmp/hhvm.17991.bt.
  • 15:30 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1314.eqiad.wmnet
  • 15:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight - T178359 (duration: 00m 48s)
  • 15:19 otto@tin: Finished deploy [analytics/superset/deploy@f0f5adf]: initial deployment (duration: 00m 02s)
  • 15:18 otto@tin: Started deploy [analytics/superset/deploy@f0f5adf]: initial deployment
  • 14:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1099:3311 with low weight - T178359 (duration: 00m 47s)
  • 14:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1074 (duration: 00m 47s)
  • 14:09 zeljkof: EU SWAT finished
  • 14:08 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: alswiki: Set wgRestrictDisplayTitle = false (T182154) (duration: 00m 49s)
  • 13:39 moritzm: reimaging mw2246 to stretch
  • 12:54 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2152.codfw.wmnet
  • 12:32 addshore@tin: Synchronized wmf-config/Wikibase.php: Move Wikibase dispatchingLockManager to InitialiseSettings PT 2/2 (duration: 00m 48s)
  • 12:31 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Move Wikibase dispatchingLockManager to InitialiseSettings PT 1/2 (duration: 00m 48s)
  • 11:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase API traffic for db1074 (duration: 00m 48s)
  • 11:41 moritzm: reimaging mw2152 to stretch
  • 11:35 marostegui: Compress s8 on db1099 - T178359
  • 11:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase API traffic for db1074 (duration: 00m 48s)
  • 11:11 mobrovac@tin: Finished deploy [restbase/deploy@097ba7d]: Add CORS headers to erroneous responses as well - T182103 (duration: 05m 24s)
  • 11:05 mobrovac@tin: Started deploy [restbase/deploy@097ba7d]: Add CORS headers to erroneous responses as well - T182103
  • 10:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 with low weight (duration: 00m 48s)
  • 10:50 elukey: powercycle analytics1003 - no serial console, ssh stuck in System is booting up. See pam_nologin(8)
  • 10:28 _joe_: depooling mw1283 for further investigation
  • 10:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1098:3316 db1098:3317 - T178359 (duration: 00m 51s)
  • 10:12 elukey: reboot analytics1003 for kernel+jvm updates - T179943
  • 10:01 gehel: upgrade of ELK stack on logstash100* completed - Kibana was unavailable for longer than expected - T178412
  • 09:48 hashar: CI: removed Wikidata from configuration, replaced by Wikibase. wmf/* and REL branches are going to be broken though | https://gerrit.wikimedia.org/r/395704 | T181838
  • 09:46 akosiaris: silence ganeti1006 on icinga T181121
  • 09:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098:3316 db1098:3317 - T178359 (duration: 00m 52s)
  • 09:28 marostegui: Upgrade MySQL and kernel on db1074
  • 09:22 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=logstash1009.eqiad.wmnet
  • 09:19 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=logstash1009.eqiad.wmnet
  • 09:18 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=logstash1008.eqiad.wmnet
  • 09:14 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=logstash1008.eqiad.wmnet
  • 09:13 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=logstash1007.eqiad.wmnet
  • 09:08 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=logstash1007.eqiad.wmnet
  • 09:05 gehel@tin: Finished deploy [logstash/plugins@b13d2fa]: (no justification provided) (duration: 00m 02s)
  • 09:05 gehel@tin: Started deploy [logstash/plugins@b13d2fa]: (no justification provided)
  • 09:00 gehel: upgrading ELK stack on logstash100* - some log messages might be lost during the upgrade - T178412
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098:3316 db1098:3317 - T178359 (duration: 00m 48s)
  • 08:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098:3316 db1098:3317 - T178359 (duration: 00m 48s)
  • 08:28 elukey: install prometheus-druid-exporter 0.5 on druid*
  • 08:26 elukey: upload prometheus-druid-exporter 0.5-1 to jessie/stretch-wikimedia
  • 07:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1098:3316 db1098:3317 - T178359 (duration: 00m 47s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly pool db1098:3316 db1098:3317 - T178359 (duration: 00m 48s)
  • 06:45 marostegui: Stop replication on db1099:3311 to reimport: change_tag, tag_summary, user and watchlist tables and recompress again - T178359
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 - T174569 (duration: 00m 48s)
  • 06:40 marostegui: Deploy schema change on db1074 (s2) - T174569
  • 03:00 catrope@tin: Synchronized php-1.31.0-wmf.11/resources/src/mediawiki.rcfilters/: T182268 (duration: 00m 57s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.10) (duration: 08m 57s)

2017-12-06

  • 23:45 eileen: update CiviCRM from 85263f6 to 2db9c76 (improve org merges)
  • 23:12 ppchelko@tin: Started restart [changeprop/deploy@065a06e]: (no justification provided)
  • 23:04 eileen: update process-control to b1cd515 (reenable bounce processing, enable dedupe on orgs)
  • 22:58 ppchelko@tin: Finished deploy [cpjobqueue/deploy@761524f]: Separate delay and totaldelay metrics T182216 (duration: 00m 31s)
  • 22:57 ppchelko@tin: Started deploy [cpjobqueue/deploy@761524f]: Separate delay and totaldelay metrics T182216
  • 22:55 mutante: stat1003 - re-enabled puppet after putting role::spare on it (T175150)
  • 22:48 mutante: stat1003 - this host was kind of invisible (not in site, not in icinga) but still up, re-enabling puppet after re-adding it to site
  • 22:03 demon@tin: Synchronized php-1.31.0-wmf.11/extensions/WikidataPageBanner/includes/WikidataPageBanner.hooks.php: unbreak (duration: 00m 48s)
  • 22:00 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.11, again
  • 21:51 hoo@tin: Synchronized php-1.31.0-wmf.10/extensions/Wikibase/lib/includes/Changes/EntityChange.php: Make EntityChange truly forward compatible with compact diffs (T182243) (duration: 00m 48s)
  • 21:43 arlolra: Updated Parsoid to 01c1fc3 (T178253, T61840, T180930, T179259)
  • 21:34 arlolra@tin: Finished deploy [parsoid/deploy@dfcc622]: Updating Parsoid to 01c1fc3 (duration: 09m 45s)
  • 21:24 arlolra@tin: Started deploy [parsoid/deploy@dfcc622]: Updating Parsoid to 01c1fc3
  • 20:45 marostegui: Add 400G to labsdb1003 /srv partition
  • 20:15 hoo: Ran "scap pull" on snapshot1001 after T177486 related tests
  • 20:10 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: partial rollback -- wikidata errors
  • 20:07 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.11
  • 20:05 demon@tin: Synchronized php: symlink bump (duration: 00m 47s)
  • 19:32 hoo@tin: Synchronized php-1.31.0-wmf.11/extensions/Wikibase/repo/maintenance/dispatchChanges.php: dispatch: track how long client selecting takes (duration: 00m 48s)
  • 19:24 ppchelko@tin: Finished deploy [cpjobqueue/deploy@8c66189]: Deduplicate based on event sha1 as well as on id, take 2 (duration: 00m 38s)
  • 19:24 ppchelko@tin: Started deploy [cpjobqueue/deploy@8c66189]: Deduplicate based on event sha1 as well as on id, take 2
  • 19:23 hoo@tin: Synchronized php-1.31.0-wmf.10/extensions/Wikibase/repo/maintenance/dispatchChanges.php: dispatch: track how long client selecting takes (duration: 00m 48s)
  • 19:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@8c66189]: Deduplicate based on event sha1 as well as on id, take 2
  • 19:22 ppchelko@tin: Finished deploy [cpjobqueue/deploy@8c66189]: Deduplicate based on event sha1 as well as on id (duration: 00m 11s)
  • 19:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@8c66189]: Deduplicate based on event sha1 as well as on id
  • 19:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@df72b34]: Deduplicate based on event sha1 as well as on id (duration: 00m 09s)
  • 19:19 ppchelko@tin: Started deploy [cpjobqueue/deploy@df72b34]: Deduplicate based on event sha1 as well as on id
  • 18:44 volans: stopped irc echo on tegmen, re-enabled puppet and run it (it was disabled by a run-no-puppet sync_icinga_state)
  • 18:32 mobrovac@tin: Finished deploy [zotero/translators@3044b3a]: Update translators to 092c7bc - T178596 (duration: 00m 07s)
  • 18:32 mobrovac@tin: Started deploy [zotero/translators@3044b3a]: Update translators to 092c7bc - T178596
  • 18:16 herron: upgrading rhodium to puppet 4
  • 18:13 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch htmlCacheUpdate jobs for wiktionaries to EventBus, file 2/2 - T182023 (duration: 00m 48s)
  • 18:12 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch htmlCacheUpdate jobs for wiktionaries to EventBus, file 1/2 - T182023 (duration: 00m 48s)
  • 18:05 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add https://studiezaal.nijmegen.nl to $wgCopyUploadsDomains (T181713) (duration: 00m 49s)
  • 17:51 ppchelko@tin: Finished deploy [cpjobqueue/deploy@df72b34]: Switch htmlCacheUpdate for wiktionaries, attempt 2 T182023 (duration: 00m 32s)
  • 17:50 ppchelko@tin: Started deploy [cpjobqueue/deploy@df72b34]: Switch htmlCacheUpdate for wiktionaries, attempt 2 T182023
  • 17:44 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3281df1]: Switch htmlCacheUpdate for wiktionaries T182023 (duration: 02m 57s)
  • 17:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@3281df1]: Switch htmlCacheUpdate for wiktionaries T182023
  • 17:36 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1260.eqiad.wmnet
  • 17:05 hoo: Started Wikidata RDF dumps (sudo -b -u datasets bash -c 'dumpwikidatardf.sh all ttl; dumpwikidatardf.sh truthy nt') on snapshot1007
  • 16:33 anomie@terbium: Finished cleanupUsersWithNoId.php on mediawikiwiki
  • 16:29 anomie@terbium: Running cleanupUsersWithNoId.php for mediawikiwiki, see T181731
  • 16:28 anomie@terbium: Finished cleanupUsersWithNoId.php on testwikidatawiki
  • 16:27 anomie@terbium: Running cleanupUsersWithNoId.php for testwikidatawiki, see T181731
  • 16:26 anomie@terbium: Finished cleanupUsersWithNoId.php on test2wiki
  • 16:14 anomie@terbium: Running cleanupUsersWithNoId.php for test2wiki, see T181731
  • 16:11 anomie@terbium: Finished cleanupUsersWithNoId.php on testwiki
  • 16:03 anomie@terbium: Running cleanupUsersWithNoId.php for testwiki, see T181731
  • 15:13 akosiaris: upload kubernetes_1.7.10-1_amd64 on apt.wikimedia.org/stretch-wikimedia/main T181489
  • 14:59 addshore: swat done
  • 14:57 addshore@tin: Synchronized php-1.31.0-wmf.11/extensions/Wikibase/repo/includes/ChangeDispatcher.php: SWAT Tracking within ChangeDispatcher::getPendingChanges (duration: 00m 49s)
  • 14:53 addshore@tin: Synchronized php-1.31.0-wmf.10/extensions/Wikibase/repo/includes/ChangeDispatcher.php: SWAT Tracking within ChangeDispatcher::getPendingChanges (duration: 00m 49s)
  • 14:33 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add https://studiezaal.nijmegen.nl to $wgCopyUploadsDomains (T181713) (duration: 00m 47s)
  • 14:22 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Portal namespace for mwl.wikipedia (T180052) (duration: 00m 48s)
  • 14:13 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T180476 wmgUseNewWikiDiff2Extension true for dewiki (duration: 00m 48s)
  • 14:08 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T181498 T181329 Enable AdvancedSearch on fawiki and huwiki (duration: 00m 50s)
  • 13:40 moritzm: reimaging mw1260 to stretch
  • 10:55 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2118.codfw.wmnet
  • 10:55 moritzm: reimaging mw2119 to stretch
  • 10:45 mobrovac@tin: Finished deploy [restbase/deploy@b1d7c82]: Use Cass3 for revisions, deprecate trending-edits, fix CX end point - T179421 T180384 T173801 (duration: 06m 02s)
  • 10:39 mobrovac@tin: Started deploy [restbase/deploy@b1d7c82]: Use Cass3 for revisions, deprecate trending-edits, fix CX end point - T179421 T180384 T173801
  • 09:34 gehel: shutting down logstash / elasticsearch on logstash100[123] in preparation for decommission - T175830
  • 08:05 moritzm: reimaging mw2118 to stretch (now for real, yesterday's reimage logged to SAL was interrupted)
  • 07:33 bblack: cp4024 - backend restart, mailbox lag
  • 02:39 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.10) (duration: 08m 53s)
  • 01:01 eileen: update process-control to 55529ea - threshold for $500+ dedupe was too low
  • 00:36 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable description usage tracking for all wikis except commons T106287 (duration: 00m 48s)
  • 00:27 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHTML on wikis with zero high priority linter errors T182042 (duration: 00m 51s)

2017-12-05

  • 23:07 eileen: update civicrm from d2c70d2 to 85263f6 (Major gifts address - choose latest)
  • 22:29 eileen: update process_control to 3522186, re-enable jobs to fetch recipient data
  • 22:13 mutante: restbase1010 failed at reboot with P6431 , after a cold start (power off, power on) it came back though :) (T178177 T141756)
  • 22:00 mutante: restbase1010 - rebooting for firmware upgrade
  • 21:58 mutante: restbase1010 - upgraded HP firmware (Flashing Smart Array P440ar in Slot 0 [ 3.56 -> 6.06 ]) T141756 T178177
  • 21:56 urandom: draining cassandra instances, restbase1010 - T178177
  • 21:52 mutante: restbase1010 - upgrading firmware - Flashing Smart Array P440ar in Slot 0 [ 3.56 -> 6.06 ]
  • 21:31 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 shenanigans / wmf.11
  • 21:12 demon@tin: Finished scap: bootstrap wmf.11 (duration: 51m 01s)
  • 20:50 twentyafterfour: phabricator: running `sudo bin/garbage collect --collector search.ferret.ngram`
  • 20:29 twentyafterfour: phabricator: running `sudo bin/search ngrams --threshold 0.2`
  • 20:21 demon@tin: Started scap: bootstrap wmf.11
  • 19:50 mutante: forcing puppet run on ores eqiad to reduce number of celery workers used for stress test
  • 19:37 ppchelko@tin: Finished deploy [eventlogging/eventbus@6ca0372]: Make the kafka async deliver callback thread-safe. T180017 (duration: 01m 29s)
  • 19:36 ppchelko@tin: Started deploy [eventlogging/eventbus@6ca0372]: Make the kafka async deliver callback thread-safe. T180017
  • 19:20 gehel: moving eventlogging collection by logstash from logstash1003 to logstash1007, no messages **should** be lost - T175830
  • 18:24 ppchelko@tin: Finished deploy [eventlogging/eventbus@6ca0372]: Make the kafka async deliver callback thread-safe. Limited to kafka1001. T180017 (duration: 00m 14s)
  • 18:24 ppchelko@tin: Started deploy [eventlogging/eventbus@6ca0372]: Make the kafka async deliver callback thread-safe. Limited to kafka1001. T180017
  • 17:27 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2118.codfw.wmnet
  • 15:56 awight: beginning stress test on ores* (non-production)
  • 15:54 awight@tin: Finished deploy [ores/deploy@6baed71]: (non-production) Test ORES deployment to ores100[1-2] (duration: 01m 01s)
  • 15:53 awight@tin: Started deploy [ores/deploy@6baed71]: (non-production) Test ORES deployment to ores100[1-2]
  • 15:15 ottomata: restarting kafka-jumbo brokers, applying SSL (downtime scheduled)
  • 15:03 urandom: bootstrapping cassandra, restbase1014-c - T179422
  • 14:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1076 (duration: 00m 43s)
  • 14:41 zeljkof: EU SWAT finished
  • 14:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add category collation for sewiki" (T181503) (duration: 00m 44s)
  • 14:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHTML on itwiki and dewiki (T181188 T181190) (duration: 00m 43s)
  • 13:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1076 weight (duration: 00m 44s)
  • 13:14 moritzm: reimaging mw2118/mw2119 (video scalers) to stretch
  • 12:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1034 (duration: 00m 43s)
  • 12:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 with low weight (duration: 00m 44s)
  • 11:58 marostegui: Upgrade MariaDB and kernel on db1076
  • 11:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1034 - T178359! (duration: 00m 43s)
  • 10:53 moritzm: rebooting meitnerium/archiva.wikimedia.org for update to 4.9.51
  • 10:46 moritzm: rebooting einsteinium (icinga.wikimedia.org) for update to 4.9.51
  • 10:45 elukey: reboot druid1003 for kernel+jvm updates - T179943
  • 10:33 addshore@tin: Synchronized wmf-config/extension-list: extension-list extension.json entrypoint for ArticlePlaceholder (duration: 00m 43s)
  • 10:33 moritzm: rebooting tegmen for update to 4.9.51
  • 10:07 jynus: stopping dbstore1002 (s5) and dbstore2001 (s5) for maintenance
  • 10:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1034 - T178359! (duration: 00m 43s)
  • 10:00 ema: cp4021: restart varnish-be due to mbox lag/fetch failures
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 with low weight - T178359! (duration: 00m 43s)
  • 09:42 elukey: reboot analytics100[12] for kernel+jvm updates (Hadoop Master nodes) - T179943
  • 09:39 godog: bootstrap restbase1014-b - T179422
  • 09:20 marostegui: Optimize s7 on db1098 - T178359
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 - T174569 (duration: 00m 43s)
  • 09:01 marostegui: Deploy schema change on db1076 (s2) - T174569
  • 08:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 - T174569 (duration: 00m 44s)
  • 08:54 moritzm: enabling test production traffic for mw1259 (stretch-based video scaler)
  • 08:24 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1259.eqiad.wmnet
  • 08:08 moritzm: installing python updates on trusty
  • 07:27 kartik@tin: Finished deploy [cxserver/deploy@4b74f03]: Update cxserver to 1693bcf (duration: 03m 22s)
  • 07:23 kartik@tin: Started deploy [cxserver/deploy@4b74f03]: Update cxserver to 1693bcf
  • 06:58 marostegui: Stop MySQL on db1034 to clone db1098:3317 - T178359
  • 06:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 - T178359! (duration: 00m 43s)
  • 06:48 marostegui: Fix dbstore1002 replication
  • 06:20 marostegui: Deploy schema change on db1090 (s2) - T174569
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 - T174569 (duration: 00m 43s)
  • 06:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3312 - T174569 (duration: 00m 44s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.10) (duration: 06m 36s)
  • 00:29 mutante: bast4001 - removing ganglia aggregators, package, config...

2017-12-04

  • 23:44 mutante: bast3002 - killall -u ganglia to kill all aggregator procs, apt-get remove --purge ganglia-monitor, rm -rf /etc/ganglia, rm -rf /usr/lib/ganglia, apt-get autoremove
  • 23:28 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Revert wgCommentTableSchemaMigrationStage change, breaks too much stuff (duration: 00m 44s)
  • 22:08 brion: brion running requeueTranscodes.php on terbium to batch-run .mp3 output on Commons (T181749)
  • 21:27 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c8bea2e]: (no justification provided) (duration: 00m 42s)
  • 21:26 ppchelko@tin: Started deploy [cpjobqueue/deploy@c8bea2e]: (no justification provided)
  • 21:26 awight@tin: Finished deploy [ores/deploy@6baed71]: Update ORES to 6baed71 (duration: 12m 54s)
  • 21:25 eileen: re-enable dedupe job process control commit 461f482
  • 21:13 awight@tin: Started deploy [ores/deploy@6baed71]: Update ORES to 6baed71
  • 21:07 gehel@tin: Finished deploy [kartotherian/deploy@6e223df]: testing new kartotherian packaging on maps-test2003 (duration: 00m 20s)
  • 21:07 gehel@tin: Started deploy [kartotherian/deploy@6e223df]: testing new kartotherian packaging on maps-test2003
  • 20:07 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Revert wgCommentTableSchemaMigrationStage change, breaks too much stuff (duration: 00m 43s)
  • 19:31 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on test wikis (T166733) (duration: 00m 43s)
  • 19:18 gehel@tin: Finished deploy [kartotherian/deploy@e166d87]: testing new kartotherian packaging on maps-test2003 (duration: 00m 03s)
  • 19:18 gehel@tin: Started deploy [kartotherian/deploy@e166d87]: testing new kartotherian packaging on maps-test2003
  • 19:17 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Set $wgRestrictionMethod = 'firejail'; everywhere (T173370) (duration: 00m 43s)
  • 19:11 legoktm@tin: Synchronized static/images/mobile/copyright/: Add converted copyright svg images as png files - https://gerrit.wikimedia.org/r/#/c/394820/ (duration: 00m 43s)
  • 19:06 legoktm@tin: Synchronized wmf-config/InitialiseSettings-labs.php: beta: Add ORES filter thresholds for simplewiki (duration: 00m 43s)
  • 18:46 hoo: Started dumpwikidatajson.sh on snapshot1007 (T181385)
  • 18:40 hoo: Ran scap pull on snapshot1001 (T181385)
  • 18:35 hoo@tin: Synchronized php-1.31.0-wmf.10/includes/objectcache/ObjectCache.php: Only send statsd data for WAN cache in non-CLI mode (T181385) (duration: 00m 44s)
  • 18:28 godog: bootstrap restbase1014-a - T179422
  • 18:04 gehel@tin: Finished deploy [wdqs/wdqs@2873745]: wdqs GUI update (duration: 02m 10s)
  • 18:02 gehel@tin: Started deploy [wdqs/wdqs@2873745]: wdqs GUI update
  • 17:41 demon@tin: Synchronized php-1.31.0-wmf.10/extensions/CirrusSearch/maintenance/: fix some deprecated spam (duration: 00m 44s)
  • 17:40 ejegg: disabled CiviCRM bounce processing again
  • 17:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.6 (duration: 02m 29s)
  • 17:08 demon@tin: Pruned MediaWiki: 1.31.0-wmf.5 (duration: 02m 58s)
  • 16:54 demon@tin: Synchronized docroot/noc/conf/dblists: double checking symlink move (duration: 00m 44s)
  • 16:44 ejegg: re-enabled CiviCRM bounced mail processing
  • 16:41 marostegui: Deploy schema change on db1053 (s2) - T174569
  • 16:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 - T174569 (duration: 00m 45s)
  • 16:39 demon@tin: Finished scap: docroot/noc/conf/ Adding new dblist symlink (duration: 02m 36s)
  • 16:39 marostegui: Deploy schema change on db1103:3312 - T174569
  • 16:37 demon@tin: Started scap: docroot/noc/conf/ Adding new dblist symlink
  • 16:37 demon@tin: scap aborted: docroot/noc/conf/dblists (duration: 00m 02s)
  • 16:37 demon@tin: Started scap: docroot/noc/conf/dblists
  • 16:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3312 - T174569 (duration: 00m 45s)
  • 16:32 demon@tin: Finished scap: docroot/noc/conf/ drop some dangling symlinks (duration: 03m 53s)
  • 16:28 demon@tin: Started scap: docroot/noc/conf/ drop some dangling symlinks
  • 16:11 herron: re-enabling puppet agents in eqiad
  • 16:07 demon@tin: Finished scap: Revert "Special:Preferences: Use OOjs UI" and follow-ups (duration: 21m 19s)
  • 16:05 herron: re-enabling puppet agents in codfw
  • 15:46 demon@tin: Started scap: Revert "Special:Preferences: Use OOjs UI" and follow-ups
  • 15:31 herron: temporarily disabling puppet agents in eqiad and codfw while puppetdb catches up with command queue
  • 15:29 herron: disabling ircecho temporarily
  • 15:24 _joe_: restarting puppetdb on nihal, will cause puppet failures
  • 15:23 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1259.eqiad.wmnet
  • 15:18 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op (duration: 00m 45s)
  • 15:01 godog: reimage restbase1014 - T179422
  • 14:51 herron: cutting over all production puppet agents to codfw puppet 4 masters via dns
  • 14:50 Amir1: deployed backward compatibility of entity compact diff transmit T113468
  • 14:49 ladsgroup@tin: Synchronized php-1.31.0-wmf.10/extensions/Wikibase: (no justification provided) (duration: 01m 41s)
  • 14:37 gehel@tin: Finished deploy [kartotherian/deploy@e166d87]: dummy kartotherian deployment to test udp2log config change - T175242 (duration: 00m 03s)
  • 14:37 gehel@tin: Started deploy [kartotherian/deploy@e166d87]: dummy kartotherian deployment to test udp2log config change - T175242
  • 14:30 elukey: reboot druid100[23] for kernel updates
  • 14:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Localize sitename and meta NS for wawiktionary (T181782) (duration: 00m 46s)
  • 14:01 elukey: reboot analytics106* (hadoop worker nodes) for kernel+jvm updates - T179943
  • 13:24 paravoid: upgrading bast2001 to stretch
  • 13:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3312 - T174569 (duration: 00m 45s)
  • 13:20 marostegui: Deploy schema change on db1105:3312 (s2) - T174569
  • 13:05 marostegui: Compress s6 on db1098 - T178359
  • 12:44 gehel@tin: Finished deploy [kartotherian/deploy@e166d87]: testing new kartotherian packaging on maps-test2003 (duration: 00m 22s)
  • 12:43 gehel@tin: Started deploy [kartotherian/deploy@e166d87]: testing new kartotherian packaging on maps-test2003
  • 11:06 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 45s)
  • 11:05 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 45s)
  • 10:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2085 (duration: 00m 43s)
  • 10:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1096:3316 - T178359 (duration: 00m 45s)
  • 09:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1099:3318 from s5 (duration: 00m 44s)
  • 09:51 godog: bootstrap restbase1012-c - T179422
  • 09:32 godog: clear erroneous table metrics from graphite1003 / graphite2002 - T181689
  • 09:24 elukey: reboot analytics105* (hadoop worker nodes) for kernel+jvm updates - T179943
  • 09:19 jynus: rebooting mariadb at labsdb1005
  • 09:12 moritzm: reimaging mw1259 (video scaler) to stretch, will be kept disabled initially (some controlled live tests following)
  • 08:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3315 and 3316 - T178359 (duration: 00m 45s)
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316 - T178359 (duration: 00m 45s)
  • 08:44 moritzm: updating tor on radium to 0.3.1.9
  • 08:41 moritzm: updating tor packages to 0.3.1.9
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3315 and pool db1096:3316 - T178359 (duration: 00m 45s)
  • 08:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db1096:3315 - T178359 (duration: 00m 44s)
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1096:3315 - T178359 (duration: 00m 45s)
  • 07:53 moritzm: installing curl security updates
  • 07:17 marostegui: Compress s1 on db1099 - T178359
  • 07:08 marostegui: Stop MySQL on db1044 as it will be decommissioned - T181696
  • 07:05 _joe_: playing with puppetdb status for ores2003 (deactivating/reactivating node)
  • 06:40 marostegui: Stop MySQL on db1098 to clone db1096.s6 - T178359
  • 06:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1044 from config as it will be decommissioned - T181696 (duration: 00m 45s)
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1044 from config as it will be decommissioned - T181696 (duration: 00m 45s)
  • 06:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098 - T178359 (duration: 00m 46s)
  • 06:21 marostegui: Deploy alter table on s3 master (db1075) without replication - T174569
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.10) (duration: 06m 28s)

2017-12-03

  • 15:33 ejegg: disabled CiviCRM bounce processing job
  • 12:17 akosiaris: empty ganeti1006, it had issues this morning per T181121
  • 12:06 marostegui: Fix dbstore1002 replication
  • 07:44 akosiaris: ran puppet on conf2002, etcdmirror-conftool-eqiad-wmnet got started again
  • 05:11 andrewbogott: deleting files on labsdb1003 /srv/tmp older than 30 days
  • 03:57 no_justification: gerrit2001: icinga is flapping on the gerrit process/systemd check, but this is kind of known (not sure why it's doing this all of a sudden). It's not letting me acknowledge it, but it's fine/harmless. Cf T176532

2017-12-02

  • 17:55 marostegui: Reboot db1096.s5 to pick up the correct innodb_buffer_pool size after finishing compressing s5 - T178359
  • 03:51 hoo: Ran "scap pull" on snapshot1001, after final T181385 tests
  • 00:03 mutante: tried one more time on db2028,db2029, both trusty. on db2028: gmond was running as user ganglia-monitor, failed, had to manually kill the process, run puppet again then ok. on db2029, gmond was running as "499" but puppet just ran and removed it without manual intervention. (T177225)

2017-12-01

  • 23:15 urandom: starting cassandra bootstrap, restbase1012-b - T179422
  • 21:49 mutante: db2029 - removing ganglia-monitor, testing to kill gmond, running puppet to figure out how to cleanly remove it on trusty
  • 21:12 mutante: db2023 killed gmond (ganglia-monitor) process manually which was still running even though ganglia-monitor package was removed and caused puppet breakage (it seems only on trusty). after that puppet run is clean again and ganglia removed. (T177225) (https://gerrit.wikimedia.org/r/#/c/394647/1)
  • 20:18 awight@tin: Started deploy [ores/deploy@9afbf14]: (non-production) Test ORES deployment to ores100*
  • 20:17 awight@tin: Finished deploy [ores/deploy@9afbf14]: (non-production) Test ORES deployment to ores1001 (duration: 02m 31s)
  • 20:15 awight@tin: Started deploy [ores/deploy@9afbf14]: (non-production) Test ORES deployment to ores1001
  • 20:03 aaron@tin: Synchronized php-1.31.0-wmf.10/includes/libs/objectcache/WANObjectCache.php: f096d0b465b75d - temp logging for statsd spam (duration: 00m 45s)
  • 18:59 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op (duration: 00m 46s)
  • 18:22 mutante: Phabricator: restarting Apache for php-curl update
  • 18:21 _joe_: restarting apache2 on the codfw puppetmasters
  • 18:06 marktraceur@tin: Synchronized php-1.31.0-wmf.10/extensions/UploadWizard/resources/controller/uw.controller.Deed.js: (no justification provided) (duration: 00m 46s)
  • 17:49 mutante: phab2001 - restarted apache
  • 17:33 herron: stopped ircecho on einsteinium
  • 17:00 awight@tin: Unlocked for deployment [ores/deploy]: Don't deploy while we're messing with git-lfs (duration: 00m 14s)
  • 17:00 awight@tin: Locking from deployment [ores/deploy]: Don't deploy while we're messing with git-lfs (planned duration: 16666666666m 39s)
  • 17:00 awight@tin: Locking from deployment [ores/deploy]: Don't deploy while we're messing with git-lfs (planned duration: -1m 59s)
  • 16:59 awight@tin: Unlocked for deployment [ores/deploy]: Don't deploy while we're messing with git-lfs (duration: 00m 07s)
  • 16:59 awight@tin: Locking from deployment [ores/deploy]: Don't deploy while we're messing with git-lfs (planned duration: 60m 00s)
  • 16:34 jynus: stopping db2092 to clone s1 to db2085
  • 16:24 urandom: starting cassandra bootstrap, restbase1012-a -- T179422
  • 15:27 godog: bounce uwsgi on labmon1001 - stuck
  • 15:21 moritzm: installing nspr security updates on trusty
  • 15:17 moritzm: installing ffmpeg security updates
  • 15:14 gehel@tin: Finished deploy [kartotherian/deploy@df7ebff]: testing new kartotherian packaging on maps-test2003 (duration: 00m 20s)
  • 15:14 jynus@tin: Synchronized wmf-config/db-codfw.php: Undeploy db2092, use db2085 for s1 (duration: 00m 45s)
  • 15:14 gehel@tin: Started deploy [kartotherian/deploy@df7ebff]: testing new kartotherian packaging on maps-test2003
  • 15:14 moritzm: installing libxcursor security updates on trusty
  • 15:09 jynus@tin: Synchronized wmf-config/db-eqiad.php: Undeploy db2092, use db2085 for s1 (duration: 00m 45s)
  • 14:24 awight@tin: Finished deploy [ores/deploy@532bd0b]: (non-production) Update ORES on new cluster (duration: 02m 06s)
  • 14:22 awight@tin: Started deploy [ores/deploy@532bd0b]: (non-production) Update ORES on new cluster
  • 14:22 akosiaris: upload apertium-crh-tur_0.3.0~r83159-1+wmf1 to apt.wikimedia.org/jessie-wikimedia component main. T181465
  • 14:10 herron: cutting all puppet service records over to codfw puppet 4 masters
  • 12:44 elukey: reboot druid1001 for kernel+jvm updates - T179943
  • 12:11 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 (duration: 00m 45s)
  • 11:59 jynus: restarting and upgrading mysql on labsdb1004
  • 11:28 jynus: upgrading and restarting dbstore2001
  • 11:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2055 (duration: 00m 46s)
  • 11:08 marostegui: Stop MySQL on db2055 for testing
  • 11:04 marostegui: Update MySQL on db1039 for testing
  • 10:57 elukey: reboot analytics1028 for kernel + jvm updates (Hadoop HDFS journalnode) - T179943
  • 10:56 godog: delete docker diskspace metrics from labs - T181476
  • 10:21 godog: initial purge of old table metrics from graphite2002 - T181689
  • 10:18 moritzm: reimaging mw1259 (video scaler) to stretch, will be kept disabled initially (with some live tests starting next week)
  • 09:55 akosiaris: run memtester 61G on ganeti1008 T181121
  • 09:23 elukey: reboot analytics104* for kernel+jvm updates - T179943
  • 09:23 akosiaris: powercycle ganeti1008 T181121, it's largely unresponsive
  • 08:41 marostegui: Remove db1046 and db1047 from tendril - T156844
  • 08:40 elukey: reboot the remaining analytics103* hadoop workers to pick up kernel+jvm updates - T179943
  • 08:28 akosiaris: empty ganeti1008, move VMs to ganeti1006 T181121
  • 08:20 akosiaris: repool ganeti1005, ganeti1006 to empty ganeti1008 T181121
  • 07:51 akosiaris: upload apertium-tur_0.2.0~r83161-1+wmf1, apertium-crh_0.2.0~r83161-1+wmf1 to apt.wikimedia.org/jessie-wikimedia component main. T181465
  • 07:14 marostegui: Logging retroactively for the record, restarting MySQL on db1039
  • 05:14 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 44s)
  • 01:46 hoo: Ran scap pull on mwdebug1001 after T181385 testing
  • 00:13 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: HTML5 sections be upon us! (duration: 00m 45s)
  • 00:04 hoo: Killed all remaining Wikidata JSON/RDF dumpers, due to T181385. This means no dumps this week!

2017-11-30

  • 23:54 demon@tin: Synchronized wmf-config/extension-list-labs: no-op (duration: 00m 45s)
  • 23:39 addshore@tin: Synchronized wmf-config: wdbuild: T173818 T177060 Add wikidata extensions to extension-list (duration: 00m 46s)
  • 23:35 addshore@tin: Synchronized wmf-config: wdbuild: T173818 Remove Wikibase-buildentry.php config file (empty) (duration: 00m 46s)
  • 23:34 addshore@tin: Synchronized wmf-config/Wikibase.php: wdbuild: T173818 Remove Wikibase-buildentry.php config file (empty) (duration: 00m 45s)
  • 23:30 addshore@tin: Synchronized wmf-config: wdbuild: T173818 Remove wmgUseWikidataBuild (duration: 00m 46s)
  • 23:29 addshore@tin: Synchronized wmf-config/Wikibase-buildentry.php: wdbuild: T173818 Remove wmgUseWikidataBuild (duration: 00m 45s)
  • 23:21 addshore@tin: Synchronized wmf-config: wdbuild: T173818 wdbuild: Stop loading from build on ALL WIKIS The build is dead! Mwahahaaa (duration: 00m 47s)
  • 23:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@4305d96]: Update mobileapps to 4317ea5 (T181743) (duration: 04m 50s)
  • 23:15 addshore@tin: Synchronized wmf-config: wdbuild: wdbuild: Stop loading from build on all wikis (except the one i really dont want to break) (duration: 00m 46s)
  • 23:14 bsitzmann@tin: Started deploy [mobileapps/deploy@4305d96]: Update mobileapps to 4317ea5 (T181743)
  • 23:06 addshore@tin: Synchronized wmf-config: wdbuild: wdbuild: Stop loading from build on group1 (duration: 00m 46s)
  • 22:55 addshore@tin: Synchronized wmf-config: wdbuild: wdbuild: Stop loading from build on group0 (duration: 00m 46s)
  • 22:42 addshore: BETA ONLY was a lie on that last one ...
  • 22:41 addshore@tin: Synchronized wmf-config: wdbuild: BETA ONLY wdbuild: Stop loading from build on test and testwikidata (duration: 00m 47s)
  • 22:40 demon@tin: Pruned MediaWiki: 1.31.0-wmf.8 [keeping static files] (duration: 02m 33s)
  • 22:27 addshore@tin: Synchronized wmf-config: wdbuild: BETA ONLY gerrit:394411 (duration: 00m 47s)
  • 21:35 addshore@tin: Synchronized wmf-config: wdbuild: BETA ONLY (duration: 00m 47s)
  • 21:31 jynus: upgrading labsdb1010 and restarting mariadb
  • 21:23 mutante: powercycling kafka1018 (was down in Icinga and saw in SAL: reboot kafka10[12-22] for kernel + jvm updates - T179943)
  • 21:18 jynus: upgrade and restart db2078
  • 21:14 addshore@tin: Synchronized wmf-config: wdbuild: BETA ONLY (duration: 00m 47s)
  • 21:01 addshore@tin: Synchronized wmf-config: wdbuild: T173818: add switch to ease killing (again) (duration: 00m 46s)
  • 20:59 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: wdbuild: T173818: add switch to ease killing (again) (duration: 00m 45s)
  • 20:49 XenoRyet: Updated tools from 6e604fd to 626fe02
  • 20:40 bd808: Sending Toolforge survey final reminder emails from silver for T177126
  • 20:32 addshore@tin: Synchronized wmf-config: REVERT wdbuild: T173818: add switch to ease killing (duration: 00m 47s)
  • 20:30 addshore@tin: Synchronized wmf-config: wdbuild: T173818: add switch to ease killing (duration: 00m 47s)
  • 20:13 ejegg: re-adjusted Civi job timings. QC every odd min for 105 sec, TY every even min for 70 sec after 45 sec delay
  • 20:12 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.10
  • 19:55 ejegg: 105 seconds for each job
  • 19:55 ejegg: adjusted timings of Civi jobs to let TY and QC run concurrently
  • 19:37 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Temporary disable remex html" T178632 (duration: 00m 49s)
  • 19:20 thcipriani@tin: Synchronized php-1.31.0-wmf.10/includes/htmlform/fields/HTMLMultiSelectField.php: SWAT: HTMLMultiSelectField: Allow formatting in section headings in OOUI mode T181698 (duration: 00m 49s)
  • 19:12 thcipriani@tin: Synchronized wmf-config: SWAT: Enable MP3 uploads on Commons T120288 (duration: 00m 51s)
  • 17:57 andrewbogott: upgrading labtestpuppetmaster2001 to puppet 4.8
  • 17:47 moritzm: uploaded prometheus-openldap-exporter 0+git20171128-1 for jessie-wikimedia (T181511)
  • 17:06 herron: beginning cut over of esams to codfw puppet 4 masters
  • 16:44 urandom: drop (erroneous) legacy tables from -ng cassandra cluster - T181689
  • 16:12 elukey: drain and reboot analytics1031->39 to pick up jvm+kernel updates - T179943
  • 15:59 jynus: setup prometheus with unix_socket on new server db1107 and db1108
  • 15:19 gehel: restart blazegraph on wdqs1004
  • 15:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 load (duration: 00m 48s)
  • 15:11 herron: beginning cut over of ulsfo to codfw puppet 4 masters
  • 14:50 zeljkof: EU SWAT finished
  • 14:46 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WikiLove Extension on pa.wiki (T178919) (duration: 00m 49s)
  • 14:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add import sources to de.wiki (T181695) (duration: 00m 49s)
  • 14:20 zfilipin@tin: Synchronized php-1.31.0-wmf.10/extensions/ContentTranslation/modules/dashboard/ext.cx.dashboard.js: SWAT: Set default languages after fetching valid languages (duration: 00m 49s)
  • 14:19 marostegui: Deploy schema change on s3 - db1078 - T174569
  • 13:23 addshore@tin: Synchronized php-1.31.0-wmf.10/extensions/AdvancedSearch/modules/ext.advancedSearch.init.js: pre-swat: T181644 Force search profile advanced in AdvancedSearch gerrit:394297 (duration: 00m 48s)
  • 13:22 addshore@tin: Synchronized php-1.31.0-wmf.8/extensions/AdvancedSearch/modules/ext.advancedSearch.init.js: pre-swat: T181644 Force search profile advanced in AdvancedSearch gerrit:394299 (duration: 00m 50s)
  • 13:18 herron: cutting codfw puppet agents over to puppetmaster2001.codfw.wmnet
  • 12:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Tackle s4 DB weights to make them more equal (duration: 00m 48s)
  • 11:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 and reduce traffic for db1091 (duration: 00m 50s)
  • 10:33 jynus: stopping replication on db1044 and db1095 (s3)
  • 09:16 moritzm: installing exim security updates on stretch (jessie/trusty not affected)
  • 09:14 elukey: drain and reboot analytics1029/1030 for jvm+kernel updates (Hadoop worker canaries)
  • 09:05 godog: add 200G of space to graphite2002 carbon lv
  • 08:25 moritzm: rolling restart of mw canaries to pick up curl security update
  • 08:21 marostegui: Enable GTID on s8 eqiad hosts that do not have it enabled (db1109, db1104, db1101, db1092, db1087, db1063) - T177208
  • 07:48 moritzm: installing curl security updates
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 (duration: 01m 18s)
  • 06:48 marostegui: Deploy schema change on dbstore1001 s3 - T174569
  • 06:26 marostegui: Deploy schema change on s3 db1077 - T174569
  • 02:48 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@7aa39b7]: (no justification provided) (duration: 03m 15s)
  • 02:45 ebernhardson@tin: Started deploy [search/mjolnir/deploy@7aa39b7]: (no justification provided)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.8) (duration: 08m 43s)
  • 01:17 ejegg: updated fundraising tools from 6d4b6f3 to 6e604fd
  • 01:06 twentyafterfour: finished phabricator upgrade, service is online and appears to be functioning normally
  • 01:03 twentyafterfour: Starting phabricator upgrade & maintenance. Service will be offline for less than 5 minutes.
  • 01:02 bsitzmann@tin: Finished deploy [mobileapps/deploy@fa2a877]: Update mobileapps to dcea7d3 (T181004) (duration: 06m 08s)
  • 00:56 bsitzmann@tin: Started deploy [mobileapps/deploy@fa2a877]: Update mobileapps to dcea7d3 (T181004)
  • 00:53 maxsem@tin: Finished scap: Message updates for https://gerrit.wikimedia.org/r/#/c/394155/ (duration: 26m 38s)
  • 00:48 ejegg: re-enabled CiviCRM jobs
  • 00:42 ejegg: updated CiviCRM from 0f95f3e to e81228f
  • 00:40 ejegg: disabled CiviCRM jobs
  • 00:31 ejegg: updated fundraising dashboard from 6ee6567 to 1141317
  • 00:27 maxsem@tin: Started scap: Message updates for https://gerrit.wikimedia.org/r/#/c/394155/
  • 00:24 maxsem@tin: Synchronized php-1.31.0-wmf.10/extensions/ContentTranslation/: https://gerrit.wikimedia.org/r/#/c/394206/ (duration: 00m 52s)
  • 00:21 ejegg: added weekly Ingenico audit processing job in makemissing mode

2017-11-29

  • 23:20 eileen: update civicrm from a1022cf to 0f95f3e (Benevity import fix)
  • 22:23 hashar: Nodepool had some troubles spawning new instances from 21:09 to 21:36, and took a while to recover. Issue similar to T170492#3581822
  • 21:51 SMalyshev: starting wikidata reindex (T181426)
  • 21:23 no_justification: docroot sync was for If1afa59a
  • 21:21 demon@tin: Synchronized docroot/: (no justification provided) (duration: 00m 49s)
  • 20:29 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.10
  • 20:26 demon@tin: Synchronized php: symlink bump (duration: 00m 48s)
  • 20:05 urandom: restarting cassandra bootstrap of restbase1012-a (T179422)
  • 19:55 mutante: restarting gerrit to apply config change and set gitBasicAuth to true to unblock T171758 (gerrit:391865)
  • 19:50 herron: beginning rolling cut over of eqiad scb hosts to codfw puppet 4 masters
  • 19:15 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/394059/3 (duration: 00m 49s)
  • 18:57 herron: beginning cutover of codfw lvs2* systems to codfw puppet 4 masters. first standby nodes, then active nodes
  • 18:12 marostegui: Compress s5 on db1096 - T178359
  • 18:10 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 load (duration: 00m 53s)
  • 18:05 awight@tin: Finished deploy [ores/deploy@532bd0b]: (non-production) Update ORES on new cluster (duration: 01m 11s)
  • 18:04 awight@tin: Started deploy [ores/deploy@532bd0b]: (non-production) Update ORES on new cluster
  • 18:03 herron: beginning cut over of codfw cp2* servers to codfw puppet 4 masters
  • 17:40 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1110 with low load (duration: 00m 48s)
  • 17:40 godog: bootstrapping restbase1012-a - T179422
  • 17:19 herron: beginning cut over of cp200[12] to codfw puppet 4 masters
  • 17:16 ejegg: increased time limit for donation queue consumer from 60 to 70 seconds, reduced thank you job from 55 to 45 seconds
  • 17:03 ejegg: re-enabled Civi jobs
  • 16:58 ejegg: disabled Civi jobs for Deadlock-retry update
  • 16:46 jynus@tin: Synchronized wmf-config/db-codfw.php: Update db1110 ip (duration: 00m 50s)
  • 16:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Update db1110 ip (duration: 00m 49s)
  • 15:02 chasemp: purge old qcow2 images from under /home on labtestcontrol2001
  • 15:00 awight: Restarting ORES celery workers manually, T181538
  • 14:39 _joe_: reloading apache on puppetmasters to pick up the configuration changes
  • 14:37 awight@tin: Started restart [ores/deploy@e58bfbf]: Restart ORES services (take 2), T181538
  • 14:36 elukey: reboot druid100[456] for jvm+kernel updates - T179943
  • 14:35 awight@tin: Finished deploy [ores/deploy@e58bfbf]: Restart ORES services, T181538 (duration: 00m 16s)
  • 14:35 awight@tin: Started deploy [ores/deploy@e58bfbf]: Restart ORES services, T181538
  • 14:25 zeljkof: EU SWAT finished
  • 14:18 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Define throttle rule (T181367) (duration: 00m 49s)
  • 14:14 herron: beginning cut over of cache::misc and codfw cache::canary cp servers to codfw puppet4 masters
  • 14:13 moritzm: rebooting neodymium for kernel update to 4.9.51
  • 14:12 godog: reimage restbase1012 - T179422
  • 14:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T180291 T180128 AdvancedSearch for arwiki (duration: 00m 49s)
  • 14:07 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T180128 AdvancedSearch for dewiki (duration: 00m 50s)
  • 13:34 awight@tin: Finished deploy [ores/deploy@e58bfbf]: (non-production) Update ORES on new cluster (take 3) (duration: 16m 49s)
  • 13:18 elukey: reboot kafka100[23] for jvm+kernel updates - T179943
  • 13:17 awight@tin: Started deploy [ores/deploy@e58bfbf]: (non-production) Update ORES on new cluster (take 3)
  • 12:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Setup s8 on eqiad, with no wikis (duration: 00m 48s)
  • 12:24 jynus: scap pull on mwdebug1001
  • 12:23 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db1096:3315 (duration: 00m 48s)
  • 12:01 marostegui: Deploy schema change on db1072 (sanitarium master) on s3 with replication enabled to replicate to labs - T174569
  • 11:30 elukey: reboot kafka1001 for kernel + jvm updates - T179943
  • 10:15 akosiaris: disable puppet on oresrdb* for merging https://gerrit.wikimedia.org/r/#/c/394022/. T181563
  • 09:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 48s)
  • 09:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 49s)
  • 09:39 godog: upload cassandra-tools-wmf 1.0.2-1 - T181438
  • 09:28 mobrovac@tin: Finished deploy [electron-render/deploy@94d27d7]: Update to electron v1.7.9 and start using the Charter font - T181200 (duration: 04m 03s)
  • 09:24 mobrovac@tin: Started deploy [electron-render/deploy@94d27d7]: Update to electron v1.7.9 and start using the Charter font - T181200
  • 09:04 godog: bootstrap restbase1007-c - T179422
  • 08:41 moritzm: uploaded prometheus-openldap-exporter 0+git20171128-1 for jessie-wikimedia (T181511)
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096 - T178359 (duration: 00m 48s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1099:3318 and db1055 - T178359 (duration: 00m 48s)
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3318 traffic - T178359 (duration: 00m 45s)
  • 07:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1099:3318 with low weight - T178359 (duration: 00m 45s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1055 - T178359 (duration: 00m 48s)
  • 06:42 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1099:3311 and db1099:3318 to the config (depooled) T178359 (duration: 00m 48s)
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1099:3311 and db1099:3318 to the config (depooled) T178359 (duration: 00m 49s)
  • 06:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 with lower weight - T178359 (duration: 00m 50s)
  • 06:17 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@7aa39b7]: (no justification provided) (duration: 04m 27s)
  • 06:12 ebernhardson@tin: Started deploy [search/mjolnir/deploy@7aa39b7]: (no justification provided)
  • 03:39 ejegg: reduced CiviMail sample rate to 0.35
  • 03:09 ejegg: reduced donation queue consumer time limit from 70 to 60 seconds, increased ty mail batch time limit from 45 to 55 seconds
  • 02:50 ejegg: reduced donation queue consumer time limit from 75 to 70 seconds, increased ty mail batch time limit from 40 to 45 seconds
  • 02:44 ejegg: reduced CiviMail record creation rate from 100% to 50%
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.8) (duration: 06m 08s)
  • 01:25 mutante: snapshot1001 - closed idle screen session
  • 01:22 mutante: analytics1003 - closed idle screen session
  • 01:21 mutante: mw1276 - run "scap pull" to get in sync after hardware issue, then pooled again (T181397)
  • 01:09 mutante: restarting ircecho - it stopped talking
  • 01:07 mutante: forcing puppet run on all labvirt* machines to clean out Icinga alerts
  • 00:44 awight@tin: Synchronized php-1.31.0-wmf.10/extensions/ORES: Hotfix to mitigate cache stampeding, T181567 (duration: 00m 49s)
  • 00:33 reedy@tin: Synchronized php-1.31.0-wmf.10/includes/logging/LogPager.php: Fix fatal on Special:Log T181565 (duration: 00m 48s)
  • 00:32 awight@tin: Synchronized php-1.31.0-wmf.8/extensions/ORES: Hotfix to mitigate cache stampeding, T181567 (duration: 00m 50s)
  • 00:18 Jamesofur: deleted 6 archived files from servers for legal compliance
  • 00:08 ejegg: updated fundraising dashboard from df94248 to 6ee6567

2017-11-28

  • 23:58 hoo: Ran scap pull on mwdebug1001 after T181385 related testing
  • 23:07 akosiaris@tin: Synchronized wmf-config/CommonSettings.php: T181538 (duration: 00m 49s)
  • 22:31 akosiaris@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 00m 49s)
  • 22:31 akosiaris: deploy wmf-config/CommonSettings.php for ORES internal discovery URL, https://gerrit.wikimedia.org/r/#/c/393924/ T181538
  • 21:41 akosiaris: disable ORES queue redis persistency by config set appendonly no on oresrdb1001
  • 21:24 hoo: Manually killed all remaining Wikidata TTL (RDF) dumpers on snapshot1007. Some shards failed due to the db1110 depool.
  • 21:23 hoo: Manually killed all remaining Wikidata JSON dumpers on snapshot1007. Some shards failed due to the db1110 depool.
  • 20:56 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: removing aawiki from group0
  • 20:42 gehel: repooling elastic2004 after RAID controller maintenance - T181412
  • 20:42 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.10
  • 20:28 mutante: forcing puppet run on cache misc to revert "failover ORES to codfw"
  • 20:19 demon@tin: Synchronized scap/plugins/prep.py: no-op (duration: 00m 48s)
  • 20:16 demon@tin: Synchronized dblists/group0.dblist: adding some new wikis (duration: 00m 48s)
  • 18:57 demon@tin: Finished scap: bootstrap wmf.10 (duration: 35m 20s)
  • 18:51 demon@tin: Finished deploy [gerrit/gerrit@571cf4c]: deploying 2.15+ polygerrit style changes (duration: 00m 09s)
  • 18:51 demon@tin: Started deploy [gerrit/gerrit@571cf4c]: deploying 2.15+ polygerrit style changes
  • 18:40 akosiaris: revert weight changes for scb1001, scb1002 T181835
  • 18:39 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 18:39 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 18:35 awight@tin: Finished deploy [ores/deploy@e58bfbf]: (non-production) Update ORES on new cluster (take 2) (duration: 04m 30s)
  • 18:32 akosiaris: force puppet run on cache::misc boxes T181538
  • 18:30 awight@tin: Started deploy [ores/deploy@e58bfbf]: (non-production) Update ORES on new cluster (take 2)
  • 18:28 urandom: (re)bootstrapping cassandra, restbase1007-b - T179422
  • 18:22 demon@tin: Started scap: bootstrap wmf.10
  • 18:18 awight@tin: Finished deploy [ores/deploy@e58bfbf]: (non-production) Update ORES on new cluster (duration: 01m 41s)
  • 18:17 awight@tin: Started deploy [ores/deploy@e58bfbf]: (non-production) Update ORES on new cluster
  • 18:00 akosiaris: force stop celery-ores-worker on scb1001
  • 17:58 akosiaris@puppetmaster1001: conftool action : set/weight=5; selector: scb1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 17:58 akosiaris@puppetmaster1001: conftool action : set/weight=5; selector: scb1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=ores'])
  • 17:46 urandom: decommissioning cassandra, restbase1007-b - T179422
  • 17:42 urandom: restart cassandra, restbase1007, to pickup logstash java deps - T179422
  • 16:45 ejegg: updated fundraising dashboard from d8c86e7 to df94248
  • 16:25 bblack: mw1329 boot to PXE (should come up with new .66 IP)
  • 15:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: depool db1110 (duration: 00m 44s)
  • 15:38 bblack: powered off mw1329
  • 15:06 marostegui: Compress s5 on db1099
  • 15:04 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Revert "Comply wikidata with new ores thresholds" (duration: 00m 45s)
  • 14:56 marostegui: Deploy schema change on s3 on dbstore1002 - T174569
  • 14:45 gehel@tin: Finished deploy [kartotherian/deploy@cb9b1ef]: testing new kartotherian packaging on maps-test2003 (duration: 00m 02s)
  • 14:45 gehel@tin: Started deploy [kartotherian/deploy@cb9b1ef]: testing new kartotherian packaging on maps-test2003
  • 14:41 otto@tin: Finished deploy [eventlogging/analytics@c464b8c]: Fixing bug where userAgent set by client producer was not used T178440 (duration: 00m 02s)
  • 14:40 otto@tin: Started deploy [eventlogging/analytics@c464b8c]: Fixing bug where userAgent set by client producer was not used T178440
  • 14:40 otto@tin: Started deploy [eventlogging/analytics@c464b8c]: Fixing bug where userAgent set by client producer was not used
  • 14:39 gehel@tin: Finished deploy [kartotherian/deploy@cb9b1ef]: testing new kartotherian packaging on maps-test2003 (duration: 00m 18s)
  • 14:39 gehel@tin: Started deploy [kartotherian/deploy@cb9b1ef]: testing new kartotherian packaging on maps-test2003
  • 14:37 marostegui: Stop MySQL on db1055 to clone db1099:3311 - T178359
  • 14:34 gehel@tin: Finished deploy [kartotherian/deploy@55c5da4]: (no justification provided) (duration: 00m 11s)
  • 14:34 gehel@tin: Started deploy [kartotherian/deploy@55c5da4]: (no justification provided)
  • 14:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T178359 (duration: 00m 44s)
  • 14:17 elukey: reboot kafka10[12-22] for kernel + jvm updates - T179943
  • 14:03 elukey: reboot kafka200[123] for kernel + jvm updates - T179943
  • 12:25 godog: cleanup wanobjectcache metrics with hashes - T178531
  • 12:14 godog: bounce carbon-frontend-relay after https://gerrit.wikimedia.org/r/393749
  • 11:25 hoo: Manually re-started Wikidata JSON dumps on snapshot1007, got stuck after db1082 went down.
  • 10:55 jynus: stop db1044 replication for db1095 master switchover
  • 10:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082. Depool db1044 (duration: 00m 44s)
  • 10:44 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Process wikibase-addUsagesForPage only via EventBus - T175212 (duration: 00m 44s)
  • 10:08 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c4b9e16]: Enable wikibase-addUsagesForPage with low concurrency (duration: 00m 29s)
  • 10:07 ppchelko@tin: Started deploy [cpjobqueue/deploy@c4b9e16]: Enable wikibase-addUsagesForPage with low concurrency
  • 09:35 jynus: restart db1087 for maintenance
  • 09:22 jynus: restart db1082
  • 09:19 godog: unmask and restart restbase1007-b - T179422
  • 09:15 moritzm: installing libxml-libxml-perl security updates on trusty (Debian already fixed)
  • 09:11 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b0b1793]: Remove double-processing created for consumer group renaming (duration: 00m 28s)
  • 09:10 ppchelko@tin: Started deploy [cpjobqueue/deploy@b0b1793]: Remove double-processing created for consumer group renaming
  • 09:04 marostegui: Drop database log from dbstore1002 - T156844
  • 08:57 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2212086]: Move enabled jobs config to vars.yaml (duration: 00m 39s)
  • 08:56 ppchelko@tin: Started deploy [cpjobqueue/deploy@2212086]: Move enabled jobs config to vars.yaml
  • 08:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: depool db1082 (duration: 00m 44s)
  • 08:02 mobrovac: bootstrap restbase1007-b - T179422
  • 06:39 marostegui: Stop MySQL on db1099 to copy its content to dbstore1001 and reimage it as multi-instance - T178359
  • 05:23 mutante: labtestpuppetmaster2001 - manually purging ganglia things since puppet is disabled
  • 05:05 mutante: labweb1001/labweb1002: manually purging ganglia package/config/service/unit files because puppet is disabled there (T177225)
  • 04:08 eileen: update process-control to b40eaec, disabling 3 jobs to reduce load for Big english starting: dedupe_civicrm_contacts, omnimail_recipient_load, omnimail_recipient_load_backfill
  • 03:58 demon@tin: Pruned MediaWiki: 1.31.0-wmf.4 (duration: 03m 13s)
  • 03:31 mutante: labservices1001/1002,labtestservices2001 - remove pdns_gmetric cronjobs causing cron spam after ganglia decom from lab*
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.8) (duration: 05m 23s)

2017-11-27

  • 22:29 mutante: labtestnet2001 - re-enabled puppet to decom ganglia, no errors
  • 22:22 mutante: decom'ing Ganglia from all labtest* hosts - packages removed by puppet etc, might cause some false positives in Icinga but it's ok (T177225)
  • 22:21 otto@tin: Finished deploy [eventlogging/eventbus@e024af3]: deploy revert checked out at 3df06ab (we caught one). T180017 (duration: 00m 10s)
  • 22:21 otto@tin: Started deploy [eventlogging/eventbus@e024af3]: deploy revert checked out at 3df06ab (we caught one). T180017
  • 21:59 awight: clearing ORES threshold caches for ruwiki, frwiki, wikidatawiki.
  • 21:59 awight@tin: Synchronized wmf-config/InitialiseSettings.php: Reenable ORES on frwiki, ruwiki, and wikidata; T181006 (duration: 00m 45s)
  • 21:53 awight@tin: Finished deploy [ores/deploy@e58bfbf]: (codfw) Update ORES for impossible threshold handling and new wikidata editquality model, T179711 T180686 T180450 (duration: 05m 06s)
  • 21:47 awight@tin: Started deploy [ores/deploy@e58bfbf]: (codfw) Update ORES for impossible threshold handling and new wikidata editquality model, T179711 T180686 T180450
  • 21:46 awight@tin: Finished deploy [ores/deploy@e58bfbf]: Update ORES for impossible threshold handling and new wikidata editquality model, T179711 T180686 T180450 (duration: 12m 55s)
  • 21:33 awight@tin: Started deploy [ores/deploy@e58bfbf]: Update ORES for impossible threshold handling and new wikidata editquality model, T179711 T180686 T180450
  • 21:23 awight@tin: Synchronized php-1.31.0-wmf.8/extensions/ORES: ORES error handling for bad thresholds, T181191 (duration: 00m 46s)
  • 21:11 awight@tin: Synchronized wmf-config/InitialiseSettings.php: Temporarily disable ORES on wikidata (duration: 00m 45s)
  • 21:10 awight: Previous “rollback ORES” was from a stale screen session, no actions were taken today.
  • 21:08 awight@tin: Finished deploy [ores/deploy@82a13ae]: Rollback ORES (take 3); 181006 (duration: 9942m 42s)
  • 20:21 demon@tin: i lied, that was for wgStyleVersion removal
  • 20:20 demon@tin: Synchronized wmf-config/CommonSettings.php: cli sapi fixes (duration: 00m 45s)
  • 20:17 ejegg: updated payments-wiki from fea6b37 to f594dfa
  • 20:03 kaldari@tin: Synchronized wmf-config/InitialiseSettings-labs.php: (no justification provided) (duration: 00m 45s)
  • 20:02 kaldari@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 00m 45s)
  • 20:01 kaldari@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 46s)
  • 19:54 bblack: crN-ulsfo: remove lvs400[1-4] from PyBal BGP neighbors list
  • 19:42 catrope@tin: Synchronized php-1.31.0-wmf.8/resources/src/mediawiki.rcfilters/mw.rcfilters.UriProcessor.js: T181100 (duration: 00m 45s)
  • 19:41 demon@tin: Synchronized wmf-config/InitialiseSettings.php: babel officewiki thing (duration: 00m 45s)
  • 19:39 catrope@tin: Synchronized php-1.31.0-wmf.8/includes/specials/SpecialRecentchangeslinked.php: T181100 (duration: 00m 45s)
  • 19:34 otto@tin: Finished deploy [eventlogging/eventbus@3df06ab]: Temp deploy to kafka1001 only to catch bug: T180017 (duration: 00m 12s)
  • 19:34 otto@tin: Started deploy [eventlogging/eventbus@3df06ab]: Temp deploy to kafka1001 only to catch bug: T180017
  • 19:21 catrope@tin: Synchronized wmf-config/Wikibase.php: T176903 (duration: 00m 45s)
  • 19:17 AaronSchulz: Removed more bogus md5 wanobjectcache/ metrics from graphite[12]001
  • 19:11 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T179241 (duration: 00m 45s)
  • 19:02 volans: restarted ircecho
  • 18:51 ejegg: updated payments-wiki from 6b3019b to fea6b37
  • 18:50 dcausse: elastic/cirrus: reindexing english group2 wikis: T179945
  • 18:45 volans: stopped ircecho temporarily to avoid spam
  • 18:14 XioNoX: pushing v6 gre firewall filter
  • 18:12 gehel@tin: Finished deploy [wdqs/wdqs@7ad20f3]: (no justification provided) (duration: 02m 10s)
  • 18:11 gehel: deploying blazegraph + GUI updates
  • 18:10 gehel@tin: Started deploy [wdqs/wdqs@7ad20f3]: (no justification provided)
  • 18:08 hoo: Killed truthy nt dumpers stuck at 100% CPU (from last week) on snapshot1007 (T181385)
  • 18:06 akosiaris: poweroff ganeti1005, ganeti1006 T181121
  • 18:05 bd808: Sending Toolforge survey 1 week reminder emails from silver for T177126
  • 16:07 bblack: lvs400x - puppet disabled for https://gerrit.wikimedia.org/r/#/c/393610
  • 15:52 herron: beginning canary cutover/deployment of codfw prometheus servers to codfw puppet 4 puppetmasters
  • 15:19 chasemp: disable puppet across cloud things for cleanup
  • 15:11 herron: beginning canary cutover/deployment of codfw elasticsearch servers to codfw puppet 4 puppetmasters
  • 14:22 zeljkof: EU SWAT finished
  • 14:19 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: IP cap lift for Semaine contributive 2017-2018 (T181360) (duration: 00m 45s)
  • 14:10 addshore@tin: Synchronized php-1.31.0-wmf.8/extensions/AdvancedSearch: SWAT AdvancedSearch T181175 T181222, adjust links in beta section & fix usability of mobile search (duration: 00m 46s)
  • 13:31 moritzm: installing nspr security updates on trusty
  • 13:22 elukey: remove eventlogging replication support (log database) from dbstore1002 - T156844
  • 13:17 moritzm: updating nginx on francium
  • 13:01 Pchelolo: stop cpjobqueue in eqiad for backlog accumulation
  • 12:44 Pchelolo: started parsoid linter script to generate load on cpjobqueue (python3 parsoid_reparse.py http://parsoid.discovery.wmnet:8000 --sitematrix --linter-only --skip-closed https://commons.wikimedia.org/w/api.php)
  • 12:44 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: gerrit:393587 BETA wmgMonologChannels FileImporter => debug (duration: 01m 01s)
  • 12:43 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1276.eqiad.wmnet
  • 12:35 moritzm: powercycling mw1276
  • 12:21 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: gerrit:393582 LABS ONLY Actually call wfLoadExtension for FileExporter & Importer on beta BETA (T181383) (duration: 00m 44s)
  • 12:14 marostegui: Stop replication on db1109 to test table rename for s5/s8 failover
  • 12:09 ppchelko@tin: Finished deploy [cpjobqueue/deploy@47d27dc]: Enable keep-alive T181007 (duration: 01m 03s)
  • 12:08 ppchelko@tin: Started deploy [cpjobqueue/deploy@47d27dc]: Enable keep-alive T181007
  • 11:47 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: gerrit:393577 LABS ONLY Enable FileImporter & FileExporter on BETA PT2/2 (T181383) (duration: 00m 45s)
  • 11:46 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: gerrit:393577 LABS ONLY Enable FileImporter & FileExporter on BETA PT1/2 (T181383) (duration: 00m 45s)
  • 11:44 addshore@tin: Synchronized wmf-config/extension-list-labs: gerrit:393576 BETA ONLY Add FileExporter & FileImporter to extension-list-labs (duration: 00m 45s)
  • 11:27 hashar: contint1001 enable Icinga "Disks space" notification again. It is no longer complaining about Docker partitions | ping mutante | T178454
  • 11:17 godog: bootstrap restbase1007-a - T179422
  • 11:05 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 45s)
  • 11:05 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 10:53 moritzm: installing postgresql-common security updates
  • 10:08 ppchelko@tin: Finished deploy [cpjobqueue/deploy@e35aa05]: Revert using keep-alive (duration: 00m 22s)
  • 10:08 ppchelko@tin: Started deploy [cpjobqueue/deploy@e35aa05]: Revert using keep-alive
  • 09:42 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b570d4e]: Make http agent use keep-alive (duration: 00m 48s)
  • 09:41 ppchelko@tin: Started deploy [cpjobqueue/deploy@b570d4e]: Make http agent use keep-alive
  • 09:14 godog: reimage restbase1007 - T179422
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1097:3314 - T178359 (duration: 00m 43s)
  • 08:40 moritzm: installing openjdk security updates on hadoop, druid and kafka clusters
  • 08:27 marostegui: Deploy schema change on dbstore1002 and dbstore1001 - T174569
  • 08:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1097:3314 weight - T178359 (duration: 00m 45s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1097:3314 weight - T178359 (duration: 00m 45s)
  • 07:10 marostegui: Stop MySQL on db1021 as it will be decommissioned - T181378
  • 06:53 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1021 from the config as it will be decommissioned - T181378 (duration: 00m 44s)
  • 06:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1021 from the config as it will be decommissioned - T181378 (duration: 00m 45s)
  • 06:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1097:3314 with low weight - T178359 (duration: 00m 46s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.8) (duration: 05m 44s)

2017-11-25

  • 19:31 marostegui: Set 32:3 disk to offline on db1051
  • 16:09 kartik@tin: Finished deploy [cxserver/deploy@11aecc9]: Update cxserver to 0c242c0, Pin service-runner to 2.4.2 (duration: 03m 29s)
  • 16:05 kartik@tin: Started deploy [cxserver/deploy@11aecc9]: Update cxserver to 0c242c0, Pin service-runner to 2.4.2
  • 16:05 godog: unban statsd traffic from scb on graphite1001 - T181333
  • 15:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@e35aa05]: Rollback. Disable GC metric reporting T181333 (duration: 00m 31s)
  • 15:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@e35aa05]: Rollback. Disable GC metric reporting T181333
  • 15:37 volans: restarted statsd-proxy on graphite1001 (died during investigation) T181333
  • 14:34 godog: rolling restart of cxserver to alleviate metrics leak - T181333
  • 14:26 godog: restart cxserver on scb100[34] - T181333
  • 14:10 godog: roll-restart cpjobqueue to alleviate metrics leak - T181333
  • 13:40 godog: drop incoming statsd from scb to graphite1001 temporarily - T181333
  • 08:32 ariel@tin: Finished deploy [dumps/dumps@ec21673]: fix abstracts recombine job (duration: 00m 02s)
  • 08:32 ariel@tin: Started deploy [dumps/dumps@ec21673]: fix abstracts recombine job

2017-11-24

  • 13:52 moritzm: removing git packages from jessie-wikimedia/experimental (replaced by component/git)
  • 13:24 moritzm: installing openjpeg2 updates (original security update already got installed after initial release, but there was a binNMU for amd64)
  • 13:17 marostegui: Stop replication on db1097 to reimport and recompress commonswiki.watchlist
  • 12:54 jynus: reenabling puppet on db1071
  • 12:50 jynus: resetting replication on es1011 for consistency with other replica sets
  • 12:40 jynus: setting up s8 topology on eqiad
  • 12:38 jynus: disable puppet on db1071 and stop local s5 heartbeat there
  • 12:32 reedy@tin: Synchronized docroot/mediawiki/keys/: Fixup keys (duration: 00m 45s)
  • 12:13 marostegui: Enable GTID on es2018 - T181293
  • 11:57 marostegui: Disable puppet on es2018 - T181293
  • 11:50 jynus@tin: Synchronized wmf-config/db-codfw.php: depool es2018 T181293 (duration: 00m 45s)
  • 11:48 marostegui: Reboot es2018 after full-upgrade - T181293
  • 11:25 marostegui: Restart mysql on es2018
  • 11:25 jynus@tin: Synchronized wmf-config/db-eqiad.php: db2085:3318, db2086:3318 (duration: 00m 43s)
  • 11:24 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2038, db2085:3318, db2086:3318 (duration: 00m 45s)
  • 10:55 marostegui: Restart MySQL on db2086 to move s5 to s8
  • 10:37 jynus: cancelling db2085 restart, only doing mysql:s5
  • 10:35 jynus: restarting db2085 (including both s5 and s3 instances)
  • 10:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool all future s8 slaves for a topology change - T177208 (duration: 00m 45s)
  • 10:22 moritzm: installing ca-certificates updates on trusty hosts
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1097:3315 and db1092 (duration: 00m 45s)
  • 09:50 jynus: restarting db2045
  • 09:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101:3318 db1097:3315 and db1092 (duration: 00m 45s)
  • 08:49 marostegui: Stop MySQL on db1092
  • 08:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101:3318 in s5 to warm it up and depool db1092 - T178359 T177208 (duration: 00m 45s)
  • 08:40 moritzm: installing java security updates on notebook* hosts
  • 08:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1097:3315 - T178359 (duration: 00m 45s)
  • 08:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly pool db1097:3315 - T178359 (duration: 00m 45s)
  • 08:20 moritzm: installing java security updates on meitnerium
  • 08:15 moritzm: installing java security updates on stat1004
  • 08:14 hashar: restarting jenkins on contint1001 for a java update
  • 08:07 elukey: re-enabling piwik on bohrium (only VM running on ganeti1006 atm) after mysql tables restore completed
  • 06:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1101:3318 in s5 to warm it up - T178359 (duration: 00m 45s)
  • 06:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly pool db1101:3318 in s5 to warm it up - T178359 (duration: 00m 49s)
  • 00:12 reedy@tin: Synchronized docroot/mediawiki/keys/: Add Brian Wolff's key (duration: 00m 45s)

2017-11-23

  • 23:42 reedy@tin: Synchronized wmf-config/: Simplify variable assignment (duration: 00m 47s)
  • 23:25 reedy@tin: Synchronized composer.lock: (no justification provided) (duration: 00m 44s)
  • 23:24 reedy@tin: Synchronized composer.json: (no justification provided) (duration: 00m 45s)
  • 23:23 reedy@tin: Synchronized phpcs.xml: (no justification provided) (duration: 00m 45s)
  • 22:39 demon@tin: Synchronized wmf-config/: style fixes, no-op (duration: 00m 47s)
  • 22:19 demon@tin: Synchronized wmf-config/CommonSettings.php: no-op, moving stuff around (duration: 00m 47s)
  • 21:34 ariel@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2251.codfw.wmnet
  • 21:15 apergos: rebooted mw2251 after unresponsive on mgmt console and no ping
  • 20:49 demon@tin: Synchronized static/images/project-logos/nowikimedia.png: logo update (duration: 02m 51s)
  • 17:29 akosiaris: set ganeti1006 as drained. ganeti1005 was already set. That will prevent scheduling VMs on those. T181121
  • 17:01 akosiaris: force powercycle on ganeti1006 T181121
  • 16:30 akosiaris: gnt-node failover ganeti1006
  • 15:50 jynus: disable puppet on db2023, db2038 for deployment while transfer is ongoing
  • 15:23 moritzm: rebooting mwdebug* servers for update to 4.9.51
  • 14:10 marostegui: Stop Mysql on db1101.s5 to clone db1097 - T178359
  • 14:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1053 in s2 as vslow to replace db1021 - T134476 (duration: 00m 45s)
  • 13:13 jynus: shutting down db2023 and db2038 for cloning
  • 12:47 reedy@tin: Synchronized wmf-config/CommonSettings.php: GWToolset migratory config T87928 (duration: 00m 46s)
  • 12:40 moritzm: migrating instances off ganeti2001 / kernel reboot to 4.9.51
  • 12:10 moritzm: migrating instances off ganeti2002 / kernel reboot to 4.9.51
  • 11:52 moritzm: migrating instances off ganeti2003 / kernel reboot to 4.9.51
  • 11:37 kartik@tin: Finished deploy [cxserver/deploy@51b78ce]: T181209 Update cxserver to e8fe3f0 to fix Youdao MT (duration: 03m 01s)
  • 11:35 moritzm: migrating instances off ganeti2004 / kernel reboot to 4.9.51
  • 11:35 moritzm: migrating instances off ganeti200 / kernel reboot to 4.9.51
  • 11:34 kartik@tin: Started deploy [cxserver/deploy@51b78ce]: T181209 Update cxserver to e8fe3f0 to fix Youdao MT
  • 11:23 moritzm: migrating instances off ganeti2005 / kernel reboot to 4.9.51
  • 11:15 moritzm: migrating instances off ganeti2006 / kernel reboot to 4.9.51
  • 11:06 moritzm: migrating instances off ganeti2007 / kernel reboot to 4.9.51
  • 10:57 moritzm: migrating instances off ganeti2008 / kernel reboot to 4.9.51
  • 10:07 moritzm: rebooting sarin for update to 4.9.51
  • 09:16 jynus@tin: Synchronized wmf-config/db-codfw.php: mariadb: Switchover s5 codfw master (duration: 00m 45s)
  • 08:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 (duration: 00m 44s)
  • 08:44 jynus: stopping and restarting db2052
  • 08:31 moritzm: installing bdb security updates on trusty
  • 08:29 jynus: starting switchover of db2023 to db2052
  • 08:21 marostegui: Stop MySQL on db1072 for MySQL upgrade and kernel upgrade
  • 08:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 00m 45s)
  • 07:28 marostegui: Stop MySQL on db1021 and db1053 to clone db1053
  • 07:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1021 - T134476 (duration: 00m 45s)
  • 07:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1053 to s2 to replace db1021 as vslow, dump slave - T134476 (duration: 00m 45s)
  • 06:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T180045 (duration: 00m 45s)
  • 06:55 marostegui: Drop index on ores_classification on s1 - T180045
  • 06:53 marostegui: Drop index on ores_classification on s2 - T180045
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T180045 (duration: 00m 45s)
  • 06:44 legoktm: legoktm@tin:~$ echo "https://www.mediawiki.org/keys/keys.html" | mwscript purgeList.php
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1051 and db1063 in vslow service for s5 to warm them up for the s8 split - T177208 (duration: 00m 46s)
  • 05:58 kartik@tin: Finished deploy [cxserver/deploy@5b35ed5]: Update cxserver to e8fe3f0 (duration: 04m 13s)
  • 05:54 kartik@tin: Started deploy [cxserver/deploy@5b35ed5]: Update cxserver to e8fe3f0
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.8) (duration: 05m 27s)
  • 01:36 demon@tin: Synchronized wmf-config/InitialiseSettings.php: dropping education program from cswiki (duration: 00m 45s)
  • 01:29 demon@tin: Synchronized wmf-config/CommonSettings.php: Set $wgCentralAuthGlobalBlockInterwikiPrefix (duration: 00m 45s)
  • 01:26 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: no-op (duration: 00m 44s)
  • 01:24 demon@tin: Synchronized wmf-config/CommonSettings.php: abusefilter stuff (duration: 00m 45s)
  • 01:21 demon@tin: Synchronized wmf-config/InitialiseSettings.php: abusefilter stuff (duration: 00m 45s)
  • 01:19 demon@tin: Synchronized wmf-config/CommonSettings.php: abusefilter stuff (duration: 00m 45s)
  • 01:05 demon@tin: Synchronized docroot/mediawiki/keys/: update formatting, docs, etc (duration: 00m 46s)

2017-11-22

  • 21:15 mutante: running puppet on logstash hosts to apply config change 392897 (add log4j filter)
  • 21:11 demon@tin: Synchronized docroot/noc/conf/index.php: favicon fix (duration: 00m 46s)
  • 20:35 demon@tin: Synchronized dblists/: now with more alphabeticalizedness (duration: 00m 45s)
  • 20:25 demon@tin: Synchronized dblists/Makefile: no-op (duration: 00m 45s)
  • 20:23 demon@tin: Synchronized tests/noc-conf/NOCDblistTest.php: no-op (duration: 00m 45s)
  • 20:19 demon@tin: Synchronized wmf-config/LabsServices.php: no-op (duration: 00m 45s)
  • 20:14 demon@tin: Synchronized wmf-config/InitialiseSettings.php: adding national library of Israel to copy domains (duration: 00m 45s)
  • 20:13 otto@tin: Finished deploy [eventlogging/analytics@57234e7]: no-op: removing now unneeded code that might accidentally serialize userAgent to json string: T179625 (duration: 00m 04s)
  • 20:13 otto@tin: Started deploy [eventlogging/analytics@57234e7]: no-op: removing now unneeded code that might accidentally serialize userAgent to json string: T179625
  • 20:09 demon@tin: Synchronized robots.txt: block a nasty bot 💔 (duration: 00m 44s)
  • 20:02 demon@tin: Synchronized multiversion/bin/expanddblist: fix param warning (duration: 00m 45s)
  • 20:00 mepps: updated civicrm from a16e566 to a1022cf
  • 19:58 demon@tin: Synchronized wmf-config/InitialiseSettings.php: comments (duration: 00m 45s)
  • 19:56 demon@tin: Synchronized multiversion/MWScript.php: assume aawiki for purgeUrls (duration: 00m 45s)
  • 19:51 demon@tin: Synchronized docroot/: removing old foundation docroot (duration: 00m 46s)
  • 19:49 urandom: starting cassandra cleanups, restbase-200{1,3,5}-a - T179422
  • 19:41 demon@tin: Synchronized w/extract2.php: removing old portal support (duration: 00m 45s)
  • 18:47 demon@tin: Synchronized dblists/closed.dblist: closed transitionteamwiki (duration: 00m 45s)
  • 18:34 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: no-op (duration: 00m 45s)
  • 18:32 demon@tin: Synchronized scap/plugins/updatewikiversions.py: minor fix (duration: 00m 45s)
  • 18:30 demon@tin: Pruned MediaWiki: 1.31.0-wmf.7 [keeping static files] (duration: 01m 46s)
  • 18:12 herron: re-enabling puppet agents after puppetdb postgres security updates
  • 18:09 moritzm: installing postgres security updates on nitrogen/puppetdb
  • 18:05 moritzm: installing postgres security updates on nihal/puppetdb
  • 18:02 herron: disabling puppet agents for puppetdb postgres security update
  • 17:30 demon@tin: Synchronized php-1.31.0-wmf.8/extensions/AdvancedSearch/: fixing layout issues in timeless (duration: 00m 46s)
  • 16:54 mepps: updated payments-wiki from 1ca91b1 to 6b3019b
  • 16:45 chasemp: disable puppet across labtest things
  • 16:34 marostegui: Compress s4 on db1097 - T178359
  • 16:28 herron: starting canary deploy/cutover of codfw scb hosts to codfw puppet 4 masters
  • 16:16 elukey: restart druid broker,coordinator,historical daemons on druid100[123] to pick up new logging settings
  • 15:41 jynus: starting manually pt-heartbeat for s8 on db1071
  • 15:22 herron: beginning cut over of codfw db servers (^db2.*) to codfw puppet 4 masters
  • 14:49 jynus@tin: Synchronized wmf-config/db-codfw.php: mariadb: Setup s8 replica set on codfw (duration: 00m 45s)
  • 14:27 moritzm: installing libxml-libxml-perl security updates
  • 14:21 jynus: starting database topology changes for s8 on codfw T177208
  • 14:11 urandom: bootstrapping cassandra, restbase2004-c.codfw.wmnet - T179422
  • 13:43 apergos: one more round of labstore1006 <-- ms1001 rsync catchup
  • 13:37 moritzm: installing imagemagick security updates
  • 12:39 marostegui: Stop MySQL on db1053 to clone db1097.s4 - T178359
  • 12:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 - T178359 (duration: 00m 45s)
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Start adapting the config to move db1097 to s4 and s5 as multi-instance rc slave T178359 (duration: 00m 45s)
  • 12:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1101.s7 - T178359 (duration: 00m 45s)
  • 11:51 jynus: starting dropping incorrectly created database on s7 amwikimedia (not to be confused with production wiki s3 amwikimedia)
  • 11:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101.s7 - T178359 (duration: 00m 45s)
  • 11:19 akosiaris: gnt-node evacuate -s -f ganeti1005. T181121
  • 11:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101.s7 - T178359 (duration: 00m 45s)
  • 10:54 akosiaris: gnt-node migrate -f ganeti1005. T181121
  • 10:51 marostegui: Drop index from ores_classification on s5 - T180045
  • 10:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1051 and db1063 in vslow service for s5 to warm them up for the s8 split - T177208 (duration: 00m 45s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1101 to s5 and s7 as recentchanges multi-instance slave - T178359 (duration: 00m 45s)
  • 10:04 moritzm: running "scap pull" on mw1191, it's depooled and marked as "inactive", but health checks are triggering db errors
  • 09:35 bblack: cr[12]-ulsfo - switch static fallback LVS routes from lvs400[12] to lvs400[56]
  • 09:27 bblack: lvs@ulsfo - done switching primaries (host MED config) - lvs400[56] now primary for text/upload traffic
  • 09:11 akosiaris@tin: Finished deploy [parsoid/deploy@b150764]: T180211 (duration: 05m 05s)
  • 09:08 bblack: puppet disabled on lvs400[1256] for switching primaries
  • 09:06 akosiaris@tin: Started deploy [parsoid/deploy@b150764]: T180211
  • 09:04 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wtp2017.codfw.wmnet
  • 09:00 bblack: lvs4005 - reboot to clear experimental stuff
  • 08:16 bblack: backend restart on cp4024 (upload@ulsfo) - mailbox lag
  • 07:56 marostegui: Drop index from ores_classification on s3 - T180045
  • 07:50 marostegui: Drop index from ores_classification on s6 - T180045
  • 07:48 marostegui: Drop index from ores_classification on s7 - T180045
  • 07:29 _joe_: stopping the additional workers for htmlCacheUpdate (commons and ruwiki), adding one additional runner for refreshLinks on ruwiki
  • 06:43 marostegui: Stop MySQL on db1063 and db1051 (which is going to be recloned) - T177208
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1051 from s1 to s5 - T177208 (duration: 00m 53s)
  • 06:28 ejegg: updated payments-wiki from f871160 to 1ca91b1
  • 03:36 mutante: powercycled mw2251 which had gone down without further comment
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.8) (duration: 11m 34s)
  • 01:09 hoo: Cleaned out remaining T180934 related log blow up on snapshot1007 (dumpwikidatajson-wikidata-20171120-all-0.log)
  • 00:43 mutante: gerrit restarting service to apply config change
  • 00:40 mutante: gerrit - re-enabling puppet to apply logstash change on cobalt, gerrit restart incoming (T141324)
  • 00:39 maxsem@tin: Synchronized php-1.31.0-wmf.8/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/392754/ (duration: 00m 47s)
  • 00:11 maxsem@tin: Synchronized php-1.31.0-wmf.8/extensions/InputBox/: https://gerrit.wikimedia.org/r/#/c/392745/ (duration: 00m 45s)
  • 00:10 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/392576/ (duration: 00m 46s)
  • 00:07 bblack@neodymium: conftool action : set/pooled=yes; selector: name=nescio.wikimedia.org
  • 00:04 bblack@neodymium: conftool action : set/pooled=no; selector: name=nescio.wikimedia.org
  • 00:02 bblack@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org
  • 00:01 bblack@neodymium: conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org
  • 00:01 bblack@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org
  • 00:00 bblack@neodymium: conftool action : set/pooled=yes; selector: name=chromium.wikimedia.org

2017-11-21

  • 23:55 bblack@neodymium: conftool action : set/pooled=no; selector: name=chromium.wikimedia.org
  • 23:30 mholloway-shell@tin: Finished deploy [mobileapps/deploy@fc01242]: Update mobileapps to 52d6a83 (duration: 04m 36s)
  • 23:26 mholloway-shell@tin: Started deploy [mobileapps/deploy@fc01242]: Update mobileapps to 52d6a83
  • 22:45 bblack@neodymium: conftool action : set/pooled=yes; selector: name=achernar.wikimedia.org
  • 22:38 bblack@neodymium: conftool action : set/pooled=no; selector: name=achernar.wikimedia.org
  • 22:36 bblack@neodymium: conftool action : set/pooled=yes; selector: name=hydrogen.wikimedia.org
  • 22:29 urandom: Bootstrapping Cassandra, restbase2004-b.codfw.wmnet (T179422)
  • 22:12 mutante: gerrit - temp disable puppet on cobalt (prod gerrit), test switching gerrit logging to logstash on gerrit2001 - gerrit:392079 gerrit:392083 T141324
  • 22:05 bblack@neodymium: conftool action : set/pooled=no; selector: name=hydrogen.wikimedia.org
  • 22:03 bblack@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org
  • 22:02 bblack: repooling acamar for recdns
  • 21:26 ariel@tin: Finished deploy [dumps/dumps@16f92d6]: take 2: gzip namespace and abstract dumps; remove last configfile existence checks (duration: 00m 02s)
  • 21:26 ariel@tin: Started deploy [dumps/dumps@16f92d6]: take 2: gzip namespace and abstract dumps; remove last configfile existence checks
  • 20:45 bblack: recdns: puppet disabled on all, acamar depooled, careful deploys going on for anycast+recdns stuff
  • 20:42 ayounsi@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org
  • 20:26 ariel@tin: Finished deploy [dumps/dumps@16f92d6]: gzip namespace and abstract dumps; remove last configfile existence checks (duration: 00m 16s)
  • 20:26 ariel@tin: Started deploy [dumps/dumps@16f92d6]: gzip namespace and abstract dumps; remove last configfile existence checks
  • 20:22 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to 1.31.0-wmf.8
  • 20:00 thcipriani: finishing wmf.8 rollout, starting group2 to wmf.8
  • 19:21 addshore: SWAT done!
  • 19:20 addshore@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add images.collection.cooperhewitt.org & *.dimu.org to wgCopyUploadsDomains. T180791 T180241 (duration: 00m 48s)
  • 19:11 addshore@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Enable AdvancedSearch on group0 PT2/2 (duration: 00m 49s)
  • 19:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AdvancedSearch on group0 PT1/2 (duration: 00m 50s)
  • 18:40 mholloway-shell@tin: Finished deploy [mobileapps/deploy@dd41387]: Update mobileapps to 9d1602d (duration: 05m 09s)
  • 18:35 mholloway-shell@tin: Started deploy [mobileapps/deploy@dd41387]: Update mobileapps to 9d1602d
  • 18:28 smalyshev@tin: Finished deploy [wdqs/wdqs@c69c739]: Restore categories vocabulary to V003 (duration: 01m 55s)
  • 18:26 smalyshev@tin: Started deploy [wdqs/wdqs@c69c739]: Restore categories vocabulary to V003
  • 18:25 smalyshev@tin: Finished deploy [wdqs/wdqs@7d951d2]: Restore categories vocabulary to V003 (duration: 00m 11s)
  • 18:25 smalyshev@tin: Started deploy [wdqs/wdqs@7d951d2]: Restore categories vocabulary to V003
  • 16:45 marostegui: Compress s3 on db2085 - T178359
  • 16:35 papaul: powering down wtp2017 for disk replacement
  • 16:19 papaul: updating firmware on db2068
  • 15:31 _joe_: rolling restart of pybal on low-traffic in codfw, eqiad for the new depool thresholds for MW
  • 15:00 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3e948de]: Revert: Temporarily disable deduplication (duration: 00m 26s)
  • 15:00 ppchelko@tin: Started deploy [cpjobqueue/deploy@3e948de]: Revert: Temporarily disable deduplication
  • 14:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1096 (duration: 00m 48s)
  • 14:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aac3201]: Temporarily disable deduplication (duration: 00m 29s)
  • 14:39 ppchelko@tin: Started deploy [cpjobqueue/deploy@aac3201]: Temporarily disable deduplication
  • 14:29 zeljkof: EU SWAT finished
  • 14:25 bblack: cr[12]-ulsfo: allow lvs400[567] as PyBal neighbors for BGP
  • 14:18 zfilipin@tin: Synchronized php-1.31.0-wmf.8/extensions/TimedMediaHandler/MwEmbedModules/EmbedPlayer/resources/mw.EmbedPlayerOgvJs.js: SWAT: Disable wasm, use asm.js codec modules for Safari/Edge (T181022) (duration: 00m 49s)
  • 14:12 jdrewniak@tin: Synchronized portals: SWAT: Bumping portals to master (T128546) (duration: 00m 49s)
  • 14:11 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T128546) (duration: 00m 50s)
  • 14:11 ppchelko@tin: Started restart [cpjobqueue/deploy@5341d94]: Restart to try increased maxSockets
  • 14:11 Pchelolo: restart cpjobqueue to try increasing maxSockets
  • 14:08 bblack: not-abnormally-quick reboot on lvs4005
  • 14:08 mobrovac@tin: Started restart [electron-render/deploy@8dd5f13]: electron stuck - T174916
  • 13:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5341d94]: Enable GC metrics reporting (duration: 00m 36s)
  • 13:44 ppchelko@tin: Started deploy [cpjobqueue/deploy@5341d94]: Enable GC metrics reporting
  • 13:39 _joe_: starting 2 manual runners for htmlcacheupdate on commons, 1 for htmlcacheupdate and 1 for refreshlinks on ruwiki, on terbium
  • 13:39 bblack: quick reboot on lvs4005
  • 12:02 kartik@tin: Finished deploy [cxserver/deploy@b87a27a]: Update cxserver to 4301987 (duration: 03m 24s)
  • 11:58 kartik@tin: Started deploy [cxserver/deploy@b87a27a]: Update cxserver to 4301987
  • 11:04 akosiaris: reboot ununpentium for serial tty change
  • 11:03 akosiaris: reboot oresrdb2001 for serial tty change
  • 11:02 akosiaris: reboot netmon1003 for serial tty change
  • 11:01 akosiaris: reboot install2002 for serial tty change
  • 10:57 akosiaris: reboot install1002 for serial tty change
  • 10:35 godog: bootstrap cassandra restbase2004-a - T179422
  • 10:11 ppchelko@tin: Finished deploy [cpjobqueue/deploy@e35aa05]: Set consumer_batch_size to 10 T181007 (duration: 00m 31s)
  • 10:10 ppchelko@tin: Started deploy [cpjobqueue/deploy@e35aa05]: Set consumer_batch_size to 10 T181007
  • 09:39 elukey: upload prometheus-druid-exporter 0.4 to jessie/stretch-wikimedia
  • 09:28 marostegui: Shutdown db2068 for maintenance - T180927
  • 09:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1096 (duration: 00m 49s)
  • 09:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1096 (duration: 00m 49s)
  • 08:39 marostegui: Drop index on db1089 enwiki.ores_classification - T180045
  • 08:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096 with low weight - T178359 (duration: 00m 48s)
  • 08:27 marostegui: Reboot db1096 for kernel and MariaDB upgrade
  • 08:19 marostegui: Compress s5 on db1101 - T178359
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1109 and db1110 - T180700 (duration: 00m 48s)
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1109 and db1110 - T180700 (duration: 00m 49s)
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1109 and db1110 in s5 with small weight - T180700 (duration: 00m 48s)
  • 06:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T179106 (duration: 00m 48s)
  • 06:47 marostegui: Remove index wb_terms_language from db1087 - https://phabricator.wikimedia.org/T179106
  • 06:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T179106 (duration: 00m 48s)
  • 06:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096 - T178359 (duration: 00m 48s)
  • 06:27 marostegui: Stop MySQL on db1096 to clone db1101.s5 - T178359
  • 06:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1082 original weight - T177208 (duration: 00m 49s)
  • 02:58 mutante: phabricator back up
  • 02:55 mutante: phab1001 (phabricator prod) reboot for kernel upgrade
  • 02:48 krinkle@tin: Synchronized docroot/noc: clean up (duration: 00m 49s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 05m 42s)
  • 02:03 smalyshev@tin: Started deploy [wdqs/wdqs@7d951d2]: Restore categories vocabulary to V003
  • 00:19 catrope@tin: Synchronized php-1.31.0-wmf.8/resources/src/mediawiki.rcfilters/ui/mw.rcfilters.ui.ItemMenuOptionWidget.js: T180863 (duration: 00m 49s)

2017-11-20

  • 23:46 mutante: phab2001 - reboot for kernel upgrade
  • 23:35 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: emergency disable ORES on frwp/ruwp T181006 (duration: 00m 49s)
  • 23:35 awight: purge cache keys for ORES thresholds on frwiki and ruwiki
  • 23:25 awight@tin: Started deploy [ores/deploy@82a13ae]: Rollback ORES (take 3); 181006
  • 23:19 awight: aborted ORES rollback
  • 23:18 awight@tin: Started deploy [ores/deploy@95cd523]: Rollback ORES (take 2); 181006
  • 23:14 smalyshev@tin: Finished deploy [wdqs/wdqs@7d951d2]: Rollback categories vocabulary version due to a bug (duration: 08m 11s)
  • 23:11 awight: purged memcache key 'ruwiki:ORES:threshold_statistics:goodfaith:1', T181006
  • 23:06 smalyshev@tin: Started deploy [wdqs/wdqs@7d951d2]: Rollback categories vocabulary version due to a bug
  • 22:55 awight@tin: Finished deploy [ores/deploy@5084251]: Rollback ORES; T179711 (duration: 01m 05s)
  • 22:55 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: no wmf.8 for group2. i hate my life
  • 22:55 awight: rolling back ORES to fix T181006
  • 22:54 eileen: update civicrm from 272580a to a16e566
  • 22:54 awight@tin: Started deploy [ores/deploy@5084251]: Rollback ORES; T179711
  • 22:50 Krinkle: MW HTTP 500 spike tracked as https://phabricator.wikimedia.org/T181006
  • 22:49 Krinkle: Sharp rise in HTTP 500 errors as of 22:05 (45 minutes ago)
  • 22:29 aaron@tin: Synchronized php-1.31.0-wmf.7/includes/libs/objectcache/WANObjectCache.php: fix cache key namespace (duration: 00m 49s)
  • 22:27 awight@tin: Finished deploy [ores/deploy@5084251]: Updating ORES to revscoring 2.0.10, T179711 (duration: 49m 54s)
  • 22:26 aaron@tin: Synchronized php-1.31.0-wmf.7/extensions/Collection: cache key name fix (duration: 00m 49s)
  • 22:11 aaron@tin: Synchronized php-1.31.0-wmf.8/includes/libs/objectcache/WANObjectCache.php: 7e74b49: namespace WAN cache variant keys (duration: 00m 48s)
  • 22:04 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.8
  • 21:52 bblack: ns2 back on eeden
  • 21:52 aaron@tin: Synchronized php-1.31.0-wmf.8/extensions/Collection: 3baebf4a: cache key name fix (duration: 00m 51s)
  • 21:44 bblack: rebooting eeden
  • 21:37 awight@tin: Started deploy [ores/deploy@5084251]: Updating ORES to revscoring 2.0.10, T179711
  • 21:33 bblack: routing ns2.wikimedia.org to radon for eeden reboot
  • 21:19 bblack: routing ns0.wikimedia.org back to radon post-reboot
  • 21:13 bblack: rebooting radon
  • 21:07 bblack: re-routing ns0.wikimedia.org traffic to baham for radon reboot
  • 20:17 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Remove overlapping userrights (T101983) (duration: 19m 41s)
  • 20:16 bblack: ns1 dns traffic back to normal on baham
  • 20:16 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Revert "Enable Translate extension in amwikimedia (T180879)" (duration: 00m 48s)
  • 20:11 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Translate extension in amwikimedia (T180879) (duration: 00m 49s)
  • 20:11 hashar: CI docker jobs were all broken due to a mistake. Should be back now. T177684
  • 20:07 ladsgroup@tin: Synchronized php-1.31.0-wmf.8/resources/src/mediawiki.rcfilters/ui/mw.rcfilters.ui.ItemMenuOptionWidget.js: RCFilters: Only apply excluded label to namespace items (T180863) (duration: 00m 49s)
  • 20:06 bblack: rebooting baham
  • 19:58 bblack: re-routing ns1.wikimedia.org traffic to radon for baham reboot
  • 19:52 chasemp: updating phab mail handler
  • 19:51 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Remove overlapping userrights (T101983) (duration: 00m 49s)
  • 19:47 ottomata: restarted coal with fixes for eventcapsule changes in T179625
  • 19:42 ladsgroup@tin: Synchronized wmf-config/throttle.php: Adjust throttle.php for dewiki workshop (T180046) (duration: 00m 49s)
  • 19:40 ladsgroup@tin: Synchronized wmf-config/throttle.php: Adjust throttle.php for dewiki workshop (T180046) (duration: 00m 48s)
  • 19:36 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Allow admins to remove users from MP3 uploaders user group (T180002) (duration: 00m 49s)
  • 19:31 ladsgroup@tin: Synchronized portals: SWAT: Bumping portals to master (T128546) (duration: 00m 51s)
  • 19:30 ladsgroup@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master (T128546) (duration: 00m 49s)
  • 19:22 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Switch submit button from 'save' to 'publish' on dewiki (duration: 00m 50s)
  • 18:51 chasemp: disable puppet across cloud things to rollout https://gerrit.wikimedia.org/r/#/c/392168/ slowly
  • 18:25 smalyshev@tin: Finished deploy [wdqs/wdqs@2e39b69]: Blazegraph and GUI update (duration: 01m 42s)
  • 18:23 smalyshev@tin: Started deploy [wdqs/wdqs@2e39b69]: Blazegraph and GUI update
  • 17:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 with low weight - T177208 (duration: 00m 49s)
  • 17:19 bd808: Sending Toolforge survey emails from silver for T177126
  • 17:00 moritzm: uploaded git 2.11-3+deb9u2+bpo8+wmf1 for component/git to apt.wikimedia.org/jessie-wikimedia
  • 16:43 marostegui: Reboot db1082 for kernel upgrade and MariaDB upgrade to 10.0.33
  • 16:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@174420f]: Revert: Temporary set consumer_batch_size to 50, forgot -f, checkout prev rev (duration: 03m 33s)
  • 16:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1109 and db1110 to the config depooled - T180700 (duration: 00m 48s)
  • 16:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1109 and db1110 to the config depooled - T180700 (duration: 00m 48s)
  • 16:36 ppchelko@tin: Started deploy [cpjobqueue/deploy@174420f]: Revert: Temporary set consumer_batch_size to 50, forgot -f, checkout prev rev
  • 16:32 ppchelko@tin: Finished deploy [cpjobqueue/deploy@f0610d3]: Revert: Temporary set consumer_batch_size to 50, forgot -f (duration: 00m 28s)
  • 16:32 ppchelko@tin: Started deploy [cpjobqueue/deploy@f0610d3]: Revert: Temporary set consumer_batch_size to 50, forgot -f
  • 16:24 marostegui: Add db1109 and db1110 to tendril - T180700
  • 16:06 ppchelko@tin: Finished deploy [cpjobqueue/deploy@f0610d3]: Revert: Temporary set consumer_batch_size to 50 (duration: 00m 17s)
  • 16:06 ppchelko@tin: Started deploy [cpjobqueue/deploy@f0610d3]: Revert: Temporary set consumer_batch_size to 50
  • 15:59 urandom: Use lz4 compression instead of deflate (T180804)
  • 15:45 jynus: shutting down db2068 for maintenance after depool T180927
  • 15:42 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2068 (duration: 00m 49s)
  • 15:34 chasemp: disable puppet for a merge across cloud things
  • 15:31 ppchelko@tin: Finished deploy [cpjobqueue/deploy@f0610d3]: Temporary set consumer_batch_size to 50 (duration: 00m 30s)
  • 15:30 ppchelko@tin: Started deploy [cpjobqueue/deploy@f0610d3]: Temporary set consumer_batch_size to 50
  • 15:03 herron: pointing codfw mw servers at codfw puppet 4 masters via puppetmaster2001
  • 14:39 zeljkof: EU SWAT finished
  • 14:34 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Single edit tab in Catalan Wikipedia (T180660) (duration: 00m 49s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable wgNamespacesWithSubpages for hiwikiversity (T180913) (duration: 00m 48s)
  • 14:22 jynus: enable semi-sync replication on s5
  • 14:14 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: T180805: Revert [cirrus] disable token count router (duration: 00m 49s)
  • 14:05 elukey: upload prometheus-druid-exporter 0.3 to jessie-wikimedia
  • 13:53 Pchelolo: disable puppet on scb100x and stop cpjobqueue to accumulate some backlog
  • 13:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@174420f]: Optimise committed offset calculation (duration: 00m 28s)
  • 13:39 ppchelko@tin: Started deploy [cpjobqueue/deploy@174420f]: Optimise committed offset calculation
  • 13:30 elukey: upload prometheus-druid-exporter 0.3 to stretch-wikimedia
  • 12:40 marostegui: Optimize db1063 wikidatawiki.wb_terms
  • 12:17 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf1+icu57 to apt.wikimedia.org (jessie-wikimedia/component/icu57) (HHVM build linked against a co-installable backport of icu57)
  • 11:46 moritzm: rebooting tungsten for update to 4.9.51
  • 11:41 moritzm: rebooting etherpad1001 (etherpad.wikimedia.org) for update to 4.9.51
  • 11:36 moritzm: rebooting pybal-test for update to 4.9.51
  • 11:27 moritzm: rebooting restbase-test cluster for update to 4.9.51
  • 11:23 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bdcef23]: Various performance improvements in committing (duration: 00m 30s)
  • 11:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@bdcef23]: Various performance improvements in committing
  • 11:13 moritzm: rebooting ruthenium for update to 4.9.51
  • 10:57 godog: reimage restbase2004 - T179422
  • 10:51 moritzm: rebooting cerium/praseodymium/xenon for update to 4.9.51
  • 10:42 moritzm: rebooting mwlog1001 for update to 4.9.51
  • 10:39 moritzm: rebooting mwlog2001 for update to 4.9.51
  • 10:23 hoo: Manually re-started the Wikidata entity JSON dump on snapshot1007 (T180934)
  • 10:19 dcausse: elastic/cirrus: reindexing english group0 and group1 wikis: T179945
  • 10:08 hashar: contint1001: sudo systemctl start jenkins
  • 10:05 moritzm: rebooting contint1001 for update to 4.9.51
  • 09:22 moritzm: rebooting hafnium for update to 4.9.51
  • 08:46 marostegui: Run mydumper for db1047.staging - T156844
  • 07:58 moritzm: installing procmail security updates
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 - T178359 (duration: 00m 49s)
  • 06:41 marostegui: Stop MySQL on db1082 to clone db1109 and db1110 - T180700
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T177208 (duration: 00m 48s)
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weights for db1100 and db1071 - T180917 (duration: 00m 49s)
  • 06:15 marostegui: Reboot db2068 - T180927
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 06m 26s)

2017-11-19

  • 23:20 Jamesofur: removed 2FA for Ask21 T180889
  • 20:28 krinkle@tin: Synchronized docroot/noc/index.html: noc: Link to Grafana (duration: 00m 49s)
  • 19:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 00m 49s)
  • 07:09 mobrovac@tin: Started restart [zotero/translators@a0c41c3]: Zotero eating up memory
  • 07:01 mobrovac@tin: Started restart [electron-render/deploy@8dd5f13]: electron stuck - T174916

2017-11-18

  • 20:06 gehel: upgrade of elasticsearch eqiad complete - T178411
  • 18:03 reedy@tin: Synchronized multiversion/vendor/: bump (duration: 01m 14s)
  • 15:53 marostegui: Compress s5 on db2085 - T178359
  • 14:48 marostegui: Compress s3 on db2092 - T178359
  • 00:30 urandom: Bootstrapping restbase2006-b (T179422)

2017-11-17

  • 20:35 mutante: netmon1002 - rsync smokeping data back from local backup to show measurements made from eqiad as requested on T180812
  • 20:05 mutante: running puppet on all cache misc to switch smokeping web to eqiad
  • 20:02 mutante: T180812 copying smokeping data from 2001 to 1002 - netmon1002: /usr/bin/rsync -avp rsync://netmon2001.wikimedia.org/var-lib-smokeping /var/lib/smokeping/ | switching backend from codfw to eqiad
  • 19:37 mutante: T180812 - @netmon1002:# rsync -avp /var/lib/smokeping/ /root/backup/netmon1002/201711717/var/lib/smokeping/ | @netmon2001:/# rsync -avp /var/lib/smokeping/ /root/backup/netmon2001/201711717/var/lib/smokeping/
  • 19:36 mutante: @netmon1002:/var/lib/smokeping# rsync -avp /var/lib/smokeping/ /root/backup/netmon1002/201711717/var/lib/smokeping/
  • 18:37 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on Beta Cluster, no prod change (duration: 00m 48s)
  • 18:36 anomie@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on Beta Cluster, no prod change (duration: 00m 49s)
  • 17:43 demon@tin: Synchronized php-1.31.0-wmf.8/extensions/Wikidata/extensions/Wikibase/client/WikibaseClient.php: fix client dependencies (duration: 00m 49s)
  • 17:41 demon@tin: Synchronized php-1.31.0-wmf.8/extensions/Wikibase/client/WikibaseClient.php: fix client dependencies (duration: 00m 50s)
  • 17:21 bblack: enabling new normalization code for all upload@esams (done)
  • 17:19 bblack: enabling new normalization code for all upload@eqiad
  • 17:14 marostegui: Revert schema change on dbstore1001 - T180714
  • 17:12 marostegui: Move dbstore1001 under db1070 - T180714
  • 17:12 bblack: enabling new normalization code for all upload@codfw
  • 16:52 bblack: enabling new normalization code for all upload@ulsfo
  • 16:24 bblack: disabling puppet on all cp* (testing encoding patch)
  • 16:19 moritzm: uploaded boost 1.55.0+dfsg-3+wmf2+icu57 to apt.wikimedia.org for jessie-wikimedia/component/icu57 (needed for HHVM build linked against ICU 57)
  • 16:04 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: T180795 [cirrus] disable token count router (duration: 00m 49s)
  • 15:43 chasemp: labservices1001:~# mv /var/zones/tools.eqiad.wmflabs /home/rush T180797
  • 15:26 urandom: Starting restbase2006-c w/ -Dcassandra.replace_address=10.192.48.51 (T179422)
  • 14:06 jynus: deploy master events to db1070
  • 14:04 chasemp: labstore1003 service nfs-kernel-server restart && service rsync start
  • 13:55 moritzm: uploaded boost 1.55.0+dfsg-3+wmf1+icu57 to apt.wikimedia.org for jessie-wikimedia/component/icu57 (needed for HHVM build linked against ICU 57)
  • 12:15 moritzm: installing openssl updates on poolcounters
  • 12:07 moritzm: installing openssl updates on graphite* hosts
  • 11:46 moritzm: installing openssl updates on etcd* hosts
  • 11:43 moritzm: installing openssl updates on dbproxy hosts
  • 11:33 moritzm: installing openssl updates on puppetmasters
  • 11:01 akosiaris: create webperf1001, webperf2001 in ganeti T179036
  • 10:50 akosiaris@tin: Synchronized wmf-config/db-eqiad.php: (no justification provided) (duration: 00m 49s)
  • 10:49 akosiaris: sync wmf-config/db-eqiad.php for T180724
  • 10:48 akosiaris@tin: scap aborted: (no justification provided) (duration: 00m 02s)
  • 10:48 akosiaris@tin: Started scap: (no justification provided)
  • 10:47 akosiaris: pool mw2251 T180724
  • 10:47 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2251.codfw.wmnet
  • 10:01 moritzm: rebooting darmstadtium (docker registry) for update to 4.9.51
  • 09:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original traffic for the s5 eqiad hosts that were out for hours due to schema change revert after s5 master crash (duration: 00m 48s)
  • 09:54 moritzm: rebooting labnodepool* for update to 4.9.51
  • 09:38 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2141162]: Bump concurrency even more (duration: 00m 29s)
  • 09:38 ppchelko@tin: Started deploy [cpjobqueue/deploy@2141162]: Bump concurrency even more
  • 09:20 godog: start restbase2006-c instead, restbase2006-b failed and -c shows as "down" - T179422
  • 09:13 jynus: changing master of db1071 (63 -> 70)
  • 09:10 gehel: cleanup leftover titlesuggest indices on elasticsearch eqiad (jawiki, frwiki, ptwiki)
  • 09:05 ppchelko@tin: Finished deploy [cpjobqueue/deploy@cbd25d3]: Bump overall concurrency to get rid of RecordLintJob backlog (duration: 00m 35s)
  • 09:04 ppchelko@tin: Started deploy [cpjobqueue/deploy@cbd25d3]: Bump overall concurrency to get rid of RecordLintJob backlog
  • 08:54 godog: bootstrap restbase2006-b - T179422
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for the s5 eqiad hosts that were out for hours due to schema change revert after s5 master crash (duration: 00m 50s)
  • 08:04 elukey: reboot stat100[456] for kernel updates
  • 07:51 marostegui: Revert schema change on dbstore1002 - T180714
  • 07:50 marostegui: Move dbstore1002 under db1070 - T180714
  • 07:33 marostegui: Enable GTID on all the eqiad up-to-date hosts, only pending db1071 - T180714
  • 07:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool s5 eqiad hosts after reverting schema change - T180714 (duration: 00m 50s)
  • 06:34 marostegui: Revert schema changes on s5 codfw master with replication enabled, lag will be generated on codfw s5 - T180714
  • 04:43 krinkle@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op for labs (duration: 00m 51s)
  • 01:32 ejegg: re-enabled donation queue consumer
  • 01:28 ejegg: updated CiviCRM from 8454e06 to 272580a
  • 01:07 ejegg: updated CiviCRM from 0b8ceea to 8454e06
  • food: disabled donations queue consumer for thank you subject update
  • 00:41 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend.php: I60cce0 (duration: 00m 48s)
  • 00:40 krinkle@tin: Synchronized wmf-config/StartProfiler.php: I60cce0 (duration: 00m 49s)
  • 00:37 krinkle@tin: Synchronized wmf-config/profiler.php: I60cce0 (duration: 00m 48s)
  • 00:23 urandom: Decommissioning Cassandra, restbase1014-c.eqiad.wmnet (T179422)
  • 00:22 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/390131/3 (duration: 00m 49s)

2017-11-16

  • 23:50 mutante: wezen - systemctl restart rsyslog
  • 21:52 mepps: updated civicrm from 22f6532 to 0b8ceea
  • 21:05 mepps: updated payments-wiki from 210cb37 to f871160
  • 20:51 apergos: one more catchup rsync from ms1001 to labstore1006 kicking off
  • 20:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool only 'good' servers - third try (duration: 00m 49s)
  • 20:11 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool only 'good' servers - second try (duration: 00m 49s)
  • 20:10 demon@tin: Unlocked for deployment [operations/mediawiki-config]: No deploys, recovering from downtime (more narrow locking) (duration: 06m 49s)
  • 20:03 demon@tin: Locking from deployment [operations/mediawiki-config]: No deploys, recovering from downtime (more narrow locking) (planned duration: 360m 00s)
  • 20:02 twentyafterfour: deploy a9613b4 to hotfix T180706
  • 19:56 demon@tin: Unlocked for deployment [ALL REPOSITORIES]: No deploys, recovering from downtime (duration: 50m 28s)
  • 19:54 ayounsi@tin: Finished deploy [netbox/deploy@19f4f65]: (no justification provided) (duration: 00m 04s)
  • 19:54 ayounsi@tin: Started deploy [netbox/deploy@19f4f65]: (no justification provided)
  • 19:49 akosiaris: schedule extra downtime for s5 slaves
  • 19:46 urandom: Bootstrapping Cassandra restbase2006-a (T179422)
  • 19:13 jynus: reset slave all on db1070 @ db1063-bin.001382:234464548
  • 19:05 demon@tin: Locking from deployment [ALL REPOSITORIES]: No deploys, recovering from downtime (planned duration: 360m 00s)
  • 19:05 demon@tin: Unlocked for deployment [ALL REPOSITORIES]: Dealing with outage, no deploys for now (duration: 01m 22s)
  • 19:03 demon@tin: Locking from deployment [ALL REPOSITORIES]: Dealing with outage, no deploys for now (planned duration: 60m 00s)
  • 18:48 marostegui: dewikipedia and wikidata currently back to writable
  • 18:44 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool only 'good' servers (duration: 00m 48s)
  • 18:41 mutante: dewikipedia and wikidata currently back to readonly-mode while wikidata is being worked on
  • 18:41 marostegui: Set s5 master read_only
  • 18:25 akosiaris: set mw2251 to inactive. T180724
  • 18:24 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2251.codfw.wmnet
  • 17:57 gehel: silencing wdqs
  • 17:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: Failover db1063 to db1070 (duration: 00m 46s)
  • 17:44 XioNoX: merging netbox CR
  • 17:41 bblack: disable port ge-5/0/39 on asw-c-eqiad (db1063)
  • 17:40 urandom: Decommissioning Cassandra, restbase1014-b.eqiad.wmnet (T179422)
  • 17:36 jynus: stopping slave on db1070
  • 17:28 urandom: Converting 'enwiki parsoid' to size-tiered compaction (T179422)
  • 17:19 demon@tin: Finished scap: consistency (duration: 25m 12s)
  • 17:18 bblack: Temporary GeoDNS routing changes (eqsin traffic simulation using ulsfo) - https://gerrit.wikimedia.org/r/#/c/391357/ - expecting ~24h, West Asia latencies will probably increase, spike in cache misses, etc...
  • 17:18 urandom: Converting 'wikipedia parsoid' to size-tiered compaction (T179422)
  • 17:16 godog: reimage restbase2006 - T179422
  • 17:15 urandom: Converting 'commons mobile' to size-tiered compaction (T179422)
  • 17:05 urandom: Converting 'others mobile' to size-tiered compaction (T179422)
  • 16:54 demon@tin: Started scap: consistency
  • 16:54 godog: reimage restbase2006 - T179422
  • 16:48 jynus: stop and restart db1071 for upgrade and reconfiguration
  • 16:48 herron: beginning gradual cutover of codfw mw systems to puppet 4 master puppetmaster2001
  • 16:48 godog: upgrade hpsa firmware to 6.06 on restbase2006 - T141756
  • 16:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1034 and db1079 original weight (duration: 00m 48s)
  • 16:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: mariadb: Depool db1071, pool db1104 as api (duration: 00m 49s)
  • 16:21 moritzm: restarting apache on labs puppet masters to pick up openssl updates
  • 16:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1034 weight (duration: 00m 48s)
  • 15:47 moritzm: rebooting ores1* for update to 4.9.51
  • 15:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1034 weight (duration: 00m 49s)
  • 15:33 moritzm: rebooting seaborgium (slapd) for update to 4.9.51
  • 15:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 - T178359 (duration: 00m 49s)
  • 15:16 milimetric@tin: Finished deploy [analytics/refinery@4ef15d3]: Mainly deploying the interlanguage navigation dataset (duration: 14m 29s)
  • 15:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1034 weight (duration: 00m 48s)
  • 15:10 godog: upgrade prometheus-redis-exporter to 0.13-1 - T148637
  • 15:06 moritzm: installing postgres security updates on labsdb1006/1007
  • 15:02 milimetric@tin: Started deploy [analytics/refinery@4ef15d3]: Mainly deploying the interlanguage navigation dataset
  • 15:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T177208 (duration: 00m 48s)
  • 15:00 herron: beginning cut over of codfw canary appservers to puppet 4 master puppetmaster2001
  • 14:55 moritzm: rebooting serpens (slapd) for update to 4.9.51
  • 14:50 elukey: updating puppet compiler's facts (following https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource:Puppet3-diffs#FAQ)
  • 14:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1034 weight - T178359 (duration: 00m 49s)
  • 14:26 dcausse: EU swat done
  • 14:23 moritzm: rebooting kubetcd* to 4.9.51
  • 14:18 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: T177913: [cirrus] Add overridden iw prefix for svwiki (2/2) (duration: 00m 48s)
  • 14:16 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T177913: [cirrus] Add overridden iw prefix for svwiki (1/2) (duration: 00m 50s)
  • 14:02 gehel: starting upgrade of elasticsearch eqiad - T178411
  • 13:46 moritzm: rebooting bohrium (piwik host) for update to 4.9.51
  • 13:33 moritzm: installing openssl updates on es* hosts
  • 13:24 moritzm: rebooting dubnium/pollux (openldap corp mirror) for update to 4.9.51
  • 13:12 moritzm: installing openssl updates on pc* hosts
  • 13:07 elukey: restart aqs on aqs100[5-9] to apply localQuorum (https://gerrit.wikimedia.org/r/391765) - T164348
  • 13:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1034 weight - T178359 (duration: 00m 49s)
  • 12:54 moritzm: rebooting radium (tor relay) for update to 4.9.51
  • 12:34 moritzm: installing openssl updates on memcached/redis clusters
  • 12:27 hoo: Updated operations/dumps/dcat (ea4e75..7734e04) on snapshot1007
  • 12:09 moritzm: rebooting dbmonitor* hosts for update to 4.9.51
  • 12:03 addshore@tin: Synchronized php-1.31.0-wmf.8/extensions/Wikidata/extensions/Constraints: T180665 Manually applied: Fix SparqlHelper::getCacheMaxAge() (duration: 00m 52s)
  • 11:58 addshore@tin: Synchronized php-1.31.0-wmf.8/extensions/WikibaseQualityConstraints: T180665 Fix SparqlHelper::getCacheMaxAge() (duration: 00m 53s)
  • 11:42 moritzm: installing openssl updates on conf* clusters
  • 11:37 addshore@tin: Synchronized wmf-config/Wikibase-production.php: testwikidata only, T180665, WBQC configuration for testwikidatawiki (duration: 00m 49s)
  • 11:32 addshore: addshore@terbium:~$ mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintStatements.php --wiki=testwikidatawiki > importConstraintStatements.log
  • 11:21 godog: upgrade prometheus to 1.8.1+ds+k8s-1 in ulsfo/esams/eqiad - T177395
  • 11:09 moritzm: rebooting debug proxies (hassium/hassaleh) for update to 4.9.51
  • 11:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1034 weight - T178359 (duration: 00m 48s)
  • 11:01 jynus: shutting down labsdb1010 to clone to labsdb1009
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 with low weight - T178359 (duration: 00m 48s)
  • 10:26 addshore: Kill the build deploy slot done!
  • 10:24 addshore@tin: Synchronized php-1.31.0-wmf.8/extensions/Wikibase: T180634 Bring Wikibase up to date with .8 branch of Wikidata build extension (duration: 01m 46s)
  • 10:15 addshore@tin: Synchronized php-1.31.0-wmf.8/extensions/WikibaseQualityConstraints: T180634 Bring WikibaseQualityConstraints up to date with .8 branch of Wikidata build extension (duration: 00m 53s)
  • 09:59 moritzm: rebooting prometheus servers in eqiad for update to 4.9.51
  • 09:44 elukey: restart aqs on aqs1004 to apply localQuorum (https://gerrit.wikimedia.org/r/391765) - T164348
  • 09:38 moritzm: rebooting prometheus servers in codfw for update to 4.9.51
  • 09:30 moritzm: uploaded icu57.1-6+wmf2 to jessie-wikimedia/component/icu57
  • 09:23 godog: bootstrap restbase2002-c - T179422
  • 09:19 godog: upgrade grafana to 4.6.1 on https://grafana.wikimedia.org/ - T180428
  • 08:19 marostegui: Deploy schema change on db1092 - T174569
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T174569 (duration: 00m 49s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 (duration: 00m 49s)
  • 07:40 marostegui: Stop MySQL on db1034 to copy its content to db1101.s7 - T178359
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 - T178359 (duration: 00m 49s)
  • 07:17 ppchelko@tin: Finished deploy [eventlogging/eventbus@872cfb3]: Revert gerrit 302372 due to AssertionError T180017 (duration: 00m 14s)
  • 07:17 ppchelko@tin: Started deploy [eventlogging/eventbus@872cfb3]: Revert gerrit 302372 due to AssertionError T180017
  • 06:54 marostegui: Deploy alter table on db1096 - T174569
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096 - T174569 (duration: 00m 48s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099 and db1071 - T174569 (duration: 00m 49s)
  • 06:15 smalyshev@tin: Finished deploy [wdqs/wdqs@b44cf27]: data reload/T176593 (duration: 00m 28s)
  • 06:14 smalyshev@tin: Started deploy [wdqs/wdqs@b44cf27]: data reload/T176593
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 07m 54s)
  • 02:09 twentyafterfour: phabricator database migrations complete, service is back online
  • 01:28 twentyafterfour: Phabricator will be offline for a couple of minutes while I apply database migrations.
  • 01:22 twentyafterfour: updating phabricator (belatedly)
  • 01:05 demon@tin: Synchronized scap/plugins/prep.py: no-op, co-master sync (duration: 00m 55s)
  • 00:59 tzatziki: Removing 2FA from AuburnPilot (T180654)
  • 00:59 tzatziki: Removing 2FA from Twotwo2019 (T180438)
  • 00:42 eileen: update civicrm from b99a9cf to 22f6532

2017-11-15

  • 22:48 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikidatawiki back to wmf.7
  • 22:21 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.8
  • 22:11 urandom: Setting keyspaces erroneously configured for leveled compaction, to use size-tiered (T180568)
  • 22:05 madhuvishy: Kicking off re-enabling puppet and puppet runs across Cloud VPS instances
  • 21:33 demon@tin: Synchronized php: symlink swap (duration: 00m 49s)
  • 21:21 madhuvishy: Disabling puppet across cloud VPS through cumin on labpuppetmaster1001
  • 21:01 ottomata: restarting kafka-jumbo brokers to update to Kafka 0.11.0.1
  • 20:52 urandom: Restarting restbase2002-a.codfw.wmnet (T180568)
  • 20:40 mutante: restarting gerrit to enable 'large file support'-plugin gerrit:391635
  • 20:38 addshore@tin: Finished scap: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch (again) T180539 (duration: 21m 48s)
  • 20:17 addshore@tin: Started scap: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch (again) T180539
  • 20:11 otto@tin: Finished deploy [eventlogging/eventbus@872cfb3]: deploying kafka-futures change to kafka1001 only, will apply async with https://gerrit.wikimedia.org/r/#/c/391634/ (duration: 00m 20s)
  • 20:11 otto@tin: Started deploy [eventlogging/eventbus@872cfb3]: deploying kafka-futures change to kafka1001 only, will apply async with https://gerrit.wikimedia.org/r/#/c/391634/
  • 19:53 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add BP and WP aliases for project namespace on mwlwiki (T180052) (duration: 00m 48s)
  • 19:48 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on mwlwiki (T180052) (duration: 00m 48s)
  • 19:41 demon@tin: Finished deploy [gerrit/gerrit@bce982f]: adding lfs plugin @ 2.13.9 (duration: 00m 08s)
  • 19:41 demon@tin: Started deploy [gerrit/gerrit@bce982f]: adding lfs plugin @ 2.13.9
  • 19:40 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable EventLogging for Popups (T178500) (duration: 00m 49s)
  • 19:37 gehel: upgrade of elasticsearch codfw completed, cluster still recovering - T178411
  • 19:34 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Minerva download icon on all wikis (T179914) (duration: 00m 48s)
  • 19:30 chasemp: labstore1004:~# service nfs-kernel-server restart
  • 19:19 catrope@tin: Synchronized php-1.31.0-wmf.8/resources/src/mediawiki.rcfilters/mw.rcfilters.UriProcessor.js: T180577 (duration: 00m 49s)
  • 19:16 catrope@tin: Synchronized php-1.31.0-wmf.7/skins/MinervaNeue/resources/skins.minerva.scripts/init.js: Limit download button to Google Chrome (T179529, T179914) (duration: 00m 49s)
  • 19:14 catrope@tin: Synchronized wmf-config/flaggedrevs.php: Enable RCFilters on all remaining wikis (T177445) (duration: 00m 49s)
  • 19:01 urandom: Restarting Cassandra instances on restbase2001.codfw.wmnet (T180568)
  • 18:38 Jamesofur: removed 2FA from User:Einsbor after verification for votewiki, stewardwiki and SUL
  • 18:20 moritzm: rebooting auth* servers for update to 4.9.51
  • 17:57 demon@tin: Synchronized docroot/search.wikimedia.org/index.php: removing support for non-wikipedias (duration: 00m 49s)
  • 16:03 addshore@tin: Finished scap: Revert partial scap: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch, second try (T180539) (duration: 23m 48s)
  • 15:56 oblivian@tin: Finished deploy [docker-pkg/deploy@9b319d2]: Adding do extension to jinja2, needed for contint images (duration: 00m 18s)
  • 15:56 oblivian@tin: Started deploy [docker-pkg/deploy@9b319d2]: Adding do extension to jinja2, needed for contint images
  • 15:55 ema: restart pybal on lvs[12]00[36] for config change https://gerrit.wikimedia.org/r/#/c/389964/
  • 15:43 jynus: shutting down labsdb1009 for maintenance T179244
  • 15:40 addshore@tin: Started scap: Revert partial scap: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch, second try (T180539)
  • 15:29 moritzm: installing perl security updates on trusty (Debian hosts fixed two months ago)
  • 15:14 ladsgroup@tin: scap aborted: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch, second try (T180539) (duration: 03m 48s)
  • 15:13 moritzm: removing unused kernels from prometheus*
  • 15:11 ladsgroup@tin: Started scap: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch, second try (T180539)
  • 15:05 moritzm: rebooting kubestagetcd* for update to 4.9.51
  • 14:57 ladsgroup@tin: scap aborted: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch (T180539) (duration: 00m 31s)
  • 14:57 ladsgroup@tin: Started scap: Update extensions/Wikidata to new wmf/1.31.0-wmf.8 branch (T180539)
  • 14:42 Amir1: deployed ORES change for wikidata (gerrit:391197, phab:T180450)
  • 14:41 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 49s)
  • 14:37 mobrovac: restbase creating Cassandra 3 revision tables on restbase1009 - T179421
  • 14:30 zfilipin@tin: Synchronized wmf-config/: SWAT: Remove wgContentTranslationEnableSuggestions (duration: 00m 51s)
  • 14:27 ema: cache_upload: upgrade varnish to 4.1.8-1wm2
  • 14:22 zfilipin@tin: Synchronized wmf-config/CommonSettings-labs.php: SWAT: Beta: Explicitly set cookieDomain for ContentTranslationSiteTemplates (T149879) (duration: 00m 49s)
  • 14:18 ema: repool cp4024 T174891
  • 14:17 mutante: wtp2017 - systemctl start ferm (ferm wasn't running due to failed DNS lookup for prometheus2003 sometime in the past)
  • 14:12 zfilipin@tin: Synchronized php-1.31.0-wmf.8/extensions/EventBus/EventBus.php: SWAT: Increase request timeout to match kafka produce timeout (T180017) (duration: 00m 50s)
  • 14:11 godog: upgrade hpsa firmware to 6.06 on restbase2004 - T180562 T141756
  • 14:10 ema: upgrade varnish to 4.1.8-1wm2 on cp4024 (cache_upload, depooled)
  • 14:04 jynus: reload haproxy on dbproxy1010
  • 14:00 moritzm: installing openssl updates on kafka and hadoop clusters
  • 13:34 ema: cache_text: upgrade varnish to 4.1.8-1wm2
  • 13:31 ppchelko@tin: Started restart [changeprop/deploy@065a06e]: Restart to rebalance all rules T179684
  • 13:14 moritzm: rebooting video scalers in eqiad for update to 4.9.51 (and to pick up openssl update)
  • 13:07 ema: upgrade varnish to 4.1.8-1wm2 on cp3030 (cache_text)
  • 12:41 elukey: re-enable eventlogging after maintenance
  • 12:24 jynus: deploying dns change for m4-master
  • 12:19 ema: cache_misc: upgrade varnish to 5.1.3-1wm3
  • 12:09 elukey: executed sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on all jobrunners
  • 12:01 ema: upgrade varnish to 5.1.3-1wm3 on cp3007 (cache_misc)
  • 11:58 ema: varnish 4.1.8-1wm2 uploaded to apt.w.o (main)
  • 11:50 ema: varnish 5.1.3-1wm3 uploaded to apt.w.o (experimental)
  • 11:45 jynus: restart haproxy on dbproxy1009
  • 11:32 marostegui: Stop MySQL on db1046
  • 10:58 moritzm: rebooting job runners in eqiad for update to 4.9.51 (and to pick up openssl update)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106, optimizing wikidatawiki.wb_terms (duration: 00m 49s)
  • 10:09 ppchelko@tin: Finished deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x attempt 4 T179786 (duration: 04m 51s)
  • 10:04 ppchelko@tin: Started deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x attempt 4 T179786
  • 10:04 ppchelko@tin: Finished deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x attempt 3, force T179786 (duration: 04m 44s)
  • 09:59 ppchelko@tin: Started deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x attempt 3, force T179786
  • 09:59 ppchelko@tin: Finished deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x attempt 2 T179786 (duration: 03m 32s)
  • 09:58 ppchelko@tin: (no justification provided)
  • 09:55 ppchelko@tin: Started deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x attempt 2 T179786
  • 09:55 ppchelko@tin: Finished deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x T179786 (duration: 02m 44s)
  • 09:53 godog: reboot restbase2004 - T180562
  • 09:52 ppchelko@tin: Started deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x T179786
  • 09:52 mobrovac@tin: Started restart [restbase/deploy@c76a665]: Pick up the new seeds definition - T179422
  • 09:52 ppchelko@tin: Started deploy [trending-edits/deploy@a0e1fe3]: Update node-rdkafka to v1.x T179786
  • 09:32 godog: restart cassandra on restbase2002-b - T180568
  • 09:30 moritzm: updating openssl on database hosts
  • 09:19 godog: restart cassandra on restbase2001-c - T180568
  • 09:15 marostegui: Stop mysql on db1046 to transfer its content to db1107 - T177405
  • 09:08 elukey: stop eventlogging on eventlog1001, eventlogging replication on db1108/db1047/dbstore1002 as preparation steps to migrate the log db from db1046 to db1107
  • 09:02 jynus: rebooting labsdb1010 for kernel upgrade
  • 08:51 elukey: reboot thorium (hosting all analytics websites) for kernel updates
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101 - going to convert it to multi-instance T178359 (duration: 00m 48s)
  • 07:49 marostegui: Deploy schema change on db1071 and db1099 (s5) - T174569
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099 and db1071 - T174569 (duration: 00m 49s)
  • 07:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 and db1100 - T174569 (duration: 00m 49s)
  • 05:33 subbu: subbu@terbium (followup to the linter-reparse.py script log entry) it is safe to kill -9 the script on terbium anytime if there are any problems because of it
  • 05:27 subbu: subbu@terbium running linter-reparse.py script to initialize baseline linter categories for all wikis (12 hours in so far .. expected to run for ~2 weeks). hits parsoid eqiad cluster.
  • 04:06 demon@tin: Synchronized docroot/search.wikimedia.org/robots.txt: go away robots / kill some 404s (duration: 00m 50s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 05m 57s)
  • 00:59 ejegg: updated payments-wiki from d150287 to 210cb37
  • 00:53 Reedy: going to restart zuul as it's got backed up
  • 00:33 maxsem@tin: Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/391220/ (duration: 00m 49s)
  • 00:18 ejegg: updated SmashPig payments listener from 5262d53 to 45aa626
  • 00:12 ejegg: updated CiviCRM from f571c67 to b99a9cf

2017-11-14

  • 23:29 mutante: releases1001 - chmod g+w /srv/org/wikimedia/releases/mediawiki/1.*
  • 23:01 urandom: Decommissioning Cassandra, restbase1014-a.eqiad.wmnet (T179422)
  • 22:57 demon@tin: Synchronized w/: Removing mobilelanding from global w/, few sites actually need it (duration: 00m 49s)
  • 22:55 demon@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: symlink -> real file (duration: 00m 50s)
  • 21:42 no_justification: gerrit: restarted services on master/cobalt, things will flap for a second
  • 21:38 mutante: gerrit2001 - restarted gerrit
  • 21:17 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@494e0c6]: redeploy mjolnir submodule bump to master (duration: 02m 11s)
  • 21:15 ebernhardson@tin: Started deploy [search/mjolnir/deploy@494e0c6]: redeploy mjolnir submodule bump to master
  • 21:03 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@77310bc]: update mjolnir to master (duration: 02m 05s)
  • 21:01 ebernhardson@tin: Started deploy [search/mjolnir/deploy@77310bc]: update mjolnir to master
  • 20:37 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.8
  • 20:15 bblack: reboot cp4024
  • 19:58 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@b20f0da]: test mjolnir deployment (duration: 02m 12s)
  • 19:57 urandom: Bootstrapping restbase2002-b.codfw.wmnet (T179422)
  • 19:56 ebernhardson@tin: Started deploy [search/mjolnir/deploy@b20f0da]: test mjolnir deployment
  • 19:47 demon@tin: Finished scap: bootstrap wmf.8 (duration: 44m 37s)
  • 19:32 chasemp: clean up instances in error state in testlabs project
  • 19:10 jynus: restart db2034 for mariadb upgrade
  • 19:09 chasemp: for i in `OS_TENANT_NAME=testlabs openstack server list | grep stress | awk '{print $2}'`; do echo $i; OS_TENANT_NAME=testlabs openstack server delete $i; sleep 30; done T171473
  • 19:02 demon@tin: Started scap: bootstrap wmf.8
  • 18:32 arlolra: Updated Parsoid to e71937d0 (T178253)
  • 18:31 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9b10959]: Redeploying: Update mobileapps to c002862 (duration: 29m 12s)
  • 18:28 moritzm: upgraded nginx on notebook* to 1.13.6
  • 18:26 madhuvishy: Upgraded notebook1001 and 1002 to kernel version 4.9.51-1~bpo8+1
  • 18:22 arlolra@tin: Finished deploy [parsoid/deploy@b150764]: Updating Parsoid to e71937d0 (duration: 09m 13s)
  • 18:21 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@e6905f4]: test mjolnir deployment (duration: 01m 13s)
  • 18:19 ebernhardson@tin: Started deploy [search/mjolnir/deploy@e6905f4]: test mjolnir deployment
  • 18:13 arlolra@tin: Started deploy [parsoid/deploy@b150764]: Updating Parsoid to e71937d0
  • 18:08 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@cd6ddda]: (no justification provided) (duration: 04m 28s)
  • 18:07 moritzm: rebooting notebook* hosts for update to 4.9.51
  • 18:04 ebernhardson@tin: Started deploy [search/mjolnir/deploy@cd6ddda]: (no justification provided)
  • 18:02 mholloway-shell@tin: Started deploy [mobileapps/deploy@9b10959]: Redeploying: Update mobileapps to c002862
  • 17:58 ebernhardson@tin: Finished deploy [search/mjolnir/deploy@ceb5c2f]: (no justification provided) (duration: 00m 20s)
  • 17:57 ebernhardson@tin: Started deploy [search/mjolnir/deploy@ceb5c2f]: (no justification provided)
  • 17:56 ebernhardson@tin: Finished deploy [search/MjoLniR/deploy@607adfb]: (no justification provided) (duration: 00m 14s)
  • 17:56 ebernhardson@tin: Started deploy [search/MjoLniR/deploy@607adfb]: (no justification provided)
  • 17:44 herron: restarted puppetdb on nitrogen and nihal to pick up jre updates
  • 17:42 smalyshev@tin: Finished deploy [wdqs/wdqs@b44cf27]: data reload/T176593 (duration: 00m 16s)
  • 17:42 smalyshev@tin: Started deploy [wdqs/wdqs@b44cf27]: data reload/T176593
  • 17:27 demon@tin: Pruned MediaWiki: 1.31.0-wmf.3 (duration: 07m 58s)
  • 17:12 demon@tin: Synchronized wmf-config/CommonSettings.php: Beta-only, no-op (duration: 00m 43s)
  • 17:11 demon@tin: Synchronized wmf-config/extension-list-labs: No-op (duration: 00m 44s)
  • 17:10 demon@tin: Pruned MediaWiki: 1.31.0-wmf.6 [keeping static files] (duration: 02m 48s)
  • 16:32 twentyafterfour: Restarting apache2 on phab1001 (deploy phabricator hotfix: D876 )
  • 16:23 jynus: stop labsdb1010 mariadb to clone it later to labsdb1009 T179244
  • 15:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1105 in s1 and s2 - T178359 (duration: 00m 45s)
  • 15:00 thcipriani@tin: testing IRC logging
  • 15:00 urandom: Decommissioning Cassandra, restbase1012-c.eqiad.wmnet (T179422)
  • 14:37 moritzm: installing postgres security updates on maps*
  • 14:29 zeljkof: EU SWAT finished
  • 14:07 moritzm: installing postgres security updates on labsdb1004
  • 13:11 moritzm: installing openssl updates
  • 13:10 akosiaris: shutdown install1002 for disk resize
  • 12:57 godog: upgrade grafana to 4.6.1 on https://grafana-labs.wikimedia.org/ - T180428
  • 12:03 ema: restart varnish-be on cp3007, requests failing with 'no backend connection'
  • 11:49 moritzm: rebooting remaining app servers in eqiad for update to Linux 4.9.51 (and to pick up OpenSSL updates)
  • 11:22 moritzm: rebooting remaining API servers in eqiad for update to Linux 4.9.51 (and to pick up OpenSSL updates)
  • 11:04 godog: reimage restbase2002 - T179422
  • 10:50 marostegui: Deploy alter table on s5: db1104 db1100 db1106 - T174569
  • 10:32 elukey: removed old target configs from /srv/prometheus/analytics/targets on prometheus100[34] after https://gerrit.wikimedia.org/r/391179
  • 10:23 moritzm: rebooting image scalers in eqiad for update to Linux 4.9.51 (and to pick up OpenSSL updates)
  • 10:07 godog: upload scap 3.7.2-1 - T127762
  • 09:59 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Migrate RecordLintJob to EventBus - T175212 (duration: 00m 46s)
  • 09:56 marostegui: Deploy alter table on s5 - dbstore1001 - T174569
  • 09:49 foks: Disabled 2FA for Jean-Frédéric
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1105 as multi-instance host for s1 and s2 - T178359 (duration: 00m 46s)
  • 09:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db1105 as multi-instance host for s1 and s2 - T178359 (duration: 00m 47s)
  • 09:43 godog: upgrade prometheus to 1.8.1 with k8s on prometheus2004 - T177395
  • 09:27 ppchelko@tin: Finished deploy [cpjobqueue/deploy@df5fca9]: Enable RecordLintJob processing (duration: 00m 40s)
  • 09:27 ppchelko@tin: Started deploy [cpjobqueue/deploy@df5fca9]: Enable RecordLintJob processing
  • 09:07 moritzm: rebooting remaining app servers in codfw for update to Linux 4.9.51 (and to pick up OpenSSL updates)
  • 09:05 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2108.codfw.wmnet
  • 09:05 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: wtp2018.codfw.wmnet
  • 08:31 marostegui: Deploy alter table on s5 - dbstore1002 - T174569
  • 08:22 addshore@tin: Synchronized wmf-config/extension-list-labs: Add AdvancedSearch to extension-list T180147 PT 2/2 LABS ONLY (duration: 00m 46s)
  • 08:21 addshore@tin: Synchronized wmf-config/extension-list: Add AdvancedSearch to extension-list T180147 PT 1/2 (duration: 00m 47s)
  • 08:13 moritzm: rebooting job runners in codfw for update to Linux 4.9.51 (and to pick up OpenSSL updates)
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1103 on s2 and s4 - T178359 (duration: 00m 47s)
  • 06:18 marostegui: Deploy alter table on s6 primary master (db1061) - T174569
  • 02:30 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Nov 14 02:30:59 UTC 2017 (duration 6m 38s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 07m 28s)
  • 01:35 mdholloway: mobileapps finished rolling back to 11/8 deployment
  • 01:34 mholloway-shell@tin: Finished deploy [mobileapps/deploy@00e60b2]: (no justification provided) (duration: 02m 27s)
  • 01:31 mholloway-shell@tin: Started deploy [mobileapps/deploy@00e60b2]: (no justification provided)
  • 01:30 mdholloway: mobileapps rolling back today's deployment due to a significant increase in errors.

2017-11-13

  • 22:49 bawolff: deployed patch T119158 (Will affect language converter. -{}- no longer allowed in link urls)
  • 22:42 bawolff: deployed patch T124404
  • 22:02 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9b10959]: Update mobileapps to c002862 (duration: 05m 31s)
  • 21:57 mholloway-shell@tin: Started deploy [mobileapps/deploy@9b10959]: Update mobileapps to c002862
  • 21:40 urandom: Decommissioning Cassandra, restbase1012-b.eqiad.wmnet (T179422)
  • 20:20 ejegg: updated CiviCRM from ddc3881 to f571c67
  • 19:20 smalyshev@tin: Finished deploy [wdqs/wdqs@b44cf27]: data reload/T176593 (duration: 00m 20s)
  • 19:20 smalyshev@tin: Started deploy [wdqs/wdqs@b44cf27]: data reload/T176593
  • 19:18 thcipriani@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable per-filter profiling on enwiki T179323 (duration: 00m 45s)
  • 19:16 thcipriani@tin: Synchronized docroot/search.wikimedia.org/index.php: search.wikimedia.org: Clean up result returning logic (duration: 00m 47s)
  • 19:15 madhuvishy: Kick off second dumps rsync from ms1001 to labstore1006
  • 18:28 gehel: deploy latest blazegraph + GUI on wdqs200[23] to switch vocabulary - T176593
  • 18:20 elukey: drain + shutdown analytics1029 as prep step to replace the BBU - T178742
  • 18:09 hoo: Ran "scap pull" on mwdebug1001/snapshot1001 after (further) tests re T177486
  • 18:02 urandom: Restarting Cassandra, restbase-dev1004.eqiad.wmnet (testing new `c-foreach-restart`)
  • 18:00 demon@tin: Synchronized docroot/search.wikimedia.org/index.php: minor cleanup (duration: 00m 47s)
  • 17:28 hoo: Ran "scap pull" on mwdebug1001 after tests re T177486
  • 17:12 mobrovac: restbase depooling restbase1007, restbase1012, restbase1014, restbase2002, restbase2004, restbase2006 for T179422
  • 16:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1103 on s2 and s4 - T178359 (duration: 00m 47s)
  • 16:45 mobrovac@tin: Finished deploy [mathoid/deploy@63b2ddc]: Update to service-template-node v0.5.3 - T151396 (duration: 03m 45s)
  • 16:42 mobrovac@tin: Started deploy [mathoid/deploy@63b2ddc]: Update to service-template-node v0.5.3 - T151396
  • 16:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 - T174569 (duration: 00m 46s)
  • 15:58 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wtp2017.codfw.wmnet
  • 15:58 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=wtp2017.codfw.wmnet
  • 15:22 otto@tin: Finished deploy [eventlogging/analytics@e024af3]: T179625 (duration: 00m 02s)
  • 15:22 otto@tin: Started deploy [eventlogging/analytics@e024af3]: T179625
  • 15:09 otto@tin: Finished deploy [eventlogging/analytics@03285e4]: Reverting, got an error: userAgent is a <type unicode>. (duration: 00m 02s)
  • 15:09 otto@tin: Started deploy [eventlogging/analytics@03285e4]: Reverting, got an error: userAgent is a <type unicode>.
  • 15:08 urandom: Decommissioning Cassandra, restbase1012-a.eqiad.wmnet (T179422)
  • 15:06 otto@tin: Finished deploy [eventlogging/analytics@5796c27]: T179625 (duration: 00m 04s)
  • 15:06 otto@tin: Started deploy [eventlogging/analytics@5796c27]: T179625
  • 14:58 herron: upgrading puppetmaster2002 to puppet 4
  • 14:15 zeljkof: EU SWAT finished
  • 14:10 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Whitelist jenkins in test wiki (T167432) (duration: 00m 47s)
  • 14:08 marostegui: Deploy alter table on db1102.s6 (with replication - sanitarium master) - T174569
  • 13:49 marostegui: Stop replication on labsdb1010 to copy cebwiki.geo_tags table
  • 13:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1103 on s2 and s4 - T178359 (duration: 00m 46s)
  • 13:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Increase weight for db1103 on s2 and s4 - T178359 (duration: 00m 46s)
  • 13:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T174569 (duration: 00m 46s)
  • 13:09 marostegui: Deploy schema change on db1083 - T174569
  • 12:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 - T174569 (duration: 00m 47s)
  • 12:14 ema: cache_misc: upgrade varnish to 5.1.3-1wm2
  • 12:08 ema: cp3008: upgrade varnish to 5.1.3-1wm2
  • 12:02 ema: cp4021: restart varnish-be (mbox lag)
  • 11:56 moritzm: installing imagemagick security updates
  • 11:48 moritzm: installing irssi security updates
  • 11:18 elukey: restart of all the druid daemons on druid100[1-6] to apply the new prometheus jmx jvm exporters - T177459
  • 10:57 gehel: upgrade elasticsearch on cirrus / codfw - T178411
  • 10:49 marostegui: Deploy schema change on db1088 - T174569
  • 10:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 - T174569 (duration: 00m 54s)
  • 10:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T174569 (duration: 00m 46s)
  • 10:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db1103 as multi-instance host for s2 and s4 - T178359 (duration: 00m 46s)
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1103 as multi-instance host for s2 and s4 - T178359 (duration: 00m 47s)
  • 10:12 moritzm: rebooting remaining Parsoid hosts in eqiad for update to Linux 4.9.51 (and to pick up OpenSSL update)
  • 09:44 godog: test upgrade of prometheus 1.8.1 with k8s on prometheus2003 - T177395
  • 09:29 moritzm: rebooting mw1221-mw1235 for update to Linux 4.9.51 (and to pick up OpenSSL update)
  • 09:02 elukey: restart of druid brokers on druid100[1-6] to apply https://gerrit.wikimedia.org/r/390419 - T177459
  • 08:41 marostegui: Deploy alter table on db1093 - T174569
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T174569 (duration: 00m 46s)
  • 08:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098 after alter table - T174569 (duration: 00m 47s)
  • 08:33 moritzm: rebooting mw1238-mw1258 for update to Linux 4.9.51 (and to pick up OpenSSL update)
  • 07:48 moritzm: installing ruby2.3 security updates
  • 07:38 marostegui: Deploy alter table to db1104 - T179106
  • 07:33 marostegui: Optimize wb_terms table on db2052 - T179106
  • 06:44 marostegui: Deploy alter table directly on codfw s5 master (db2023), this will generate lag on codfw - T179793
  • 06:27 marostegui: Deploy alter table on db1098 - T174569
  • 06:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098 - T174569 (duration: 00m 49s)
  • 06:18 marostegui: Deploy alter table db2086 - T179106
  • 04:01 urandom: Decommissioning Cassandra, restbase1007-c.eqiad.wmnet (T179422)
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Nov 13 02:38:38 UTC 2017 (duration 6m 42s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 08m 18s)

2017-11-12

  • 19:39 urandom: Decommissioning Cassandra, restbase1007-b.eqiad.wmnet (T179422)
  • 16:19 bblack: cp4026 - restart backend (mailbox lag)
  • 01:29 urandom: Decommissioning Cassandra, restbase1007-a.eqiad.wmnet (T179422)

2017-11-11

  • 13:52 urandom: Decommissioning Cassandra, restbase2006-c.codfw.wmnet (T179422)
  • 00:48 urandom: Decommissioning Cassandra, restbase2006-b.codfw.wmnet (T179422)

2017-11-10

  • 18:08 smalyshev@tin: Finished deploy [wdqs/wdqs@ccab8ce]: data reload/T176593 (duration: 00m 34s)
  • 18:08 smalyshev@tin: Started deploy [wdqs/wdqs@ccab8ce]: data reload/T176593
  • 17:20 moritzm: uploaded icu 57.1-6+wmf1 for jessie-wikimedia/component/icu57 (co-installable build for ICU migration)
  • 17:11 moritzm: freed some disk space on install1002
  • 16:47 marostegui: Deploy alter table on s6, dbstore1001, dbstore1002 and db1030 - T174569
  • 15:03 moritzm: rebooting remaining API servers in codfw to 4.9.51 (and to pick up OpenSSL updates)
  • 14:57 urandom: Decommissioning Cassandra, restbase2006-a.codfw.wmnet (T179422)
  • 14:17 moritzm: rebooting image scalers in codfw to 4.9.51 (and to pick up OpenSSL updates)
  • 13:10 marostegui: truncate /var/log/nginx/error.log.1 on install1002 as it is filling up
  • 13:09 moritzm: powercycling mw2118, stuck after reboot
  • 13:05 ema: cp4021: restart varnish-be due to mbox lag
  • 12:48 moritzm: rebooting video scalers in codfw to 4.9.51 (and to pick up OpenSSL updates)
  • 11:46 _joe_: restarted all services and repooled scb1001
  • 11:38 moritzm: rebooting mw2163-2199 to 4.9.51 (and to pick up OpenSSL updates)
  • 11:32 _joe_: stopping mobileapps as well on scb1001
  • 11:27 moritzm: rebooting wtp1025 to 4.9.51
  • 11:22 _joe_: stopping changeprop, celery-ores, cpjobqueue on scb1001
  • 11:21 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Disable AdvancedSearch on deployment.beta BETA ONLY T180201 (duration: 00m 46s)
  • 11:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove old comment about db1080 (duration: 00m 46s)
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1055 original weight - T178359 (duration: 00m 46s)
  • 10:59 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: wtp2017.codfw.wmnet
  • 10:55 _joe_: depooling scb1001 from all services while it becomes healthy again
  • 10:52 _joe_: restarting ores on scb1001, causing memory exhaustion
  • 10:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1055 weight - T178359 (duration: 00m 47s)
  • 10:49 moritzm: powercycling wtp2017, stuck after reboot
  • 10:24 marostegui: Deploy schema change on db2089 - T179106
  • 10:11 moritzm: rebooting Parsoid servers in codfw to 4.9.51 (and to pick up OpenSSL updates)
  • 10:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1055 weight - T178359 (duration: 00m 58s)
  • 09:54 marostegui: Compress enwiki on db1105.s1 - T178359
  • 09:45 moritzm: powercycling mw2108, stuck after reboot
  • 09:30 hashar: Upgrading operations-puppet-tests-docker jenkins job to stop passing docker --tty and thus have signals forwarded from 'docker run' - T176747
  • 09:23 moritzm: rebooting mw2097-mw2117 to 4.9.51 (and to pick up OpenSSL updates)
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 with low weight - T178359 (duration: 00m 47s)
  • 09:13 moritzm: powercycling mw2213, stuck after reboot
  • 08:43 moritzm: rebooting mw2200-mw2223 to 4.9.51 (and to pick up OpenSSL updates)
  • 07:50 smalyshev@tin: Finished deploy [wdqs/wdqs@213f864]: (no justification provided) (duration: 00m 33s)
  • 07:49 smalyshev@tin: Started deploy [wdqs/wdqs@213f864]: (no justification provided)
  • 07:27 _joe_: restarting apache on phab1001
  • 07:20 marostegui: Deploy alter table on s3.codfw master (db2018) with replication, this will generate lag on codfw - T174569
  • 06:50 marostegui: Deploy alter table on s5 eqiad master (db1063) - T172207
  • 06:41 marostegui: Stop MySQL on db1055 to copy its content to db1105 - T178359
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T178359 (duration: 00m 49s)
  • 06:39 marostegui: Force a BBU relearn on db1046 - T166141
  • 01:21 aaron@tin: Synchronized php-1.31.0-wmf.7/includes/db: Use the main stash for LBFactory "memStash" parameter (duration: 00m 47s)
  • 01:18 demon@tin: Synchronized docroot/search.wikimedia.org/index.php: minor cleanups, less 500s (duration: 00m 47s)

2017-11-09

  • 22:28 urandom: Decommissioning Cassandra, restbase2004-c.codfw.wmnet (T179422)
  • 20:28 demon@tin: Synchronized php-1.31.0-wmf.7/includes/libs/objectcache/WANObjectCache.php: less spammy error logs (duration: 00m 47s)
  • 20:09 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.7
  • 19:06 ebernhardson@tin: Synchronized php-1.31.0-wmf.7/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off DBN sizing AB test (duration: 00m 51s)
  • 19:00 bblack: cp3030 - end experimentation, puppetizing back to normal config
  • 18:46 bblack: cp3030 - round 2 of ssl_do_wait_shutdown test
  • 18:41 arlolra: Updated Parsoid to 2887b5ad (T178253, T173643, T176728, T180010, T171381, T179757)
  • 18:28 arlolra@tin: Finished deploy [parsoid/deploy@d1c7386]: Updating Parsoid to 2887b5ad (duration: 12m 20s)
  • 18:22 bblack: cp3030: puppet-disabled + manual nginx ssl_do_wait_shutdown config
  • 18:16 arlolra@tin: Started deploy [parsoid/deploy@d1c7386]: Updating Parsoid to 2887b5ad
  • 18:14 moritzm: not rebooting parsoid hosts due to Services deployment window, instead rolling restart of mw2120-mw2139 for kernel update to 4.9.51
  • 18:04 moritzm: rolling restart of parsoid servers in codfw for 4.9.51 kernel update
  • 17:14 urandom: Restarting Cassandra, restbase2005-b.codfw.wmnet (T179419)
  • 16:33 urandom: Restarting Cassandra, restbase2005-a.codfw.wmnet (T179419)
  • 15:12 urandom: Creating mathoid schema (T179419)
  • 15:05 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: Enable AdvancedSearch on beta LABS / BETA ONLY PT2/2 (duration: 00m 49s)
  • 15:04 addshore: last sync was actually "Enable AdvancedSearch on beta"
  • 15:04 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Add AdvancedSearch to extension-list-labs LABS / BETA ONLY PT1/2 (duration: 00m 50s)
  • 14:57 addshore@tin: Synchronized wmf-config/extension-list-labs: Add AdvancedSearch to extension-list-labs LABS / BETA ONLY (duration: 00m 50s)
  • 14:42 zeljkof: EU SWAT finished
  • 14:40 zfilipin@tin: Synchronized wmf-config/StartProfiler.php: SWAT: xenon: encode the request method as a virtual stack frame (duration: 00m 50s)
  • 14:35 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Use a threshold that ores in frwiki can stand (T180115) (duration: 00m 50s)
  • 14:30 zfilipin@tin: Synchronized php-1.31.0-wmf.7/extensions/ContentTranslation/modules/: SWAT: Bring back the overlay support for a specific screen region (T179997) (duration: 00m 50s)
  • 14:25 zfilipin@tin: Synchronized php-1.31.0-wmf.7/extensions/EventBus/EventBus.php: SWAT: Logging improvements Rename logged field to fix logstash mapping (duration: 00m 54s)
  • 14:09 urandom: Decommissioning Cassandra, restbase2004-b.codfw.wmnet (T179422)
  • 13:04 moritzm: rebooting mw1209-mw1220 (app servers) to 4.9.51 (also to pick up new OpenSSL)
  • 12:38 moritzm: rebooting mw1189-mw1208 (API servers) to 4.9.51 (also to pick up new OpenSSL)
  • 12:37 Amir1: ladsgroup@terbium:/srv/mediawiki-staging/php-1.31.0-wmf.6$ mwscript extensions/ORES/maintenance/CheckModelVersions.php --wiki=frwiki (T180115)
  • 11:51 moritzm: rebooting mw1180-mw1188 (app servers) to 4.9.51 (also to pick up new OpenSSL)
  • 11:45 _joe_: removed all local hacks from puppetmaster1001, now it uses rhodium again
  • 11:37 _joe_: cleaning up spurious directories /var/lib/puppet/server/ssl/ca from eqiad's puppetmaster backends, generated due to some error on 8/11/2017
  • 11:10 ema: powercycle cp2008, stuck rebooting
  • 11:03 moritzm: rebooting mw1276-mw1279 (API canaries) to 4.9.51 (also to pick up new OpenSSL)
  • 09:50 moritzm: rolling reboot of scb in eqiad for kernel update (also to pick up openssl updates)
  • 07:56 ema: cp1074 failed rebooting, power-cycled
  • 07:46 _joe_: restarting apache on rhodium after setting --profile --trace in the puppet settings
  • 07:04 legoktm@tin: Synchronized php-1.31.0-wmf.7/resources/: Restore jquery.badge and jquery.placeholder modules (duration: 00m 53s)
  • 02:58 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Nov 9 02:58:30 UTC 2017 (duration 6m 59s)
  • 02:51 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 09m 09s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.6) (duration: 09m 18s)
  • 01:02 twentyafterfour: Not deploying any phabricator updates this week.

2017-11-08

  • 23:45 urandom: Decommissioning Cassandra, restbase2004.codfw.wmnet (T179422)
  • 22:50 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.7 try #2
  • 22:43 ebernhardson@tin: Synchronized php-1.31.0-wmf.6/extensions/CirrusSearch/: Backport cirrus rescore profile refactor to wmf.6 (duration: 01m 02s)
  • 22:21 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to wmf.6
  • 22:09 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.7
  • 21:44 bsitzmann@tin: Finished deploy [mobileapps/deploy@00e60b2]: Update mobileapps to 8e82983 (T178706 T178708 T178333 T170692) (duration: 07m 12s)
  • 21:37 bsitzmann@tin: Started deploy [mobileapps/deploy@00e60b2]: Update mobileapps to 8e82983 (T178706 T178708 T178333 T170692)
  • 21:34 ejegg: re-started donations queue consumer with new thank you letter
  • 21:09 ejegg: updated CiviCRM from b11c591 to ddc3881
  • 20:52 ejegg: turned off donations consumer for ty letter update
  • 20:36 ejegg: updated payments-wiki from a539d27 to d150287
  • 20:33 aaron@tin: Synchronized php-1.31.0-wmf.6/extensions/CentralAuth: Use the proper cache key method in loadFromCache() (duration: 00m 54s)
  • 20:22 apergos: rsync from ms1001 to labstore1006 of dumps, 17T so expect it to take several days
  • 20:10 herron: depooled rhodium via puppetmaster1001 apache config
  • 20:09 demon@tin: Synchronized php: symlink (duration: 00m 49s)
  • 19:57 demon@tin: Synchronized php-1.31.0-wmf.7/extensions/CentralAuth/includes/CentralAuthUser.php: (no justification provided) (duration: 00m 51s)
  • 19:55 urandom: Creating mathoid schema (T179419)
  • 19:31 niharika29@tin: Synchronized php-1.31.0-wmf.6/extensions/ORES/: Store stats of accessing ores service for getting thresholds T179862 (duration: 00m 51s)
  • 19:31 urandom: Creating page restrictions schema (T179421)
  • 19:30 niharika29@tin: Synchronized php-1.31.0-wmf.7/extensions/ORES/: Store stats of accessing ores service for getting thresholds T179862 (duration: 00m 51s)
  • 19:24 niharika29@tin: Synchronized php-1.31.0-wmf.7/extensions/CirrusSearch/: Revert Improve handling of 5xx responses to elasticsearch requests (duration: 01m 02s)
  • 19:14 smalyshev@tin: Finished deploy [wdqs/wdqs@b330bc8]: Update service whitelist (duration: 02m 58s)
  • 19:11 smalyshev@tin: Started deploy [wdqs/wdqs@b330bc8]: Update service whitelist
  • 18:27 demon@tin: Synchronized wmf-config/: Dropping old PrivateSettings symlink (ducks and covers) (duration: 00m 52s)
  • 18:19 demon@tin: Synchronized phpcs.xml: no-op (duration: 00m 50s)
  • 17:58 urandom: Restarting Cassandra, restbase2005-[abc]
  • 17:51 urandom: Clearing snapshots in RESTBase legacy Cassandra cluster (T179417)
  • 17:47 marostegui: Deploy alter table on s1 codfw primary master (db2048) with replication, this will generate lag on codfw - T174569
  • 17:43 mobrovac: restbase truncate the default parsoid storage group's tables for T179417
  • 17:41 urandom: Restarting Cassandra, restbase2001-[abc]
  • 17:19 urandom: Restarting Cassandra, restbase2003-[abc]
  • 17:09 herron: regenerated rhodium puppet certificate
  • 17:08 urandom: Restarting Cassandra, restbase1009-[abc]
  • 16:58 urandom: Restarting Cassandra, restbase1008-[abc]
  • 16:49 awight@tin: Finished deploy [ores/deploy@82a13ae]: Roll back scb1002 (duration: 02m 37s)
  • 16:47 awight@tin: Started deploy [ores/deploy@82a13ae]: Roll back scb1002
  • 16:45 awight@tin: Finished deploy [ores/deploy@1b0e59f]: Try to purge specter of revscoring 1 (duration: 05m 45s)
  • 16:40 awight@tin: Started deploy [ores/deploy@1b0e59f]: Try to purge specter of revscoring 1
  • 16:37 urandom: Restarting Cassandra, restbase1010-[abc]
  • 16:37 godog: disregard message about thumbor rolling-restart, upgrade already done and only thumbor1001 rebooted now
  • 16:34 awight@tin: Finished deploy [ores/deploy@82a13ae]: Fix ORES on scb1002 (duration: 00m 03s)
  • 16:34 awight@tin: Started deploy [ores/deploy@82a13ae]: Fix ORES on scb1002
  • 16:29 godog: roll-restart thumbor in eqiad for kernel upgrade
  • 16:11 mobrovac@tin: Finished deploy [restbase/deploy@c5dd1e2]: Switch wiktionary definitions to use the next-gen storage, take 2b - T179420 (duration: 07m 22s)
  • 16:04 mobrovac@tin: Started deploy [restbase/deploy@c5dd1e2]: Switch wiktionary definitions to use the next-gen storage, take 2b - T179420
  • 16:02 mobrovac@tin: Finished deploy [restbase/deploy@c5dd1e2]: Switch wiktionary definitions to use the next-gen storage, take 2 - T179420 (duration: 00m 13s)
  • 16:02 mobrovac@tin: Started deploy [restbase/deploy@c5dd1e2]: Switch wiktionary definitions to use the next-gen storage, take 2 - T179420
  • 16:01 otto@tin: Finished deploy [eventlogging/analytics@03285e4]: Reverting EventCapsule update and fixes, processes got restarted too early (duration: 00m 02s)
  • 16:01 otto@tin: Started deploy [eventlogging/analytics@03285e4]: Reverting EventCapsule update and fixes, processes got restarted too early
  • 15:33 urandom: Decommissioning restbase2001-c.codfw.wmnet (T179422)
  • 15:23 _joe_: testing changes on rhodium regarding hostprivkey,hostcert
  • 15:21 ema: eqiad lvs reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m
  • 15:17 otto@tin: Finished deploy [eventlogging/analytics@02c5a6b]: EventCapsule update and fixes, this is no-op as is. T179625 (duration: 00m 04s)
  • 15:17 otto@tin: Started deploy [eventlogging/analytics@02c5a6b]: EventCapsule update and fixes, this is no-op as is. T179625
  • 15:01 otto@tin: Started restart [eventlogging/eventbus@41e3418]: Bumping worker processes to 16 on all targets: T180017
  • 14:55 otto@tin: Started restart [eventlogging/eventbus@41e3418]: Bumping worker processes to 16: T180017
  • 14:54 ema: esams lvs reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m
  • 14:54 otto@tin: Finished deploy [eventlogging/eventbus@41e3418]: (no justification provided) (duration: 00m 12s)
  • 14:54 otto@tin: Started deploy [eventlogging/eventbus@41e3418]: (no justification provided)
  • 14:41 ema: powercycle cp1050 (failed reboot)
  • 14:29 addshore@tin: Synchronized php-1.31.0-wmf.7/docs/uidesign/mediawiki.diff.html: SWAT Add render moved paragraphs marker in diff view PT 2/2 DOCS ONLY (duration: 00m 50s)
  • 14:28 addshore@tin: Synchronized php-1.31.0-wmf.7/resources/src/mediawiki/mediawiki.diff.styles.css: SWAT Add render moved paragraphs marker in diff view PT 1/2 (duration: 00m 51s)
  • 14:16 volans: upgrading cumin to v1.3.0 on prod and WMCS cumin masters
  • 14:10 zfilipin@tin: Synchronized php-1.31.0-wmf.6/extensions/EventBus/EventBus.php: SWAT: Logging improvements. (duration: 00m 52s)
  • 14:08 ema: codfw lvs reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m
  • 13:03 hasharAway: Upgrading jenkins on contint1001/contint2001
  • 12:54 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: name=phab2001-vcs.codfw.wmnet
  • 12:53 bblack: restart pybal on lvs2002 for git-ssh.codfw deploy
  • 12:51 bblack: restart pybal on lvs2005 for git-ssh.codfw deploy
  • 12:46 mutante: osmium - re-enabling puppet - temp test is over and will be decom'ed
  • 12:41 mutante: krypton (misc PHP apps, scholarships.wm, iegreview.wm, grafana, racktables, burrow) rebooting for kernel upgrade
  • 12:35 mutante: bromine (misc static sites, annual/transparency/static-bz) - rebooting for kernel upgrade
  • 12:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1051 original weight after maintenance - T178359 (duration: 00m 50s)
  • 12:03 mutante: alcyone (url-downloader) rebooting for kernel upgrade
  • 11:53 mutante: netmon1003 (servermon) - rebooting for kernel upgrade
  • 11:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1051 weight after maintenance - T178359 (duration: 00m 50s)
  • 11:48 moritzm: installed openjdk-8/openssl updates and new kernels on restbase*
  • 11:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1051 weight after maintenance - T178359 (duration: 00m 50s)
  • 11:12 moritzm: installing jenkins security update on releases*
  • 11:10 moritzm: imported jenkins 2.73.3 to apt.wikimedia.org
  • 10:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 with low weight after maintenance - T178359 (duration: 01m 01s)
  • 10:57 ema: ulsfo lvs reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m
  • 10:42 mutante: actinium - reboot for kernel upgrade (url-downloader)
  • 10:40 ema: varnish 5.1.3-1wm2 built and uploaded to apt.w.o (experimental)
  • 10:39 mutante: aluminium - reboot for kernel upgrade (url-downloader)
  • 10:28 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1005.eqiad.wmnet
  • 10:18 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1004.eqiad.wmnet
  • 10:07 elukey: reboot aqs100[4-9] for jvm and kernel updates
  • 09:37 mutante: planet1001, alsafi (url-downloader) - apt autoremove; reboot for kernel upgrade
  • 09:32 mutante: restarting ircecho (icinga-wm)
  • 09:26 mutante: ununpentium (rt.wikimedia.org): apt-get autoremove; reboot for kernel upgrade
  • 09:23 mutante: rutherfordium (people.wikimedia.org): apt-get autoremove; reboot for kernel upgrade
  • 09:22 Pchelolo: restart cassandra-a on restbase1010
  • 09:09 Pchelolo: restart restbase on 1013 and 1015
  • 08:58 mutante: planet2001 - apt autoremove; reboot for kernel upgrade
  • 08:40 ema: resume cache_text/upload rolling reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m and 1.1.0g
  • 07:55 marostegui: Deploy alter table on s1 - on codfw master (db2048) with replication enabled - T172207
  • 07:46 marostegui: Deploy alter table on s2 - on codfw master (db2017) with replication enabled - T172207
  • 07:21 marostegui: Deploy alter table on s3 - on codfw master (db2018) with replication enabled - T172207
  • 07:13 marostegui: Deploy alter table on s7 - on codfw master (db2029) with replication enabled - T172207
  • 07:13 krinkle@tin: Synchronized docroot/noc/conf/: I2e51e783a (duration: 01m 06s)
  • 06:57 marostegui: Deploy alter table on s6 - on codfw master (db2028) with replication enabled - T172207
  • 06:26 marostegui: Add 330G to db2023 partition to make sure the alter over logging table runs fine - T174569
  • 06:16 Krinkle: Restarted uwsgi-graphite-web service on graphite2001
  • 06:16 Krinkle: Restarted uwsgi-graphite-web service on graphite1001
  • 06:15 marostegui: Stop MySQL on db1051 to copy its content to db1105.s1 - T178359
  • 06:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T178359 (duration: 00m 51s)
  • 05:45 Krinkle: rm -rf /var/lib/carbon/whisper/MediaWiki/wanobjectcache/centralauth_user_* on graphite1001 and graphite2001 for T179999
  • 03:17 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Nov 8 03:17:53 UTC 2017 (duration 7m 13s)
  • 03:10 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.7) (duration: 15m 16s)
  • 02:44 aaron@tin: Synchronized php-1.31.0-wmf.7/tests: Deploy 087f2d579a9f which reverts 4432e898be0 due to statsd spam (duration: 01m 14s)
  • 02:42 aaron@tin: Synchronized php-1.31.0-wmf.7/includes: Deploy 087f2d579a9f which reverts 4432e898be0 due to statsd spam (duration: 01m 40s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.6) (duration: 07m 38s)

2017-11-07

  • 23:57 urandom: Decommissioning restbase2001-b.codfw.wmnet (T179422)
  • 22:50 ejegg: re-enabled thank you mailer
  • 22:49 ejegg: updated CiviCRM from 85e89c6 to dc1b279
  • 22:39 hoo: Reset a global account's email, per T179950
  • 22:28 ejegg: updated CiviCRM from 122ba65 to 85e89c6
  • 21:56 ejegg: updated CiviCRM from dc9d054 to 122ba65
  • 21:20 addshore@tin: Synchronized wmf-config/Wikibase-buildentry.php: T176948 Remove Shared Cache settings from Wikibase-buildentry (duration: 00m 50s)
  • 21:16 addshore@tin: Synchronized wmf-config/Wikibase.php: T176948 Stop using wgWikibaseSharedCacheKeyPrefix from Wikidata build (duration: 00m 49s)
  • 21:09 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T176948 Remove wmgWikibaseUseConfigFromWikidataBuild PT 3/3 (LABS) (duration: 00m 50s)
  • 21:07 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T176948 Remove wmgWikibaseUseConfigFromWikidataBuild PT 2/3 (duration: 00m 52s)
  • 21:06 addshore@tin: Synchronized wmf-config/Wikibase.php: T176948 Remove wmgWikibaseUseConfigFromWikidataBuild PT 1/3 (duration: 00m 50s)
  • 21:02 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T176948 wmgWikibaseUseConfigFromWikidataBuild false for all of PROD (duration: 00m 49s)
  • 21:01 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T176948 Load wikibase build from mediawiki-config for wikidataclient (duration: 00m 50s)
  • 20:59 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T176948 Load wikibase build from mediawiki-config for group0 & group1 (duration: 00m 50s)
  • 20:54 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T176948 Load wikibase build from mediawiki-config for hewiki (duration: 00m 49s)
  • 20:51 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T176948 Load wikibase build from mediawiki-config for wikidatawiki (duration: 00m 50s)
  • 20:48 herron: puppet issue cleared after reverting 386666. restarting ircecho on einsteinium
  • 20:45 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T176948 #1 #2 #3 Load wikibase build from mediawiki-config for BETA ONLY (duration: 00m 50s)
  • 20:28 addshore@tin: Synchronized wmf-config/Wikibase.php: Add loading of wikibase extensions from build PT 3/3 (duration: 00m 50s)
  • 20:27 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Add loading of wikibase extensions from build PT 2/3 (duration: 00m 50s)
  • 20:25 addshore@tin: Synchronized wmf-config/Wikibase-buildentry.php: Add loading of wikibase extensions from build PT 1/3 (duration: 00m 49s)
  • 20:20 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Remove unused wmgUseWikibasePropertySuggester PT 2/2 (duration: 00m 50s)
  • 20:19 addshore@tin: Synchronized wmf-config/CommonSettings.php: Remove unused wmgUseWikibasePropertySuggester PT 1/2 (duration: 00m 50s)
  • 20:06 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.7
  • 19:44 volans: restarted apache2 on rhodium (puppet master failing)
  • 19:31 herron: restart apache2 service on rhodium
  • 19:12 demon@tin: Finished scap: wmf.7 bootstrap (duration: 48m 15s)
  • 19:03 awight: begin stress test on ores*
  • 18:58 awight@tin: Finished deploy [ores/deploy@29905e5]: test deployment to repair ores1009 (non-production) (duration: 00m 31s)
  • 18:57 awight@tin: Started deploy [ores/deploy@29905e5]: test deployment to repair ores1009 (non-production)
  • 18:56 awight@tin: Finished deploy [ores/deploy@29905e5]: test deployment to repair ores1008 (non-production) (duration: 00m 32s)
  • 18:56 awight@tin: Started deploy [ores/deploy@29905e5]: test deployment to repair ores1008 (non-production)
  • 18:50 awight@tin: Finished deploy [ores/deploy@29905e5]: test deployment to repair ores1002 (non-production) (duration: 00m 33s)
  • 18:49 awight@tin: Started deploy [ores/deploy@29905e5]: test deployment to repair ores1002 (non-production)
  • 18:39 volans: slowly running puppet on failed hosts with cumin (concurrency=5)
  • 18:38 awight@tin: Finished deploy [ores/deploy@29905e5]: test deployment to ores* (non-production) (duration: 00m 20s)
  • 18:38 awight@tin: Started deploy [ores/deploy@29905e5]: test deployment to ores* (non-production)
  • 18:36 _joe_: restarting apache2 on rhodium
  • 18:24 demon@tin: Started scap: wmf.7 bootstrap
  • 18:16 awight@tin: Finished deploy [ores/deploy@29905e5]: test deployment to ores* (non-production) (duration: 01m 04s)
  • 18:15 awight@tin: Started deploy [ores/deploy@29905e5]: test deployment to ores* (non-production)
  • 18:06 herron: restarting puppetdb service on nitrogen
  • 17:56 urandom: Decommissioning restbase2001-a.codfw.wmnet (T179422)
  • 17:55 elukey: stop ircecho on einsteinium (puppet shower from nitrogen)
  • 17:40 ema: stop cache_text/upload rolling reboots, resuming tomorrow
  • 17:38 urandom: Restart Cassandra, restbase1010-{a,b,c}.eqiad.wmnet (T178177)
  • 17:33 urandom: Clearing Cassandra snapshots (T179422)
  • 17:32 moritzm: rolling reboot of scb in codfw to pick up new kernel (and openssl updates)
  • 17:02 urandom: Restarting Cassandra, restbase2001-{a,b,c} to apply OpenJDK upgrade
  • 16:47 demon@tin: Synchronized multiversion/submodules.json: no op (duration: 00m 47s)
  • 16:12 ema: start cache_text/upload rolling reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m and 1.1.0g
  • 15:46 gehel: rolling restart of cassandra maps-test for logging change
  • 15:43 godog: roll-restart thumbor in eqiad for kernel upgrade
  • 15:37 urandom: T179420: recreating wiktionary definition schemas
  • 15:29 mobrovac@tin: Finished deploy [restbase/deploy@eab2948]: revert definition switch, wrong schema - T179420 (duration: 06m 46s)
  • 15:22 mobrovac@tin: Started deploy [restbase/deploy@eab2948]: revert definition switch, wrong schema - T179420
  • 15:12 _joe_: added a runner for htmlCacheUpdate on cewiki too
  • 15:10 mobrovac@tin: Finished deploy [restbase/deploy@c5dd1e2]: Switch wiktionary definitions to use the next-gen storage - T179420 (duration: 07m 52s)
  • 15:02 mobrovac@tin: Started deploy [restbase/deploy@c5dd1e2]: Switch wiktionary definitions to use the next-gen storage - T179420
  • 14:52 godog: roll-restart thumbor for kernel upgrades
  • 14:38 mobrovac: restbase creating wiktionary definition schemas for T179420
  • 14:36 godog: reboot wezen for kernel upgrade
  • 14:33 godog: reboot lithium for kernel upgrade
  • 14:26 elukey: rolling restart of kafka on kafka-jumbo* for jvm security updates
  • 14:20 moritzm: rebooting mw canaries to 4.9.51 kernel (also picking up openssl/openssl1.1 updates)
  • 14:19 moritzm: rearming keyholder on naos
  • 14:14 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ShortUrl on pa.wiki - T178919 (duration: 00m 46s)
  • 14:11 hashar: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=pawiki ShortUrl - T178919
  • 14:08 moritzm: rebooting tureis for kernel update
  • 14:07 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: add _ in appendix talk namespace at mywiktionary - T179907 (duration: 00m 45s)
  • 14:02 moritzm: actually holding mw canary reboots until SWAT is over
  • 14:01 moritzm: rebooting mw canaries to 4.9.51 kernel (also picking up openssl/openssl1.1 updates)
  • 13:50 herron: rebooted fermium (lists) for kernel update
  • 13:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comment about db1105 current status (duration: 00m 47s)
  • 13:43 herron: starting rolling mx reboots for kernel update
  • 13:31 marostegui: Deploy schema change on s5 codfw master (db2023) with replication, this will generate lag on codfw - T174569
  • 13:19 moritzm: rebooting naos for kernel update
  • 13:17 gehel: reboot maps eqiad cluster for upgrades
  • 13:10 moritzm: rebooting wasat for kernel update
  • 12:50 mutante: hafnium, tungsten: groupdel perf-roots to go with gerrit:389663 (T179728)
  • 12:00 mobrovac@tin: Finished deploy [restbase/deploy@eab2948]: Use the new storage for wikidata.org - T179417 (duration: 08m 14s)
  • 11:52 mobrovac@tin: Started deploy [restbase/deploy@eab2948]: Use the new storage for wikidata.org - T179417
  • 11:43 mobrovac: restbase truncating cassandra 2 non-WP tables for T179417
  • 11:42 mobrovac: restbase truncating cassandra 2 non-WP tables for T179420
  • 11:29 moritzm: installing java security updates/restarting cassandra on restbase2001 (cassandra3 node)
  • 11:07 ema: cache_misc rolling reboots: upgrading kernel to 4.9.51, libssl to 1.0.2m and 1.1.0g
  • 10:59 ema: reboot cp3007: upgrading kernel to 4.9.51, libssl to 1.0.2m and 1.1.0g
  • 10:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db2092 as recentchanges multi-instance host on s1 and s3 T178359 (duration: 00m 45s)
  • 10:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2092 as recentchanges multi-instance host on s1 and s3 T178359 (duration: 00m 45s)
  • 10:41 moritzm: restarting ntpd on dns recursors to pick up openssl update
  • 10:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2091 as recentchanges multi-instance host on s2 and s4 T178359 (duration: 00m 45s)
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db2091 as recentchanges multi-instance host on s2 and s4 T178359 (duration: 00m 46s)
  • 10:24 elukey: create staging database on db1108 (researchers scratch pad) - T177405
  • 10:19 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2086 as recentchanges multi-instance host on s5 and s7 T178359 (duration: 00m 45s)
  • 10:09 gehel: reboot of maps codfw cluster for upgrades
  • 09:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db2086 as recentchanges multi-instance host on s5 and s7 T178359 (duration: 00m 45s)
  • 09:58 paravoid: updating certspotter to 0.5 in apt and tegmen/einsteinium
  • 09:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db2089 as recentchanges multi-instance host on s6 and s5(s8) T178359 (duration: 00m 46s)
  • 09:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2089 as recentchanges multi-instance host on s6 and s5(s8) T178359 (duration: 00m 45s)
  • 09:13 marostegui: Stop MySQL on db1038 - host to be decommissioned - T177911
  • 09:11 moritzm: installing java security updates/restarting cassandra on restbase2002
  • 09:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1038 from config - T177911 (duration: 00m 45s)
  • 09:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3db0cc4]: Do not set Host header for requests to jobrunner (duration: 00m 49s)
  • 09:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1038 from config - T177911 (duration: 00m 47s)
  • 09:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@3db0cc4]: Do not set Host header for requests to jobrunner
  • 08:49 marostegui: Run redact_sanitarium for hifwiktionary on db1095 (sanitarium) - T173647
  • 08:48 marostegui: Optimize pagelinks and templatelinks on s7 master - db1062 - T174509
  • 07:18 mobrovac@tin: Started restart [electron-render/deploy@8dd5f13]: Electron stuck, restarting - T174916
  • 07:11 ema: reboot pinkunicorn for kernel (4.9.51) and openssl (1.0.2m) upgrades
  • 06:19 marostegui: Deploy alter table on s7 codfw master (db2029) with replication, this will cause lag in codfw - T174569
  • 02:29 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Nov 7 02:29:54 UTC 2017 (duration 6m 39s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.6) (duration: 07m 19s)

2017-11-06

  • 22:34 bawolff: Deploy patch for