
Server Admin Log

Revision as of 23:30, 5 February 2020 by imported>Stashbot (ebernhardson: delete search indices duplicated on multiple clusters for: hywwiki, chrwiktionary, gcrwiki, mnwwiki, noboard_chapterswikimedia nqowiki nrmwiki outreachwiki and srnwiki)

2020-02-05

  • 23:30 ebernhardson: delete search indices duplicated on multiple clusters for: hywwiki, chrwiktionary, gcrwiki, mnwwiki, noboard_chapterswikimedia, nqowiki, nrmwiki, outreachwiki and srnwiki
  • 23:08 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@a51f927]: Update mobileapps to a7928fa (duration: 10m 48s)
  • 22:57 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@a51f927]: Update mobileapps to a7928fa
  • 22:07 mutante: Gerrit - added ppchelko to 'wmf-deployment' Gerrit group (he is already in deployment admin group) (T244389)
  • 21:37 arlolra@deploy1001: Finished deploy [parsoid/deploy@01d9d3d]: Updating Parsoid to 74730a3 (duration: 03m 07s)
  • 21:33 arlolra@deploy1001: Started deploy [parsoid/deploy@01d9d3d]: Updating Parsoid to 74730a3
  • 21:31 mutante: killing and restarting wikibugs, it was reporting each update twice
  • 20:51 joal@deploy1001: Finished deploy [analytics/refinery@a47f0d5] (thin): Analytics regular weekly deploy (duration: 00m 07s)
  • 20:51 joal@deploy1001: Started deploy [analytics/refinery@a47f0d5] (thin): Analytics regular weekly deploy
  • 20:51 joal@deploy1001: Finished deploy [analytics/refinery@a47f0d5]: Analytics regular weekly deploy (duration: 13m 28s)
  • 20:50 mutante: ores1004 - systemctl start celery-ores-worker
  • 20:45 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.18 refs T233866 (duration: 01m 07s)
  • 20:44 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.18 refs T233866
  • 20:37 joal@deploy1001: Started deploy [analytics/refinery@a47f0d5]: Analytics regular weekly deploy
  • 20:34 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw1269.eqiad.wmnet
  • 20:25 dzahn@cumin1001: conftool action : set/weight=25; selector: name=mw1267.eqiad.wmnet
  • 20:25 mutante: mw1267 restarting php7.2-fpm
  • 20:21 joal@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@714e2d0]: Deploy bug fix version (duration: 00m 08s)
  • 20:21 joal@deploy1001: Started deploy [analytics/hdfs-tools/deploy@714e2d0]: Deploy bug fix version
  • 20:09 twentyafterfour: Preparing to deploy wmf/1.35.0-wmf.18 to group1 wikis refs T233866
  • 20:09 moritzm: installing git security updates for jessie
  • 20:00 moritzm: installing unzip security updates
  • 19:44 mutante: LDAP - added spramduya to wmf group (T243802)
  • 19:38 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Clean up VisualEditor settings (duration: 01m 07s)
  • 19:38 ebernhardson: restart mjolnir-kafka-bulk-daemon across eqiad, daemons appear stuck and not reading new messages
  • 19:19 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T238029 Enable InukaPageView logging on production Wikipedias (duration: 01m 07s)
  • 19:15 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Sync back revert of 975b4bbb9 (duration: 01m 06s)
  • 19:10 jforrester@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 18:35 vgutierrez: pooling cp5012 - T242093
  • 18:23 vgutierrez: rebooting cp5012 - T242093
  • 18:21 elukey: restart memcached on mc1025 with 8 threads (rollback - revert https://gerrit.wikimedia.org/r/#/c/570370/, run puppet, restart memcached)
  • 17:51 mutante: ganeti1017 - rebooting (not in use yet)
  • 17:34 reedy@deploy1001: Synchronized php-1.35.0-wmf.18/languages/: T244300 (duration: 01m 13s)
  • 17:33 reedy@deploy1001: Synchronized php-1.35.0-wmf.18/includes/: T244300 (duration: 01m 14s)
  • 16:53 urandom: Sessionstore deployment (mediawiki-config) is done
  • 16:37 ppchelko@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:569678 Config: Enable sessionstore on group0 and 1 T243106 (duration: 01m 08s)
  • 16:25 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T232140 Restore wgLogoHD to wikis without a MinervaCustomLogos defined (duration: 01m 09s)
  • 16:07 elukey: update puppet compiler's facts
  • 15:54 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 15:52 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
  • 15:29 effie: restart php-fpm on canaries - T236800
  • 15:24 effie: Rollout php-apcu_5.1.17+4.0.11-1+0~20190217111312.9+stretch~1.gbp192528+wmf2 to api, app and jobrunner canaries - T236800
  • 15:15 vgutierrez: depooling & reimaging cp5012 as buster - T242093
  • 15:12 ema: cp: unset Accept-Encoding from ats-be requests to applayer T242478
  • 14:35 vgutierrez: updating acme-chief to version 0.24 - T244236
  • 14:32 _joe_: restarting mcrouter at nice -19 on mw1331 for testing effects of that change
  • 14:30 vgutierrez: upload acme-chief 0.24 to apt.wm.o (buster) - T244236
  • 14:26 XioNoX: push initial flowspec config to all routers
  • 14:23 vgutierrez: pooling cp5006 - T242093
  • 14:13 ema: cp1075: back to leaving Accept-Encoding as it is due to unrelated applayer issues T242478
  • 13:46 marostegui: Decrease buffer pool size on db1107 for testing - T242702
  • 13:45 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 13:43 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime
  • 13:42 akosiaris: undo the manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348. Verify hypothesis that this should cause increased latency. Restart php-fpm
  • 13:41 ema: cp1075: unset Accept-Encoding on origin server requests T242478
  • 13:39 Amir1: EU SWAT is done
  • 13:38 ema: cp: disable puppet and merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/570311/ T242478
  • 13:35 XioNoX: rollback traffic steering off cr2-eqord
  • 13:29 akosiaris: manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348. Verify hypothesis that this should cause increased latency
  • 13:25 XioNoX: reboot cr2-eqord for software upgrade - yaaaaa
  • 13:24 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.18/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: Cache PropertyInfoLookup internally (T243955) (duration: 01m 07s)
  • 13:17 XioNoX: increase ospf cost for cr2-eqord links