Server admin log/Archive 25: Difference between revisions

== September 30 ==
* 23:29 logmsgbot: spage Synchronized wmf-config/InitialiseSettings.php: Create log group for Echo (duration: 00m 11s)
* 23:27 logmsgbot: spage Synchronized php-1.25wmf1/extensions/Echo: Echo no-op (change reverted) (duration: 00m 09s)
* 22:55 ori: re-enabling puppet on mw1019
* 22:36 ori: disabling puppet on mw1019 to enable debug logging in apache
* 22:09 mutante: removing linne from DNS - was already shut down about 24 hours before
* 21:57 K4-713: updated prod civicrm to 477a5107a0c93ceac5214
* 21:44 ori: Spike of bitter irony from Nemo_bis on #wikimedia-operations starting 21:43 UTC
* 21:33 logmsgbot: ori Synchronized php-1.25wmf1/languages/Language.php: I672c699c (2/2) (duration: 00m 03s)
* 21:33 logmsgbot: ori Synchronized php-1.25wmf1/includes/specialpage/SpecialPageFactory.php: I672c699c (1/2) (duration: 00m 07s)
* 21:23 Nemo_bis: widespread reproducible 503 errors on wikidata and elsewhere
* 20:55 andrewbogott: powering down virt0, just to see what breaks
* 20:48 andrewbogott: shutting down pdns on virt0
* 20:48 andrewbogott: shutting down opendj on virt (temporary, a preview of tomorrow)
* 18:50 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 16s)
* 18:49 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 15s)
* 18:41 mutante: pc1001-1003 - can't generate tmp files for percona monitoring checks -> puppet fail
* 18:24 mutante: killing silver from icinga and puppet
* 18:23 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 16s)
* 18:21 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.25wmf1
* 18:05 logmsgbot: ori Synchronized wmf-config/HHVMRequestInit.php: (no message) (duration: 00m 07s)
* 18:04 K4-713: re-enabled all queue consumers
* 17:56 ejegg: updated civicrm from e83c999f39e6ae847d9b48e38c8c825fc10d1635 to b6c350f620c8dc1f3410de179c19cbcbdeb62270
* 17:19 K4-713: disabled qc jobs and TY mail send for pending civi deploy
* 15:45 hashar: Updating our Jenkins job builder fork  686265a..ee80dbc (no job changed)
* 15:42 bblack: rebooting mexia
* 15:33 logmsgbot: demon Synchronized docroot/bits/favicon/wikipedia.ico: Favicons are my favorite icons, especially when they're only 18% of the size of the original (duration: 00m 04s)
* 15:16 logmsgbot: demon Synchronized php-1.25wmf1/extensions/Wikidata: (no message) (duration: 00m 11s)
* 15:14 logmsgbot: demon Synchronized php-1.25wmf1/extensions/VisualEditor: (no message) (duration: 00m 08s)
* 15:12 akosiaris: merging https://gerrit.wikimedia.org/r/#/c/163735/1, changing the LDAP master from sanger to ldap-mirror for inbound mail
* 15:12 andrewbogott: running sync-common on virt1000
* 15:12 logmsgbot: demon Synchronized visualeditor.dblist: (no message) (duration: 00m 04s)
* 15:11 logmsgbot: demon Synchronized visualeditor-default.dblist: (no message) (duration: 00m 04s)
* 15:06 logmsgbot: demon Synchronized wmf-config/Wikibase.php: (no message) (duration: 00m 04s)
* 15:06 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
* 14:26 _joe_: restarted apache on mw1196, lots of apc errors
* 14:22 logmsgbot: oblivian gracefulled all apaches
* 12:10 mark: Stopped exim daemon on mchenry
* 09:41 godog: removed obsolete /etc/puppet/hiera from strontium and palladium, /etc/puppet/hieradata is the new location
* 09:24 godog: reboot ms-be2001 as a test
* 04:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 30 04:18:48 UTC 2014 (duration 18m 47s)
* 03:17 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-30 03:17:15+00:00
* 02:41 logmsgbot: ori Synchronized 503.html: Ia88b306ef: Make the 503 error page consistent with other 5xx error pages (duration: 00m 08s)
* 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-30 02:34:07+00:00
* 01:00 Krinkle: Jenkins connection seemed in order with integration-slave1007 and 8, but disconnecting and relaunching the slave agents immediately resulted in them getting jobs assigned. Cause unknown, problem resolved for now.
* 00:58 Krinkle: integration-slave1007 and integration-slave1008 have not gotten any jobs in the past 24h. integration-slave1006 however has gotten loads of action. Investigating load balancing issue.
* 00:24 mutante: linne - shutting down, revoking puppet cert, salt key, puppet/icinga ...
* 00:12 logmsgbot: maxsem Synchronized w/skins-1.5: (no message) (duration: 00m 03s)
* 00:12 MaxSem: https://gerrit.wikimedia.org/r/#/c/162520/ broke stuff, reverted
* 00:10 logmsgbot: maxsem Synchronized live-1.5: (no message) (duration: 00m 03s)
 
== September 29 ==
* 23:59 logmsgbot: maxsem Synchronized w: https://gerrit.wikimedia.org/r/#/c/162520/ (duration: 00m 03s)
* 23:58 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/163773/ (duration: 00m 03s)
* 23:50 logmsgbot: maxsem Synchronized php-1.25wmf1/extensions/MultimediaViewer/: second try... (duration: 00m 04s)
* 23:42 logmsgbot: maxsem Finished scap: SWATting a bunch of stuff (duration: 18m 44s)
* 23:32 andrewbogott: stopped apache, nova-scheduler, keystone, puppetmaster on virt0
* 23:31 bd808: /var/lib/jenkins-slave/tmpfs 100% full on lanthanum.eqiad.wmnet
* 23:26 andrewbogott: disabling puppet on virt0 so I can kill off services one by one...
* 23:23 logmsgbot: maxsem Started scap: SWATting a bunch of stuff
* 23:17 logmsgbot: maxsem Synchronized docroot/: <mutante> he killed the dolphin (duration: 00m 06s)
* 23:13 Reedy: dist-upgraded logstash1001 and reboot
* 22:47 Reedy: dist-upgrade logstash1002 and reboot
* 22:36 Reedy: dist-upgrade on logstash1003 and rebooting
* 22:34 Reedy: restarted elasticsearch on logstash1003 post java upgrades
* 22:30 Reedy: packages upgraded on logstash1002
* 22:28 mutante: silver - shutting down, wait with wiping it for a few days, just in case
* 22:28 Reedy: packages upgraded on logstash1001
* 22:24 Reedy: elasticsearch upgraded to 1.3.2 on logstash1003
* 22:18 andrewbogott: renaming labs-ns1 to labs-ns0 and labs-ns2 to labs-ns1
* 22:02 Reedy: elasticsearch upgraded to 1.3.2 on logstash1002
* 22:01 mutante: silver - revoke puppet cert, salt-key, stopping services, disable monitoring
* 21:58 mutante: stopping udp2log-vumi on silver - not needed anymore per Yuvipanda
* 21:12 Reedy: elasticsearch upgraded to 1.3.2 on logstash1001
* 20:50 bd808: Ran sync-common on tmh1002.eqiad.wmnet for cscott's failed sync-dir there
* 20:49 bd808: Ran sync-common on tmh1001.eqiad.wmnet for cscott's failed sync-dir there
* 20:29 logmsgbot: cscott Synchronized wmf-config: Switch default PDF renderer to OCG (duration: 00m 15s)
* 20:04 subbu: deployed Parsoid version deed30b2
* 19:41 ottomata: restarted varnishkafka on cp3019 to troubleshoot drerrs
* 19:26 Reedy: doing rolling upgrade of elasticsearch on logstash100[1-3]
* 17:59 cscott: updated OCG to version 89d8f29a24295b05d0643abe976fea83b56575c9
* 17:58 logmsgbot: ori Synchronized php-1.24wmf22/includes/password/Pbkdf2Password.php: I3b0a1de69: Test for string in Pbkdf2Password::crypt() (duration: 00m 05s)
* 17:47 bblack: stopped powerdns and disabled puppet on virt1000 to prevent further cache pollution w/ bad data in public caches
* 15:57 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT enable collection extension svwikiversity (duration: 00m 06s)
* 15:53 hashar: Zuul jobs reregistered
* 15:46 hashar: Zuul lost all Jenkins jobs :(
* 15:24 logmsgbot: manybubbles Synchronized php-1.24wmf22/extensions/UploadWizard/: SWAT update UploadWizard (duration: 00m 05s)
* 15:17 logmsgbot: manybubbles Synchronized php-1.25wmf1/extensions/Wikidata/: SWAT update wikidata to fix hhvm issues. (duration: 00m 14s)
* 15:05 logmsgbot: manybubbles Synchronized wmf-config/wikitech.php: SWAT sync wikitech file - is a noop I believe (duration: 00m 05s)
* 15:02 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT fix config type of flow. (duration: 00m 06s)
* 13:32 hashar: Restarted Zuul
* 13:27 hashar: Zuul: tweaking configuration files {{gerrit|162584}}
* 09:31 godog: deployed new swift ring to eqiad-prod
* 08:21 hashar: Restarting Jenkins to have a plugin installed/loaded properly
* 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep 29 03:25:11 UTC 2014 (duration 25m 10s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-29 02:26:50+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-29 02:15:34+00:00
* 02:14 bblack: restarting squid on carbon (webproxy)
 
== September 28 ==
* 23:27 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: switch db1042 load groups to db1056 (duration: 00m 06s)
* 23:17 springle: powercycle db1042
* 23:15 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1042, locked up (duration: 00m 07s)
* 23:12 bblack: restarted apache on mw1123 + mw1196
* 23:11 bblack: test
* 23:11 bblack: restarted apache on mw1123 + mw1196
* 20:28 ori: Puppet failures appear to be caused by apt-get timeouts
* 10:09 _joe_: updated bash (again) across the whole cluster
* 03:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep 28 03:24:20 UTC 2014 (duration 24m 19s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-28 02:28:14+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-28 02:17:01+00:00
 
== September 27 ==
* 18:35 logmsgbot: ori Synchronized php-1.25wmf1/extensions/Scribunto/engines/LuaSandbox/Engine.php: Capture traces for bug 71045 (duration: 00m 13s)
* 18:35 logmsgbot: ori Synchronized php-1.24wmf22/extensions/Scribunto/engines/LuaSandbox/Engine.php: Capture traces for bug 71045 (duration: 00m 17s)
* 04:03 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 27 04:03:25 UTC 2014 (duration 3m 24s)
* 02:46 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-27 02:46:53+00:00
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-27 02:25:05+00:00
 
== September 26 ==
* 23:26 mutante: switched noc.wikimedia.org to terbium, behind misc-web
* 22:01 bd808: sudo apache2ctl graceful on logstash100[123] for ldap revert
* 22:00 bd808: running puppet on logstash100[123] to revert ldap change
* 21:56 bd808: sudo apache2ctl graceful on logstash100[123] for ldap change
* 21:35 andrewbogott: gracefulled apache on neon
* 21:21 mutante: graceful'ed apache on neon
* 20:45 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Disable HHVM beta-feature on wikidatawiki (duration: 00m 06s)
* 19:58 awight: update CRM from 25159fcfc29921b08de86f12121fb292139be09d to  3e42bac8cb7f58f5e504946f4944c69ca5553e60
* 19:42 mutante: removing root's public_html from fenari - backup kept just in case
* 19:15 AaronS: Deployed security patches to CentralAuth
* 19:09 Krinkle: git-deploy: Deploying integration/slave-scripts 08147c42ea42e1a5eca1d29
* 19:08 logmsgbot: aaron Synchronized php-1.25wmf1/extensions/CentralAuth: (no message) (duration: 00m 07s)
* 19:06 logmsgbot: aaron Synchronized php-1.24wmf22/extensions/CentralAuth: (no message) (duration: 00m 08s)
* 18:15 Nemo_bis: untruncated: andrewbogott> ldap is broken for gerrit, should be working elsewhere
* 18:14 legoktm: ldap is broken
* 18:09 K4-713: re-enabled donations queue consumer
* 17:50 awight: CRM queue consumer disabled
* 17:43 andrewbogott: upgraded libgnutls26 on ytterbium
* 17:35 andrewbogott: "git reset --hard origin" to remove that terrible hotfix on palladium and strontium.
* 17:28 awight: CRM jobs reenabled
* 17:22 hoo: Manually ran rebuildEntityPerPage.php for Wikidata
* 17:16 andrewbogott: hotfixing /var/lib/git/operations/puppet in hopes of fixing gerrit so I don't have to hotfix no more
* 17:08 awight: updated crm from 06c9546f9b68f6ecbaaf510944418aa52f9ed0fb to 25159fcfc29921b08de86f12121fb292139be09d
* 17:02 awight: disabling CRM jobs for deployment...
* 15:29 andrewbogott: puppet is now moving all labs instances to new ldap servers:  ldap-eqiad and ldap-codfw
* 15:02 cscott: documented what I'm going to do to clear the OCG queues at https://wikitech.wikimedia.org/wiki/OCG#Pruning_the_queue
* 14:36 bblack: address for ns1 switched in our local dns data - https://gerrit.wikimedia.org/r/163164
* 13:57 hoo: Manually declared the global rename Secretary -> VlsergeyBot done after it twice timed out on page moves on ruwiki
* 13:39 akosiaris: moved mathoid to low-traffic lvs servers@eqiad
* 12:48 cscott: cleared OCG caches again when I woke up to buy me more time to investigate the issue properly.
* 08:44 awight: rollback: revision for civicrm locked to  06c9546f9b68f6ecbaaf510944418aa52f9ed0fb
* 08:30 _joe_: updated hhvm on mw1053, kicked the jr a couple of times, working again now
* 08:29 awight: large_donation schema migration 7000
* 08:28 awight: skip over wmf_civicrm schema migration 7022 -- *why* did I make that unsafe
* 08:24 awight: fundraising_code_update: revision for civicrm changed  from  06c9546f9b68f6ecbaaf510944418aa52f9ed0fb to 5aca00fd4573f0fe8f385baa7238172f6ae54438
* 08:19 awight: disabling CRM jobs during deployment
* 08:09 cscott: cleared OCG queues and cache to quiet icinga; will try to get to the root cause tomorrow.
* 07:41 hashar: Updated our Jenkins Job Builder fork 2d74b16..686265a
* 07:06 logmsgbot: ori Synchronized php-1.24wmf22/extensions/WikimediaEvents: Update WikimediaEvents for 791e14cfc1d (duration: 00m 05s)
* 06:53 logmsgbot: ori Synchronized php-1.25wmf1/extensions/WikimediaEvents: Update WikimediaEvents for 0e087daea5 (duration: 00m 07s)
* 06:41 cscott: updated OCG to version f3a6c1cbba118d4a5e1aa019937dc50159fc823d
* 04:43 _joe_: updating bash, USN-2363
* 04:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 26 04:10:12 UTC 2014 (duration 10m 11s)
* 03:09 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-26 03:09:47+00:00
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-26 02:36:45+00:00
* 00:14 awight: turning off Civi jobs before deployment
 
== September 25 ==
* 23:31 logmsgbot: maxsem Synchronized php-1.25wmf1/skins/Vector/: https://gerrit.wikimedia.org/r/#/c/163021/ (duration: 00m 03s)
* 23:15 logmsgbot: maxsem Synchronized php-1.25wmf1/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/162971/ (duration: 00m 04s)
* 23:12 logmsgbot: maxsem Synchronized php-1.25wmf1/includes/resourceloader/ResourceLoaderSiteModule.php: https://gerrit.wikimedia.org/r/#/c/163024/ (duration: 00m 03s)
* 23:10 logmsgbot: maxsem Synchronized php-1.25wmf1/includes/api/ApiQueryAllUsers.php: https://gerrit.wikimedia.org/r/#/c/163027/ (duration: 00m 03s)
* 23:08 logmsgbot: maxsem Synchronized php-1.24wmf22/includes/api/ApiQueryAllUsers.php: https://gerrit.wikimedia.org/r/#/c/163026/ (duration: 00m 03s)
* 23:02 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/163048 (duration: 00m 03s)
* 22:58 logmsgbot: ori Synchronized php-1.24wmf22/extensions/Wikidata: Update Wikidata for I0acd2096d21b (duration: 00m 11s)
* 21:41 mutante: powercycling mw1053
* 20:36 mutante: no !log
* 20:36 legoktm: manually migrated "NickK" to a global account
* 20:29 mutante: repooled mw1051
* 19:49 bd808: Restarted logstash on logstash1001. udp2log events were not being recorded.
* 19:30 logmsgbot: reedy Synchronized php-1.25wmf1/: (no message) (duration: 00m 46s)
* 19:24 logmsgbot: reedy Synchronized php-1.24wmf22/resources/src/mediawiki.ui/components/buttons.less: (no message) (duration: 00m 14s)
* 19:22 bblack: ntp work done on hosts
* 19:18 logmsgbot: reedy Synchronized php-1.25wmf1/: (no message) (duration: 00m 55s)
* 19:17 logmsgbot: reedy Synchronized php-1.24wmf22/extensions/CentralAuth/: (no message) (duration: 00m 14s)
* 18:55 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:47 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.25wmf1
* 18:08 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf22
* 17:20 logmsgbot: reedy Finished scap: testwiki to 1.25wmf1 and build l10n cache (duration: 28m 36s)
* 16:52 logmsgbot: reedy Started scap: testwiki to 1.25wmf1 and build l10n cache
* 16:41 Reedy: Purged php-1.24wmf9
* 16:38 logmsgbot: reedy Purged l10n cache for 1.24wmf20
* 15:31 bblack: testing ntpd changes on acamar, achernar, chromium, hydrogen, nescio, and baham (puppet-agent disabled)
* 15:19 logmsgbot: mattflaschen Synchronized wmf-config/CommonSettings.php: Extend GettingStarted bucketing period end date to Sept. 28 (duration: 00m 07s)
* 12:36 godog: update bash on elastic1014 analytics1021 elastic1013
* 11:33 _joe_: gracefully reloaded apache on mw1139 and mw1199, apc issues
* 11:29 logmsgbot: aude Synchronized php-1.24wmf22/extensions/Wikidata/extensions/Wikibase/lib/config/WikibaseLib.default.php: fix apc issues (duration: 00m 06s)
* 11:03 _joe_: updated bash on elastic1007
* 10:57 godog: upgraded bash on labsdb1003
* 10:31 Nemo_bis: SAL is here
* 09:22 godog: graphite temporarily down, fix incoming
* 06:13 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062 (duration: 00m 07s)
* 03:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 25 03:58:02 UTC 2014 (duration 58m 1s)
* 03:02 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-25 03:02:46+00:00
* 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-25 02:32:56+00:00
* 02:08 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Use Debian-packaged texvc on Trusty app servers (duration: 00m 04s)
* 01:39 ori: gracefuling apaches
* 00:55 mutante: icinga - manually deleted duplicate host labs-ns1 to fix icinga config and reloads
 
== September 24 ==
* 23:21 ejegg: Updated paymentswiki from 3ac5dd1c3fade37b6f3a4879aef8ea71b3bbbf08 to 83464deed3b66da655ca5d1086852237c4793b71
* 23:17 logmsgbot: catrope Synchronized php-1.24wmf22/extensions/VisualEditor: SWAT (duration: 00m 04s)
* 23:14 logmsgbot: catrope Synchronized php-1.24wmf22/resources/lib/oojs-ui/: SWAT (duration: 00m 05s)
* 23:12 greg-g: restarted jouncebot, he wasn't announcing deploy windows
* 23:00 mutante: OCG - scheduled downtime/disabled notifications for LVS check
* 22:44 andrewbogott: salted a bash update on labs instances, which turned out to be updated already.
* 22:09 cscott: icinga VS HTTP IPv4 on ocg.svc.eqiad.wmnet test is most likely due to `du -s` of a 6G cache directory, not critical.  timeouts can be increased to quiet it.  i will look into adding a -quick parameter or some such tomorrow to make the health check faster.
* 20:56 cscott: updated OCG to version 48acb8a2031863e35fad9960e48af60a3618def9
* 20:43 logmsgbot: aaron Synchronized php-1.24wmf22/includes/cache/bloom: ad8a7a761d5f3bd086bbd6c88870e83c701e59e3 (duration: 00m 04s)
* 20:00 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 19:47 logmsgbot: yurik Synchronized php-1.24wmf22/extensions/ZeroBanner/: Updating to master (duration: 01m 10s)
* 19:46 logmsgbot: yurik Synchronized php-1.24wmf21/extensions/ZeroBanner/: Updating to master (duration: 01m 07s)
* 19:14 logmsgbot: yurik Finished scap: updating Graph, JsonConfig, ZeroBanner & ZeroPortal to master for 21 & 22 (duration: 07m 46s)
* 19:07 logmsgbot: yurik Started scap: updating Graph, JsonConfig, ZeroBanner & ZeroPortal to master for 21 & 22
* 18:55 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: Updating interwiki cache (duration: 00m 14s)
* 18:53 logmsgbot: reedy Synchronized php-1.24wmf22/extensions/WikimediaMaintenance: (no message) (duration: 00m 14s)
* 17:13 manybubbles: lowered the throttle on Elasticsearch index transfer speed from one node to another because I hate excitement
* 15:38 Nemo_bis: cscott> i'm working on the OCG health issue above.  i'll let you know when i know what's going on. icinga-wm> PROBLEM - OCG health on ocg1002 is CRITICAL
* 15:37 logmsgbot: demon Synchronized php-1.24wmf22/extensions/CentralAuth: (no message) (duration: 00m 05s)
* 15:21 logmsgbot: demon Synchronized php-1.24wmf22/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php: (no message) (duration: 00m 05s)
* 15:01 logmsgbot: demon Synchronized wmf-config/Wikibase.php: (no message) (duration: 00m 06s)
* 14:57 Jeff_Green: restarted service ocg on ocg1001
* 14:40 manybubbles: finished deployment - load spikes look to be gone.  yay
* 14:22 logmsgbot: manybubbles Synchronized php-1.24wmf21/extensions/CirrusSearch/: Switch implementation of Cirrus link counting jobs to hopefully lower overall load. (duration: 00m 04s)
* 14:21 logmsgbot: manybubbles Synchronized wmf-config: More cirrus config to lower load (duration: 00m 04s)
* 14:17 logmsgbot: manybubbles Synchronized wmf-config: Cirrus config to lower load (duration: 00m 04s)
* 14:14 logmsgbot: manybubbles Synchronized php-1.24wmf22/extensions/CirrusSearch/: Switch implementation of Cirrus link counting jobs to hopefully lower overall load. (duration: 00m 06s)
* 14:08 manybubbles: starting deployment to lower cirrus load spikes
* 13:19 manybubbles: *disabled*
* 13:17 manybubbles: disable row awareness on Cirrus's elasticsearch cluster - might help balance load better.  too much load was on  one row
* 13:04 hashar: Zuul processing queue again
* 13:00 hashar: Jenkins: disconnecting Gearman client from Zuul and reconnecting
* 12:59 hashar: Zuul / Jenkins stuck
* 09:33 hashar_: Jenkins switched mwext-UploadWizard-qunit back to Zuul cloner by applying pending change {{gerrit|161459}}
* 09:33 hashar_: restarting zuul-merger
* 09:32 hashar_: restarting zuul
* 09:19 hashar_: Upgrading Zuul to f0e3688  Cherry pick https://review.openstack.org/#/c/123437/1 which fixes {{bug|71133}} ''Zuul cloner: fails on extension jobs against a wmf branch''
* 05:41 legoktm: ran script to back populate bug 70620 on metawiki (/home/legoktm/ca/populateBug70620.php on terbium)
* 04:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 24 04:29:53 UTC 2014 (duration 29m 52s)
* 03:34 logmsgbot: tstarling Finished scap: (no message) (duration: 12m 09s)
* 03:22 logmsgbot: tstarling Started scap: (no message)
* 03:21 logmsgbot: tstarling scap failed: RuntimeError scap requires SSH agent forwarding (duration: 00m 00s)
* 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-24 03:12:54+00:00
* 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-24 02:39:39+00:00
* 02:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1062 (duration: 00m 06s)
* 01:25 mutante: tridge - shutting down
 
== September 23 ==
* 23:47 logmsgbot: maxsem Synchronized php-1.24wmf22/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
* 23:15 logmsgbot: maxsem Synchronized wmf-config/CommonSettings.php: fail! (duration: 00m 04s)
* 23:12 logmsgbot: maxsem Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/162297/ (duration: 00m 03s)
* 23:06 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/MassMessage/: https://gerrit.wikimedia.org/r/#/c/161002/ (duration: 00m 03s)
* 22:04 logmsgbot: aaron Synchronized php-1.24wmf22/includes/jobqueue/JobRunner.php: f23f1ad35f02f6a17c9b5842aa6d8c152a273639 (duration: 00m 04s)
* 21:54 logmsgbot: ebernhardson Finished scap: Bump flow submodule (and change an i18n message) in 1.24wmf21 and 1.24wmf22 (duration: 28m 14s)
* 21:25 logmsgbot: ebernhardson Started scap: Bump flow submodule (and change an i18n message) in 1.24wmf21 and 1.24wmf22
* 20:24 cscott: updated OCG to version 1cf9281ec3e01d6cbb27053de9f2423582fcc156
* 19:38 mutante: stopped etherpad, added repairPad.js, attempted repair of pad 'WRN201409', started etherpad
* 18:30 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:28 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 16s)
* 18:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf22
* 16:59 logmsgbot: aaron Synchronized wmf-config/jobqueue-eqiad.php: Removed redundant config due to new job runner (duration: 00m 05s)
* 16:29 _joe_: manually created /srv/mediawiki bind mount on searchidx1001; moved old contents to /a/mediawiki-stale, to avoid filling the disk
* 15:33 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT remove C and MW namespace aliases from ckbwiki (duration: 00m 07s)
* 15:24 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT add *.beeldbank.cultureelerfgoed.nl to upload list (duration: 00m 04s)
* 15:16 logmsgbot: manybubbles Synchronized php-1.24wmf21/extensions/CirrusSearch/: SWAT update Cirrus for better error handling (duration: 00m 04s)
* 15:08 logmsgbot: manybubbles Synchronized php-1.24wmf22/extensions/CirrusSearch/: SWAT deploy cirrus backports (duration: 00m 05s)
* 13:48 akosiaris: change url-downloader ip to point to the new one
* 13:01 logmsgbot: manybubbles Synchronized wmf-config/: Throttle cirrus jobs some more. (duration: 00m 04s)
* 12:24 logmsgbot: manybubbles Synchronized wmf-config/: Some new cirrus config (duration: 00m 07s)
* 09:16 godog: deployed codfw-prod swift ring to palladium
* 04:49 logmsgbot: tstarling Synchronized php-1.24wmf21/languages/Language.php: profiling (duration: 00m 05s)
* 04:10 logmsgbot: tstarling Synchronized php-1.24wmf21/languages/Language.php: profiling (duration: 00m 05s)
* 03:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 23 03:42:29 UTC 2014 (duration 42m 28s)
* 03:29 logmsgbot: tstarling Synchronized wmf-config/CommonSettings.php: fix profiling (duration: 00m 07s)
* 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-23 02:43:48+00:00
* 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-23 02:30:38+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-23 02:17:31+00:00
* 00:26 mutante: tridge - revoking puppet cert, deleting salt key, decom ...
 
== September 22 ==
* 23:49 logmsgbot: ebernhardson Synchronized php-1.24wmf22/extensions/LiquidThreads/: Bump LiquidThreads submodule in 1.24wmf22 (duration: 00m 06s)
* 23:48 logmsgbot: ebernhardson Synchronized php-1.24wmf22/extensions/UploadWizard/: Bump UploadWizard submodule in 1.24wmf22 (duration: 00m 04s)
* 23:46 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/LiquidThreads/: Bump LQT submodule in 1.24wmf21 (duration: 00m 04s)
* 23:35 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/UploadWizard/: sync UploadWizard in 1.24wmf21 (duration: 00m 07s)
* 23:32 logmsgbot: ebernhardson Synchronized php-1.24wmf21/includes/rcfeed/MachineReadableRCFeedFormatter.php: Use safe attribute accessor for RecentChange (duration: 00m 04s)
* 23:30 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/UploadWizard/: Bump UploadWizard submodule in php-1.24wmf21 (duration: 00m 04s)
* 23:30 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/Flow/: Bump flow submodule in php-1.24wmf21 (duration: 00m 06s)
* 23:17 logmsgbot: ebernhardson Synchronized wmf-config/InitialiseSettings.php: Set wgUploadNavigationUrl for eowiki (duration: 00m 05s)
* 21:04 bd808: production-logstash-eqiad healed by restarting elasticsearch on logstash1002 after OOM + split brain
* 20:54 bd808: split brain on logstash1002 preceded by Java OOM for elasticsearch
* 20:52 bd808: logstash1002 went split brain from rest of logstash elastic search cluster. restarting
* 20:24 subbu: deployed Parsoid ff9476f9
* 19:31 hashar: Jenkins is broken for extensions patches proposed against the wmf branches {{bug|71133}}
* 18:32 Krinkle: lanthanum tmpfs filled up again, purged manually (bug 71128)
* 17:22 ori: updated HHVM on beta cluster to 3.3.0-20140918+wmf1
* 17:00 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Push Cirrus' non-content enwiki shards apart (no-op) (duration: 00m 04s)
* 15:52 godog: reboot ms-be2001 into PXE to test a re-install
* 15:07 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Graph extension on mediawiki.org [[gerrit:161908]] (duration: 00m 09s)
* 15:02 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add securepoll-create-poll right to sysop on testwiki [[gerrit:161653]] (duration: 00m 09s)
* 15:01 logmsgbot: anomie Synchronized wmf-config/CommonSettings.php: SWAT: Add REL1_24 as branch in ExtensionDistributor [[gerrit:161666]] (duration: 00m 10s)
* 14:12 hashar: Jenkins deleted job mediawiki-core-lint , replaced by mediawiki-core-phplint
* 12:10 apergos: shutdown of db1050 to install trusty
* 10:04 hashar: Jenkins back and fully operational
* 09:55 hashar: restarting jenkins
* 09:37 hashar_: Jenkins: deleting old mediawiki extensions jobs (<tt>rm -fR /var/lib/jenkins/jobs/*testextensions-master</tt>).  They are no longer triggered and are superseded by the <tt>*-testextension</tt> jobs.
* 03:36 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep 22 03:36:40 UTC 2014 (duration 36m 39s)
* 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-22 02:41:29+00:00
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-22 02:29:09+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-22 02:16:20+00:00
 
== September 21 ==
* 22:43 ori: ms-be1008 overloaded starting 18:00:24 UTC, syslog says "BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:2196]". machine became unresponsive at 21:35, coinciding with a spike of 5xxs, lasting until Coren powercycled it at 22:10.
* 03:37 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep 21 03:37:31 UTC 2014 (duration 37m 30s)
* 03:16 springle: labsdb1001 mysqld restarted in gdb; crash loop with a labs user's table
* 02:46 logmsgbot: ori Synchronized wmf-config/throttle.php: I7bb42b49a: Increase account creation throttle on enwiki for Cochrane colloquium. (duration: 00m 07s)
* 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-21 02:41:36+00:00
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-21 02:29:51+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-21 02:16:56+00:00
 
== September 20 ==
* 22:28 Krinkle: Reloading Zuul to deploy I0170766cfc06b8e6
* 20:30 andrewbogott: rebooting virt1006 to make good and sure it doesn't spontaneously re-enter the compute pool
* 20:29 andrewbogott_afk: moved all VMs off of virt1006, disabled compute service
* 03:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 20 03:46:00 UTC 2014 (duration 45m 59s)
* 02:46 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-20 02:46:05+00:00
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-20 02:33:34+00:00
* 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-20 02:19:34+00:00
 
== September 19 ==
* 22:16 RoanKattouw: Restarting Jenkins
* 21:57 logmsgbot: spage Synchronized php-1.24wmf21/extensions/Flow/modules/new/components/flow-board.js: Flow bug 71054 backport (duration: 00m 04s)
* 20:50 ori: restarted HHVM and cleared bytecode cache on all HHVM app servers
* 20:47 _joe_: restarted hhvm on mw1018, cleaning the cache as well
* 20:25 ori: Deployed Ic71064e08 (type hint fix for Wikidata) to wmf21/22.
* 19:09 bblack: restarted hhvm on mw1021
* 18:59 _joe_: rolling restart of hhvm servers
* 18:22 bblack: restarting hhvm on mw1020 (again!)
* 18:19 hashar: Jenkins: reverting job mwext-VisualEditor-qunit to previous state (i.e. without Zuul cloner)
* 18:17 bblack: restarting hhvm on mw1020
* 17:57 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I3e1bd5e4bb: Don't manipulate the environment to determine TZ offset (Bug: 71036) (duration: 00m 13s)
* 17:30 bblack: turned down apache prefork procs on fenari to reduce swapping
* 17:16 ottomata: initiating controlled shutdown of kafka broker analytics1021 to test some kafkatee weirdness, as well as a potential kafka/zookeeper bug
* 17:07 bblack: restarting apache on fenari
* 16:21 bblack: restarted hhvm on mw1019 + 1021
* 14:57 hashar: Jenkins friday deploy: migrate all MediaWiki extension qunit jobs to Zuul cloner.
* 14:37 akosiaris: initiated rsync of tridge data that is to be kept to nas1001-a
* 13:56 springle: killing any sleeping connection on enwiki db slaves to make room
* 13:56 mark: Stopped jobrunners on mw1001-1003
* 12:36 springle: temporarily disable log fsync on enwiki slaves
* 12:14 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1072 with ReadAheadNone (duration: 00m 09s)
* 11:32 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1072. seems more susceptible to replag; find out why. (duration: 00m 10s)
* 09:14 _joe_: restarted hhvm on mw1053, stuck at 100% cpu since last restart (activating stats)
* 05:01 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 19 05:01:54 UTC 2014 (duration 1m 52s)
* 03:45 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-19 03:45:33+00:00
* 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-19 03:11:43+00:00
* 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-19 02:38:25+00:00
* 00:43 cscott: updated OCG to version ce16f7adb60d7c77409e2e11ba0e5d6cce6955d5
 
== September 18 ==
* 23:55 logmsgbot: ori Started scap: Add HHVM as a beta feature
* 23:54 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I2466f6b6e: Add HHVM to beta feature whitelist (duration: 00m 08s)
* 23:52 logmsgbot: ori Synchronized php-1.24wmf22/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 06s)
* 23:51 logmsgbot: ori Synchronized php-1.24wmf21/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 06s)
* 23:25 logmsgbot: catrope Synchronized php-1.24wmf22/resources/lib/oojs-ui/: oojs-ui bugfixes (duration: 00m 06s)
* 23:13 logmsgbot: catrope Synchronized php-1.24wmf22/extensions/VisualEditor/: SWAT (duration: 00m 08s)
* 23:04 logmsgbot: catrope Synchronized php-1.24wmf21/extensions/UploadWizard/: SWAT (duration: 00m 08s)
* 19:57 Jeff_Green: iridium.wm.o exim conf checked, puppet reenabled
* 19:54 Jeff_Green: magnesium.wm.o exim conf checked, puppet reenabled
* 19:50 Jeff_Green: sodium.wm.o exim conf checked, puppet reenabled
* 19:48 logmsgbot: reedy Synchronized php-1.24wmf22/extensions/Flow/: (no message) (duration: 00m 16s)
* 19:45 Jeff_Green: iodine.wm.o exim conf checked, puppet reenabled
* 19:44 Jeff_Green: polonium.wm.o exim conf checked, puppet reenabled
* 19:35 Jeff_Green: lead.wm.o exim conf checked, puppet reenabled
* 19:22 logmsgbot: reedy Synchronized php-1.24wmf22: (no message) (duration: 00m 57s)
* 19:16 Jeff_Green: disabling puppet on polonium, lead, sodium, iridium, magnesium, and iodine to monitor rollout of https://gerrit.wikimedia.org/r/155753
* 19:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: rest of group0 to 1.24wmf22
* 19:01 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf21
* 18:59 bblack: restarting apache on fenari
* 18:49 logmsgbot: reedy Finished scap: testwiki to 1.24wmf22 and build l10n cache (duration: 30m 23s)
* 18:44 Jeff_Green: testing exim configuration change on lead.wm.o
* 18:18 logmsgbot: reedy Started scap: testwiki to 1.24wmf22 and build l10n cache
* 17:49 logmsgbot: reedy Started scap: testwiki to 1.24wmf22 and build l10n cache
* 17:08 cmjohnson1: replacing failed disk es1005
* 17:05 logmsgbot: yurik Finished scap: (no message) (duration: 23m 26s)
* 16:43 yurikR: yurik scaping zero - partner needs an l10n message asap
* 16:42 logmsgbot: yurik Started scap: (no message)
* 15:38 hashar: restarting Zuul just to be safe
* 15:06 logmsgbot: anomie Synchronized php-1.24wmf21/resources/src/mediawiki.action/mediawiki.action.view.redirectPage.css: SWAT: mediawiki.action.view.redirectPage: Correct a CSS selector [[gerrit:161239]] (duration: 00m 23s)
* 15:01 logmsgbot: anomie Synchronized php-1.24wmf21/extensions/Wikidata/: SWAT: Update Wikidata to fix broken xml api output [[gerrit:161232]] (duration: 00m 38s)
* 11:40 apergos: forgot to log this earlier: manually started salt minion on radon, elastic1015, searchidx1001, it wasn't running there
* 09:00 godog: updated authdns to 0c2225d
* 08:56 springle: xtrabackup clone db1016 to db2010
* 07:48 godog: re-enabled icinga notifications for ms-be1001
* 07:09 bblack: removing pybal cfg "eqiad/misc_web_https" (unused now, https://gerrit.wikimedia.org/r/161183)
* 06:53 bblack: removing pybal cfg "esams/wikimedialbsecure" (unused, points at maerlant)
* 06:47 bblack: removing pybal symlink "$site/ipv6", also unused (old ipv6 protoproxying)
* 06:45 bblack: removing pybal symlink "$site/text-varnish", seems to be a remnant no longer in use
* 04:20 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 18 04:20:56 UTC 2014 (duration 20m 55s)
* 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-18 03:09:44+00:00
* 02:53 logmsgbot: yurik Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 53s)
* 02:52 yurikR: yurik Fixing graph ext namespace name - otherwise get screen of WMF death on graph: ns visits
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-18 02:36:46+00:00
* 00:32 logmsgbot: marktraceur Finished scap: [SWAT] Move things out of assets/ and into resources/assets/ (duration: 35m 28s)
 
== September 17 ==
* 23:57 logmsgbot: marktraceur Started scap: [SWAT] Move things out of assets/ and into resources/assets/
* 23:47 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: [SWAT] Enable Graph on metawiki and labswiki (duration: 00m 10s)
* 23:42 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/Graph/: [SWAT] Update Graph to master (duration: 00m 08s)
* 23:41 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/Graph/: [SWAT] Update Graph to master (duration: 00m 07s)
* 23:35 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/MultimediaViewer/: [SWAT] Fix reuse dropdown message weirdness (duration: 00m 07s)
* 23:29 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/MultimediaViewer/: [SWAT] Fix reuse dropdown message weirdness (duration: 00m 08s)
* 23:10 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/UploadWizard/: [SWAT] Fix EventLogging schema declarations for UploadWizard (duration: 00m 11s)
* 21:41 mutante: fixing updates on planet feeds - file permissions
* 21:11 manybubbles: restarting the rebuild of cirrus's enwiki index now that I've found the reason it wasn't working before - the new index was putting too many shards on an already full node and overwhelming it.  silly allocation algorithm!  that's a bad idea!
* 21:07 logmsgbot: yurik Synchronized php-1.24wmf21/extensions/ZeroPortal/: (no message) (duration: 01m 05s)
* 20:19 godog: rebooting ms-be1006
* 19:00 Krinkle: jenkins-slave tmpfs on lanthanum was filling up (> 500MB). I purged tmp dbs for old jobs. We should get these purged automatically and also increase the size as 500MB is too little.
* 18:59 robh: disabled icinga alerts for ms-be1001, rebooting it to look at its raid bios settings for codfw deployment mirroring
* 18:47 logmsgbot: yurik Synchronized php-1.24wmf20/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 39s)
* 18:43 logmsgbot: yurik Synchronized php-1.24wmf21/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 35s)
* 18:40 logmsgbot: yurik Synchronized wmf-config/: private wikis login/logout page names, zeroportal impersonator acct (duration: 01m 06s)
* 18:23 mutante: phabricator - made aklapper an admin
* 17:26 logmsgbot: andrew rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
* 17:23 logmsgbot: andrew Synchronized wikiversions.json: (no message) (duration: 00m 05s)
* 17:04 manybubbles: cirrus brownout looks just about fixed.  So!  My plan for periodically explicitly merging deletes has some problems.....
* 16:42 gwicke: restarted parsoid on wtp102{2,3,4}
* 16:31 manybubbles: just going to make this clear - the current cirrus brownout doesn't seem to be affecting my queries but we're getting hit with pool counter full events - sadness.  it's not caused by switching cirrus to ruwiki's primary backend - it's caused by me attempting to perform index maintenance activities.
* 16:23 akosiaris: restarted node on wtp boxes except wtp1022,wtp1023,wtp1024
* 16:23 manybubbles: caused cirrus brownout by executing a force merge for enwiki's general index.  ooops
* 16:06 logmsgbot: manybubbles Synchronized wmf-config/: set cirrus as primary search backend for ruwiki and make permanent some settings set on the fly (duration: 00m 06s)
* 15:57 manybubbles: manually pushed apart ruwiki and nlwiki's shards as well - might help - updated commit to reflect that
* 15:42 manybubbles: gerrit change to lock that into place is https://gerrit.wikimedia.org/r/#/c/160974/ and I'll deploy it in my window in 15 minutes.
* 15:41 manybubbles: manually forcing the shards of Cirrus's commonswiki file index apart from one another in an attempt to lower the consistently high load on elastic1013
* 15:34 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Set wgMetaNamespace for labswiki (duration: 00m 14s)
* 14:54 springle: db1062 out of action for bug hunt https://mariadb.atlassian.net/browse/MDEV-6751
* 14:48 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: (no message) (duration: 00m 16s)
* 14:45 godog: restarted apache2 on magnesium, validate removal of ssl certs
* 13:38 hashar: Zuul upgraded successfully apparently.
* 13:33 hashar: stopping zuul for upgrade
* 13:29 hashar: upgrading Zuul to 2.0.0.286.gb1811ab
* 12:20 hashar: upgrading jenkins 1.565.1 -> 1.565.2
* 09:53 akosiaris: stopped apache2 on fenari, it was leaking memory, puppet restarted it, need to kill this machine ASAP
* 09:52 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s1 db1061 (duration: 00m 08s)
* 06:55 springle: xtrabackup clone db1061 to db2016
* 06:52 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool s1 db1061 for codfw cloning (duration: 00m 07s)
* 06:27 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s7 db1039 (duration: 00m 08s)
* 04:34 logmsgbot: tstarling Synchronized docroot/bits: (no message) (duration: 00m 10s)
* 04:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 17 04:32:17 UTC 2014 (duration 32m 16s)
* 03:17 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-17 03:17:38+00:00
* 03:07 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s6 db1015 (duration: 01m 41s)
* 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-17 02:43:02+00:00
* 02:21 springle: xtrabackup clone db1048 to db2012
* 02:15 springle: xtrabackup clone db1046 to db2011
* 02:00 springle: xtrabackup clone db1016 to db2010
* 01:54 springle: xtrabackup clone db1031 to db2009
* 01:33 springle: xtrabackup clone db1039 to db2029
* 01:33 springle: xtrabackup clone db1015 to db2028
* 01:29 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool s6 db1015 and s7 db1039 (duration: 00m 20s)
* 01:15 Reedy: updateCollation on shwiki done
* 00:59 Reedy: running `mwscript updateCollation.php --wiki=shwiki --previous-collation=uppercase` in screen on tin
* 00:58 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: shwiki collation (duration: 00m 16s)
* 00:53 Reedy: updateCollation on etwiki done
* 00:52 Reedy: updateCollation on etwiktionary done
* 00:48 Reedy: running `mwscript updateCollation.php --wiki=etwiktionary --previous-collation=uppercase` in screen on tin
* 00:47 Reedy: etwikisource collation updated (9918 rows)
* 00:47 Reedy: etwikiquote collation updated (706 rows)
* 00:46 Reedy: etwikimedia collation updated (121 rows)
* 00:46 Reedy: etwikibooks collation updated (280 rows)
* 00:45 Reedy: running `mwscript updateCollation.php --wiki=etwiki --previous-collation=uppercase` in screen on tin
* 00:45 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: et collations (duration: 00m 15s)
* 00:43 Reedy: updateCollation on frwikiversity done
* 00:42 Reedy: running `mwscript updateCollation.php --wiki=frwikiversity --previous-collation=uppercase` in screen on tin
* 00:42 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: frwikiversity collation (duration: 00m 17s)
* 00:40 Reedy: updateCollation on skwiki done
* 00:26 Reedy: Running `mwscript updateCollation.php --wiki=skwiki --previous-collation=uppercase` in screen on tin
* 00:25 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: skwiki collation (duration: 00m 15s)
* 00:18 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 15s)
 
== September 16 ==
* 23:22 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/VisualEditor/: (no message) (duration: 00m 04s)
* 23:21 logmsgbot: maxsem Synchronized php-1.24wmf20/extensions/VisualEditor/: (no message) (duration: 00m 04s)
* 23:16 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/Wikidata: (no message) (duration: 00m 24s)
* 23:15 MaxSem: Wikidata submodule in wmf21 was in the middle of rebase - reset and updating to a newer submodule commit
* 23:12 logmsgbot: maxsem Synchronized php-1.24wmf20/extensions/Wikidata: (no message) (duration: 00m 17s)
* 23:07 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/GettingStarted: https://gerrit.wikimedia.org/r/#/c/160084/ (duration: 00m 08s)
* 21:36 Jeff_Green: SPF record deployed for donate.wikimedia.org
* 21:01 logmsgbot: ejegg Synchronized php-1.24wmf20/extensions/CentralNotice/modules/ext.centralNotice.bannerController/bannerController.js: (no message) (duration: 00m 06s)
* 19:38 csteipp: deployed patches for bugs 70469 and 70672
* 19:17 logmsgbot: catrope Synchronized php-1.24wmf21/extensions/VisualEditor/: Revert IE hacks so Firefox will stop corrupting non-Latin characters (duration: 00m 06s)
* 19:15 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/VisualEditor/: (no message) (duration: 00m 09s)
* 18:32 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 18:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf21
* 17:03 logmsgbot: bd808 Finished scap: No code change scap to test scap internal update (duration: 18m 06s)
* 16:45 logmsgbot: bd808 Started scap: No code change scap to test scap internal update
* 16:43 bd808|deploy: Updated scap to 663f137 (Check php syntax with parallel `php -l`)
* 16:42 bd808|deploy: Trebuchet sync for scap reporting failure from osmium.eqiad.wmnet, mw1053.eqiad.wmnet, searchidx1001.eqiad.wmnet, fenari.wikimedia.org, and mw1110.eqiad.wmnet
* 16:41 bd808|deploy: Trebuchet update for scap reporting failure from osmium.eqiad.wmnet, searchidx1001.eqiad.wmnet, fenari.wikimedia.org and mw1110.eqiad.wmnet
* 16:00 _joe_: mw1018 and mw1021 in the hhvm appservers pool
* 15:35 logmsgbot: reedy Synchronized docroot and w: Update symlinks to use /srv/mediawiki (duration: 00m 16s)
* 15:34 hashar: Jenkins: deleting /srv/ssd/jenkins-slave/workspace/*testextensions-master on gallium and lanthanum.
* 15:25 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 03s)
* 15:23 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 19s)
* 15:20 manybubbles: SWAT complete
* 15:16 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/VisualEditor/: swat update for wmf20 (duration: 00m 25s)
* 15:13 hashar: Jenkins: mediawiki extensions phpunit jobs should pass more or less until the CI system is sent into orbit and dies out horribly. in such a case ping me / phone.
* 15:08 logmsgbot: manybubbles Synchronized php-1.24wmf21/extensions/VisualEditor/: SWAT visual editor update wmf21 (duration: 00m 07s)
* 14:52 ottomata: set vm.dirty_expire_centisecs to 10000 (was 30000) on analytics1021 to experiment with paging and kafka-zookeeper timeouts
* 14:36 godog: stopped htcp-purger on ms1004 RT #8358
* 14:32 godog: silenced ms-be1014 until tomorrow, pending forced reboot
* 14:28 hashar: Jenkins: breaking continuous integration for MediaWiki repositories. Extensions are now tested with mediawiki/vendor, and mediawiki/core is checked out to the patch branch if it exists. {{gerrit|160656}}
* 14:20 akosiaris_: restarted apache on fenari , it was leaking memory, situation back to normal, cause unknown yet
* 14:12 akosiaris_: stopped apache on fenari . It was in swap, investigating
* 12:35 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s2 db1054, s3 db1027, s4 db1056, s5 db1037 (duration: 00m 10s)
* 12:26 godog: reboot ms-be1014, xfs issues
* 12:22 godog: temporarily chgrp wikidev /var/log/hhvm/error.log on mw1018
* 12:21 logmsgbot: reedy Synchronized php-1.24wmf20/LocalSettings.php: Fix path to be /srv based (duration: 00m 32s)
* 11:25 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 35s)
* 11:12 logmsgbot: reedy Purged l10n cache for 1.24wmf19
* 11:12 logmsgbot: reedy Purged l10n cache for 1.24wmf18
* 11:10 logmsgbot: reedy Purged l10n cache for 1.24wmf15
* 09:21 _joe_: reimaging mw1018 and mw1021 w HAT: removing from pybal, etc.
* 06:29 springle: xtrabackup clone db1037 to db2023
* 05:31 springle: xtrabackup clone db1056 to db2019
* 04:01 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 16 04:01:05 UTC 2014 (duration 1m 4s)
* 03:11 springle: xtrabackup clone db1027 to db2018
* 03:04 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-16 03:04:46+00:00
* 02:53 springle: xtrabackup clone db1054 to db2017
* 02:50 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool s2 db1054, s3 db1027, s4 db1056, s5 db1037 for codfw cloning (duration: 01m 12s)
* 02:39 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1036, depool db1002 (duration: 00m 07s)
* 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-16 02:31:16+00:00
 
== September 15 ==
* 23:32 logmsgbot: maxsem Synchronized php-1.24wmf21/resources/: SWAT: https://gerrit.wikimedia.org/r/#/c/160488/1 https://gerrit.wikimedia.org/r/#/c/160543/ (duration: 00m 06s)
* 23:26 bblack: restarting lvs1001 for HT disable + kernel upgrade
* 23:19 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/VisualEditor/: SWAT: https://gerrit.wikimedia.org/r/#/c/160554/ (duration: 00m 07s)
* 23:12 bblack: restarting lvs1002 for HT disable + kernel upgrade
* 23:07 Krinkle: Running sample job on integration-slave1006 and warming up npmjs.org cache
* 22:56 Krinkle: Running sample job on integration-slave1008 and warming up npmjs.org cache
* 22:49 Krinkle: Running sample job on integration-slave1007 and warming up npmjs.org cache
* 22:48 Krinkle: Pooling the newly setup Trusty-based Jenkins slaves (integration-slave1006, integration-slave1007 and integration-slave1008)
* 22:42 bblack: dropping static routes for 2620:0:861:ed1a::[d,f,10,11] -> lvs1005 from cr[12]-eqiad (only 11 is of any consequence, misc-web-lb, and they're advertised by bgp and this is preventing failover to lvs1002)
* 21:28 cscott: updated OCG to version 188a3c221d927bd0601ef5e1b0c0f4a9d1cdbd31
* 20:46 subbu: deployed Parsoid version b845bff9
* 18:49 logmsgbot: ejegg Synchronized php-1.24wmf20/extensions/CentralNotice/: Update CentralNotice to remove jquery.json dependency (duration: 00m 23s)
* 18:46 hoo: Sync to tmh100[12] failed, according to awight
* 18:44 logmsgbot: ejegg Synchronized php-1.24wmf21/extensions/CentralNotice/: Update CentralNotice to remove jquery.json dependency (duration: 00m 09s)
* 18:43 manybubbles: performance tests show cirrus should handle jawiki with no problem but if load spirals out of control and I'm not around then revert https://gerrit.wikimedia.org/r/#/c/160465/
* 18:40 hoo: Local part of the global rename of Gnumarcoo => .avgas fatally timed out on itwiki. This needs to be fixed by hand.
* 18:40 manybubbles: Setting Cirrus to jawiki's primary search backend went well but Japan is mostly asleep.  If Elasticsearch load takes a turn for the worse in four or five hours then we'll know how it went.
* 17:14 bd808: Restarted elasticsearch on logstash1003; 2014-09-14T09:33:57Z java.lang.OutOfMemoryError
* 17:09 _joe_: killing salt-call on all mediawiki hosts
* 17:06 bd808: Restarted elasticsearch on logstash1001; 2014-09-15T06:12:09Z java.lang.OutOfMemoryError
* 17:04 bblack: using salt to kill salt-minion everywhere...
* 17:02 bd808: Restarted logstash on logstash1001. I hoped this would fix the dashboards, but it looks like the backing elasticsearch cluster is too sad for them to work at the moment.
* 16:55 bd808: Restarted hung elasticsearch service on logstash1002
* 16:15 manybubbles: jawiki now has cirrus as primary.  we're back to where we were before the great cascading failure of two months ago
* 16:13 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
* 15:29 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/MultimediaViewer/: [SWAT] Several backports for metrics and bugfixes in Media Viewer (duration: 00m 07s)
* 15:27 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/MultimediaViewer/: [SWAT] Several backports for metrics and bugfixes in Media Viewer (duration: 00m 07s)
* 15:18 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/GeoCrumbs/GeoCrumbs.class.php: [SWAT] Handle return value NULL of GeoCrumbs::getParserCache (duration: 00m 07s)
* 15:17 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/GeoCrumbs/GeoCrumbs.class.php: [SWAT] Handle return value NULL of GeoCrumbs::getParserCache (duration: 00m 07s)
* 15:06 logmsgbot: marktraceur Synchronized wmf-config/: [SWAT] Remove 'renameuser' right from bureaucrats on CentralAuth wikis (duration: 00m 09s)
* 14:54 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Bump wikibase memcached key for test.wikidata, test, test2 (duration: 00m 16s)
* 14:54 hashar: Updated Jenkins Job Builder fork:  e5c0c61..2d74b16
* 14:50 logmsgbot: aude Finished scap: Put test.wikidata back on mw1.24-wmf19 extension branch (duration: 37m 27s)
* 14:43 manybubbles: restarting the enwiki cirrus reindex process - it crashed over the weekend.  why you crash and leave error message "1".  "1" is not a useful error message.
* 14:13 logmsgbot: aude Started scap: Put test.wikidata back on mw1.24-wmf19 extension branch
* 13:03 _joe_: fenari is swapping hard, restarting apache, which was eating up all the RAM
* 09:20 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: *.scienceimage.csiro.au to the wgCopyUploadsDomains {{gerrit|159999}} {{bug|70771}} (duration: 00m 06s)
* 09:15 hashar: Jenkins: apt-get upgrade on prod slaves (updates php5 / libc / jdk 7)
* 03:09 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1036 (duration: 00m 09s)
* 02:03 logmsgbot: LocalisationUpdate failed: mwversionsinuse returned empty list
* 01:47 logmsgbot: hoo Synchronized wmf-config/liquidthreads.php: Remove global $path (duration: 00m 07s)
* 01:47 logmsgbot: hoo Synchronized wmf-config/flaggedrevs.php: Remove global $path (duration: 00m 10s)
 
== September 14 ==
* 20:37 ori_: enabling puppet on mw1053
* 20:11 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062, locked up (duration: 00m 09s)
* 13:24 _joe_: stopped puppet and the JR on mw1053
* 12:42 hoo: Ran sync-common on mw1053 to stop "Unrecognized job type 'ChangeNotification'." exceptions
* 11:14 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1005 (duration: 00m 07s)
* 10:37 springle: restart es1005
* 09:56 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1007, depool es1005 (duration: 00m 10s)
* 02:01 logmsgbot: LocalisationUpdate failed: mwversionsinuse returned empty list
* 00:45 ori_: fenari appears to still have twemproxy (in addition to nutcracker); decom'ing.
* 00:29 ori_: restarting apache2 on fenari
 
== September 13 ==
* 04:42 legoktm: global rename for Trevor Parscal (WMF) unstuck itself, yay
* 04:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 13 04:22:04 UTC 2014 (duration 22m 3s)
* 03:51 legoktm: global rename for Trevor Parscal --> Trevor Parscal (WMF) looks stuck on metawiki and mswiki, in queued state for both but showJobs.php says the jobs are active and claimed
* 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-13 03:11:40+00:00
* 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-13 02:38:26+00:00
* 01:45 logmsgbot: ori Synchronized php-1.24wmf21/extensions/Flow: Update flow for I4da934dfe (duration: 00m 06s)
* 01:45 logmsgbot: ori Synchronized php-1.24wmf20/extensions/Flow: Update flow for I4da934dfe (duration: 00m 06s)
* 01:41 logmsgbot: ori Synchronized php-1.24wmf20/extensions/Flow: Update flow for I4da934dfe (duration: 00m 08s)
 
== September 12 ==
* 21:26 csteipp: deployed fixes for bugs 70620, 69008
* 20:37 logmsgbot: mattflaschen Synchronized php-1.24wmf21/extensions/GettingStarted/: Deploy to fix GettingStarted bucketting for users with null registration date (duration: 00m 05s)
* 20:37 logmsgbot: mattflaschen Synchronized php-1.24wmf20/extensions/GettingStarted/: Deploy to fix GettingStarted bucketting for users with null registration date (duration: 00m 07s)
* 19:34 legoktm: running migratePass0.php across all CentralAuth wikis
* 17:43 logmsgbot: ori updated /a/common to {{Gerrit|I4e4187285}}: Rename some constants to clarify their meaning and purpose
* 14:52 manybubbles: rebuilding enwiki's Cirrus index for more performance testing.  Please be faster now.  k?
* 08:37 _joe_: rolling restart of pybal finished. Adding note on Fenari
* 08:19 _joe_: reactivated puppet on all lvs hosts, esams almost done, pending eqiad
* 08:06 _joe_: new pybal conf applied in all of ulsfo
* 07:39 _joe_: changing pybal config place; stopping puppet on all loadbalancers
* 04:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 12 04:27:17 UTC 2014 (duration 27m 16s)
* 03:15 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-12 03:15:57+00:00
* 03:08 logmsgbot: mattflaschen Finished scap: One last CSS fix (wrapping issue for error state) for GettingStarted A/B test (duration: 24m 38s)
* 02:43 logmsgbot: mattflaschen Started scap: One last CSS fix (wrapping issue for error state) for GettingStarted A/B test
* 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-12 02:39:35+00:00
* 01:33 logmsgbot: mattflaschen Synchronized php-1.24wmf21/extensions/GettingStarted/: CSS tweaks for GettingStarted A/B test (duration: 00m 07s)
* 01:32 logmsgbot: mattflaschen Synchronized php-1.24wmf20/extensions/GettingStarted/: CSS tweaks for GettingStarted A/B test (duration: 00m 21s)
* 01:29 logmsgbot: ori Synchronized wmf-config/wikitech.php: Ia5b81076e: Update path reference for /srv/mediawiki (duration: 00m 04s)
* 01:28 logmsgbot: ori updated /a/common to {{Gerrit|Ia5b81076e}}: Update path reference for /srv/mediawiki
* 01:19 ori: manually migrated /u/l/a/common-local to /srv/mediawiki on virt1000
* 00:36 logmsgbot: ori Synchronized php-1.24wmf21/extensions/Wikidata: Update Wikidata to tip of master for I23b7eb54b8e (Bug: 70747) (duration: 00m 08s)
* 00:12 logmsgbot: esanders Synchronized php-1.24wmf21/resources/lib/oojs-ui/: (no message) (duration: 00m 03s)
* 00:12 logmsgbot: esanders Synchronized php-1.24wmf21/extensions/MultimediaViewer/: (no message) (duration: 00m 07s)
* 00:00 logmsgbot: esanders Finished scap: SWAT deploy (duration: 28m 39s)
 
== September 11 ==
* 23:31 logmsgbot: esanders Started scap: SWAT deploy
* 23:29 logmsgbot: mattflaschen Finished scap: Deploy new GettingStarted recommendations A/B test (duration: 99m 34s)
* 23:15 logmsgbot: esanders scap failed: LockFailedError Failed to lock /var/lock/scap: [Errno 11] Resource temporarily unavailable (duration: 00m 00s)
* 23:00 mutante: restarting icinga-wm for config change
* 21:49 logmsgbot: mattflaschen Started scap: Deploy new GettingStarted recommendations A/B test
* 21:14 Krinkle: Stopping/starting zuul
* 21:08 andrewbogott: restarting zuul on gallium
* 20:58 andrewbogott: restarted jenkins, maybe
* 20:56 ori: graceful'd apache on mw1053, missed it earlier
* 20:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I1f3234746: Revert Scribunto: double the Lua CPU limit on the job runners (duration: 00m 05s)
* 20:48 logmsgbot: ori updated /a/common to {{Gerrit|I1f3234746}}: Revert "Scribunto: double the Lua CPU limit on the job runners"
* 20:42 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 20:15 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
* 20:15 andrewbogott: syncing virt1000, again in hopes of moving to wmf20
* 20:08 logmsgbot: reedy Synchronized php-1.24wmf21/extensions/Wikidata/: (no message) (duration: 00m 17s)
* 19:58 Reedy: Running sync-common on mw1024
* 19:52 Reedy: Running manual sync-common on mw1138
* 19:51 logmsgbot: reedy Synchronized wmf-config/: Fix Zero settings (duration: 00m 15s)
* 19:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf21
* 19:44 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf20
* 19:20 mutante: graceful'ed apache on mw1143
* 19:16 Reedy: running sync-common on mw1143
* 19:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
* 19:02 bd808: Restarted elasticsearch on logstash1003 -- Java OOM error in logs and not recovering shards
* 18:54 ori: graceful'd all apaches
* 18:51 ori: graceful'd apache on mw1047, mw1151, mw1137, mw1146 and mw1076
* 18:46 logmsgbot: ori Synchronized php-1.24wmf19/includes/WebStart.php: (no message) (duration: 00m 06s)
* 18:45 logmsgbot: ori Synchronized php-1.24wmf19/includes/profiler/Profiler.php: (no message) (duration: 00m 07s)
* 18:17 logmsgbot: reedy Started scap: testwiki to 1.24wmf21 and build l10n cache take 3
* 18:16 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.Nd45X2RONi" --verbose' returned non-zero exit status 1 (duration: 01m 18s)
* 18:14 logmsgbot: reedy Started scap: testwiki to 1.24wmf21 and build l10n cache
* 18:13 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.IH8przTNHs" ' returned non-zero exit status 1 (duration: 04m 59s)
* 18:08 logmsgbot: reedy Started scap: testwiki to 1.24wmf21 and build l10n cache
* 18:02 manybubbles: raised logging on Elasticsearch cluster temporarily to get more information about merging - a process super important to keeping the index up to date in "real time"
* 17:20 logmsgbot: ori updated /a/common to {{Gerrit|I0bda3deab}}: Replace remaining references to /u/l/a/common
* 17:18 logmsgbot: ori updated /a/common to {{Gerrit|I37b0a8338}}: Get rid of MULTIVER_CDB_DIR_{APACHE,HOME}
* 16:57 andrewbogott: sync-common on virt1000 -- with any luck this will upgrade us to wmf20
* 16:56 logmsgbot: andrew rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
* 16:53 logmsgbot: bd808 Finished scap: Preparing to move wikitech to 1.24wmf20 (second try) (duration: 24m 25s)
* 16:46 andrewbogott: apache graceful on mw1039
* 16:33 bd808|deploy: andrewbogott did apache graceful on mw1120 to stop wikidata APC logspam
* 16:29 logmsgbot: bd808 Started scap: Preparing to move wikitech to 1.24wmf20 (second try)
* 16:22 logmsgbot: andrew Finished scap: Preparing to move wikitech to 1.24wmf20 (duration: 06m 45s)
* 16:19 bd808: Restarted logstash on logstash1001. Log empty and events not being stored in elasticsearch
* 16:15 logmsgbot: andrew Started scap: Preparing to move wikitech to 1.24wmf20
* 15:45 bblack: icinga config is correct now, back to normal puppet updates
* 15:24 bblack: restarted icinga, manually removed some labsy things that were broken in config and temporarily disabled puppet :p
* 14:44 _joe_: php upgrade finished
* 14:23 _joe_: upgrading php across the cluster: libapache2-mod-php5 php5-cli php-pear php5 php5-common php5-curl php5-dev php5-intl php5-mysql php5-xmlrpc
* 13:04 akosiaris: uploaded php5_5.3.10-1ubuntu3.14+wmf1 on apt.wikimedia.org
* 10:00 _joe_: enabled puppet on mw1053
* 09:38 _joe_: gracefulling mw1200 mw1196 and mw1186 as they have APC issues
* 09:21 _joe_: upgrading hhvm and hhvm-luasandbox across the production cluster
* 09:00 akosiaris: upgrading php5 to 5.3.10-1ubuntu3.14+wmf1 on mw1212
* 08:34 _joe_: updating php-pear php5 php5-cli php5-common php5-curl php5-dev php5-intl php5-mysql php5-xmlrpc libapache2-mod-php5 on mw1018, see USN 2344-1
* 03:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 11 03:41:03 UTC 2014 (duration 41m 2s)
* 02:49 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-11 02:49:26+00:00
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-11 02:36:37+00:00
* 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-11 02:23:29+00:00
* 00:28 mutante: graceful'ed Apaches on mw1171, mw1187
* 00:25 logmsgbot: ori Synchronized wmf-config: Id607bf36d: Update remaining references to /u/l/a/common-local (duration: 00m 03s)
* 00:25 logmsgbot: ori Synchronized multiversion: Id607bf36d: Update remaining references to /u/l/a/common-local (duration: 00m 04s)
* 00:22 logmsgbot: ori Synchronized docroot and w: Id607bf36d: Update remaining references to /u/l/a/common-local (duration: 00m 04s)
* 00:07 logmsgbot: ori updated /a/common to {{Gerrit|Id607bf36d}}: Update remaining references to /u/l/a/common-local
 
== September 10 ==
* 23:44 mutante: graceful'ed mw1202 apache
* 23:29 mutante: deleted labstore1003.eqiad.wmnet.org from puppet stored resource db, fixes puppet runs on hosts with ssh host key collection
* 23:26 logmsgbot: oblivian gracefulled all apaches
* 23:22 logmsgbot: maxsem Synchronized php-1.24wmf20/includes/specialpage/SpecialPageFactory.php: https://gerrit.wikimedia.org/r/#/c/159526/ (duration: 00m 03s)
* 23:22 logmsgbot: maxsem Synchronized php-1.24wmf19/includes/specialpage/SpecialPageFactory.php: https://gerrit.wikimedia.org/r/#/c/159526/ (duration: 00m 03s)
* 23:21 logmsgbot: maxsem Synchronized php-1.24wmf20/extensions/CentralAuth/: (no message) (duration: 00m 03s)
* 23:21 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/CentralAuth/: (no message) (duration: 00m 04s)
* 23:19 logmsgbot: maxsem Synchronized php-1.24wmf10/resources/: https://gerrit.wikimedia.org/r/#/c/159513/ (duration: 00m 05s)
* 22:52 mutante: labstore1003 - (earlier) revoked salt and puppet key and signed new after hostname fix - same salt-minion puppet errors that happen after reinstalls
* 19:52 Reedy: Created Echo tables on extension1 for cawikimedia
* 19:51 RobH: puppet disabled on carbon (install server) for a livehack test of config setting
* 18:51 yurikR: yurik CommonSettings.php - zerowiki perm changes
* 18:51 logmsgbot: yurik Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 05s)
* 18:26 logmsgbot: yurik Synchronized php-1.24wmf20/extensions/ZeroBanner: (no message) (duration: 01m 09s)
* 18:22 logmsgbot: yurik Synchronized php-1.24wmf19/extensions/ZeroBanner: (no message) (duration: 01m 11s)
* 18:00 manybubbles: cirrus index rebuild for test2wiki went well - doing the rest of group0
* 17:35 manybubbles: rebuilding cirrus index for test2wiki to test that some performance enhancements don't break anything.  test2wiki is too small to see any gain from the enhancements though.
* 17:25 Reedy: mw1126, mw1116, mw1122, mw1146, mw1121, mw1136, mw1114, mw1068 have been gracefulled
* 17:10 bd808: Restarted logstash on logstash1001
* 16:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: nlwiki cirrus (duration: 00m 04s)
* 15:44 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 15s)
* 15:02 logmsgbot: demon Synchronized wmf-config/wikitech.php: no-op (duration: 00m 06s)
* 09:13 godog: rolling restart swift-proxy on ms-fe1*
* 04:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 10 04:17:36 UTC 2014 (duration 17m 35s)
* 03:08 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-10 03:07:59+00:00
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-10 02:36:00+00:00
* 02:28 ori: updated salt key for iridium and restarted salt-minion
* 02:18 mutante: started salt-minion on iridium
 
== September 9 ==
* 23:15 Krinkle: Reloading Zuul to deploy I26bc21ed2938e97e7ed6f6b
* 23:15 logmsgbot: demon Synchronized php-1.24wmf20/extensions/CirrusSearch: Various fixes for things (duration: 00m 05s)
* 23:00 mutante: added wikimedia.org to search in resolv.conf on terbium
* 22:42 logmsgbot: ebernhardson Synchronized wmf-config/InitialiseSettings.php: Deploy config change I158e7c6852 (duration: 00m 04s)
* 22:23 Krinkle: Reloading Zuul to deploy I27024680c74ca0130
* 22:21 logmsgbot: ebernhardson Finished scap: Bump Echo and Flow versions in 1.24wmf19 (duration: 31m 25s)
* 21:49 logmsgbot: ebernhardson Started scap: Bump Echo and Flow versions in 1.24wmf19
* 20:42 akosiaris: service gmetad restart on nickel.wikimedia.org due to ganglia web not working
* 20:15 cscott: updated OCG to version c9a2b4cf2502479eeabed07ab2de728695d96e46
* 19:05 mutante: killed jgonera's screen session on stat1002 - puppet failed to deactivate otherwise
* 18:46 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:32 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 18:31 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Add cawikimedia
* 18:28 logmsgbot: reedy Synchronized multiversion/: (no message) (duration: 00m 14s)
* 18:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf20
* 16:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: eswiki getting cirrus (duration: 00m 04s)
* 15:32 bblack: deploying large DNS change https://gerrit.wikimedia.org/r/#/c/158382/ - be on the lookout for any related fallout from here...
* 15:27 marktraceur: [SCAP] Deployed fix for oojs class names at James_F's behest, sorry for lack of message.
* 15:26 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/MobileFrontend/less/modules/editor/VisualEditorOverlay.less: (no message) (duration: 00m 07s)
* 15:08 logmsgbot: marktraceur Synchronized php-1.24wmf20/tests/phpunit/includes/changes/OldChangesListTest.php: [SWAT] Fix undefined argument (css classes) in OldChangesList. (duration: 00m 07s)
* 15:06 logmsgbot: marktraceur Synchronized php-1.24wmf20/includes/changes/OldChangesList.php: [SWAT] Fix undefined argument (css classes) in OldChangesList. (duration: 00m 07s)
* 11:29 _joe_: git.wikimedia.org works now, no action needed
* 11:26 MatmaRex: git.wikimedia.org is down: Error: 503, Service Unavailable
* 10:04 _joe_: also re-enabling puppet
* 10:02 _joe_: manually restarting apache on mw1178,mw1192,mw1163,mw1130,mw1018 as they started with the wrong pidfile before my fix
* 09:24 _joe_: disabling puppet on appservers
* 08:55 godog: launched "iptables" on tin to check current rules and it loaded iptables modules, logging for future reference
* 08:10 _joe_: re-enabling puppet on appservers and imagescalers, change is good
* 08:08 _joe_: restarted apache2 on mw1018
* 08:06 _joe_: stopping apache on mw1018 for inspection
* 07:36 _joe_: that was on appservers
* 07:36 _joe_: disabling puppet, releasing a potentially harmful apache change
* 04:56 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep  9 04:56:25 UTC 2014 (duration 56m 24s)
* 03:44 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-09 03:44:07+00:00
* 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-09 03:11:27+00:00
* 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-09 02:38:38+00:00
* 01:02 logmsgbot: ebernhardson Synchronized php-1.24wmf20/extensions/Flow/includes/Content/BoardContentHandler.php: Sync BoardContentHandler.php for Flow in 1.24wmf20 (duration: 00m 04s)
* 00:22 mutante: re-enabled mw1070 in pybal
* 00:19 logmsgbot: ebernhardson Finished scap: Repeat SWAT scap deployment due to possible sync-common failure (duration: 38m 50s)
 
== September 8 ==
* 23:59 ori: restarted rsync on mw1070 to unblock scap
* 23:40 logmsgbot: ebernhardson Started scap: Repeat SWAT scap deployment due to possible sync-common failure
* 23:39 logmsgbot: ebernhardson Finished scap: SWAT deploy updates to Flow, Echo and Thanks (duration: 24m 00s)
* 23:34 mutante: disabled mw1070 in pybal because it refused sync
* 23:31 ebernhardson: scap failed to connect to mw1070.  Repeated message: rsync: failed to connect to mw1070.eqiad.wmnet (10.64.16.50): Connection refused (111)
* 23:15 logmsgbot: ebernhardson Started scap: SWAT deploy updates to Flow, Echo and Thanks
* 23:02 logmsgbot: ebernhardson Synchronized wmf-config/InitialiseSettings.php: gerrit:159089 Enable $wgContentHandlerUseDB on mediawikiwiki, testwiki, & test2wiki (duration: 00m 05s)
* 20:14 subbu: deployed Parsoid ce108cb5
* 18:01 logmsgbot: demon Synchronized php-1.24wmf19/extensions/Wikidata: Updating Wikidata to f1d2110 (duration: 00m 09s)
* 17:19 mutante: disabled notifications for puppet freshness on neon
* 16:19 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: svwiki: Cirrus as primary (duration: 00m 04s)
* 15:42 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/Wikidata/: SWAT update wikidata to fix add links widget (duration: 00m 06s)
* 15:32 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/LiquidThreads/: SWAT update liquidthreads to fix some missing images (duration: 00m 04s)
* 15:28 manybubbles: 15:13:53 Synchronized php-1.24wmf19/extensions/WikiLove/: SWAT fix for WikiLove (duration: 00m 04s)
* 15:28 manybubbles: this is the missing log:
* 15:27 manybubbles: sync logging was down so it missed some syncing I just did.
* 15:25 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/WikiLove/: (no message) (duration: 00m 05s)
* 15:20 logmsgbot: manybubbles Synchronized wmf-config: SWAT another cirrus setting update (duration: 00m 04s)
* 15:10 logmsgbot: manybubbles Synchronized wmf-config: SWAT finish updating Cirrus settings (duration: 00m 05s)
* 15:10 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT update some cirrus settings (duration: 00m 04s)
* 15:10 cmjohnson1: shutting down neon for memory upgrade
* 14:41 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1072 (duration: 00m 09s)
* 12:23 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1073, depool db1072 (duration: 00m 06s)
* 11:07 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1073 (duration: 00m 09s)
* 10:55 _joe_: re-enabled puppet, the change results in a no-op as expected
* 10:42 _joe_: disabling puppet on all appservers while updating apache config.
* 04:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: move enwiki api traffic to db1051/db1066 (duration: 00m 09s)
* 03:37 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep  8 03:36:13 UTC 2014 (duration 36m 12s)
* 02:45 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-08 02:44:47+00:00
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-08 02:32:37+00:00
* 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-08 02:19:52+00:00
 
== September 7 ==
* 23:35 Tim: upgrading liblua everywhere
* 20:36 ori: mw1017: upgraded HHVM from 3.3-dev+20140728+wmf5 to 3.3-dev+20140728+wmf6
* 15:12 apergos: manually changed /etc/hosts entry on analytics1004 from having "analyticas1004.eqiad.wmnet" to "analytics1004.eqiad.wmnet"
* 06:15 godog: powercycle ms-be1005, not even responsive on console
* 03:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep  7 03:29:51 UTC 2014 (duration 29m 50s)
* 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-07 02:42:12+00:00
* 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-07 02:30:15+00:00
* 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-07 02:17:44+00:00
 
== September 6 ==
* 03:43 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep  6 03:42:22 UTC 2014 (duration 42m 21s)
* 02:51 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-06 02:50:35+00:00
* 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-06 02:37:41+00:00
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-06 02:24:35+00:00
 
== September 5 ==
* 23:28 logmsgbot: kaldari Synchronized wmf-config/mobile-labs.php: enabling wikigrok on beta labs (en only) (duration: 00m 03s)
* 23:28 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings-labs.php: enabling wikigrok on beta labs (en only) (duration: 00m 04s)
* 23:27 logmsgbot: kaldari updated /a/common to {{Gerrit|Iec209bde0}}: Map config var for $wgMFEnableWikiGrok
* 22:25 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings-labs.php: enabling wikigrok on beta labs (en only) (duration: 00m 05s)
* 22:25 logmsgbot: kaldari updated /a/common to {{Gerrit|I6039956eb}}: Enable Wikigrok prototype for beta labs (enwiki only)
* 22:24 awight: Deleted Light User and Merkle roles from the CRM
* 20:20 RobH: coms folks still accessing blog data on holmium, powering back up
* 20:18 bblack: restarted cp1056 bits cache and re-enabled in pybal
* 18:34 mark: Depooled cp1056 for testing
* 17:50 logmsgbot: ori Synchronized docroot and w: Iaa7518613: Fix spelling in symlink (duration: 00m 15s)
* 17:45 logmsgbot: ori Synchronized docroot and w: I55a01a712: Fix relative symlinks for bits/static-master (duration: 00m 13s)
* 13:00 Jeff_Green: lutetium dist-upgrade and reboot
* 12:04 legoktm: running extensions/GlobalCssJs/removeOldManualUserPages.php for [[m:GlobalCssJs]]
* 07:59 springle: dump es1007 to db1004, tokudb external storage page compression test. ok to kill in emergency
* 06:40 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool es1007 (duration: 00m 07s)
* 04:36 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep  5 04:35:11 UTC 2014 (duration 35m 10s)
* 04:27 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1002 (duration: 00m 06s)
* 04:01 springle: reboot es1002, fs check
* 03:47 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-05 03:46:28+00:00
* 03:46 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool es1002 for upgrade (duration: 00m 07s)
* 03:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1062 and db1068 (duration: 02m 06s)
* 03:10 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-05 03:09:20+00:00
* 02:56 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062 and db1068 for upgrade (duration: 00m 56s)
* 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-05 02:38:32+00:00
* 01:01 manybubbles: applied same elasticsearch configuration to dewiki, eswiki, zhwiki, and frwiki
* 00:18 manybubbles: configured elasticsearch to force enwiki's content shards to stay off of the same nodes.  That ought to help performance.
 
== September 4 ==
* 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/VisualEditor/: (no message) (duration: 00m 05s)
* 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/ZeroPortal/: (no message) (duration: 00m 04s)
* 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/Flow/: (no message) (duration: 00m 05s)
* 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/includes/specials/: (no message) (duration: 00m 04s)
* 23:31 logmsgbot: catrope Synchronized php-1.24wmf19/extensions/ZeroPortal: (no message) (duration: 00m 05s)
* 23:30 logmsgbot: catrope Synchronized php-1.24wmf19/extensions/Flow: (no message) (duration: 00m 05s)
* 22:52 logmsgbot: reedy Finished scap: consistency (duration: 20m 44s)
* 22:31 logmsgbot: reedy Started scap: consistency
* 22:28 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep  4 22:27:45 UTC 2014 (duration 54m 38s)
* 21:54 bd808: sync-dir failure was really on osmium, not mw1161; confusing error messages are confusing
* 21:50 bd808: Running sync-common on mw1161 to try and reproduce error seen during sync-file
* 21:43 logmsgbot: spage Synchronized wmf-config/InitialiseSettings.php: Enable Flow on pages, including frwiki and hewiki (duration: 00m 09s)
* 21:40 logmsgbot: spage updated /a/common to {{Gerrit|Ib0aaa60f0}}: Enable Flow on several pages
* 21:08 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-04 21:07:10+00:00
* 20:56 MaxSem: Running cleanupPageProps.php from terbium, now for realz
* 20:42 mutante: restarting icinga-wm, making it join #wikidata for custom output
* 20:16 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-04 20:15:35+00:00
* 19:56 Reedy: mw1088 and mw1100 rsync errors during the manual l10n update
* 19:25 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-04 19:23:57+00:00
* 18:32 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf20
* 18:26 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf19
* 18:11 logmsgbot: reedy Synchronized php-1.24wmf19: (no message) (duration: 00m 55s)
* 18:10 logmsgbot: reedy Synchronized php-1.24wmf20: (no message) (duration: 00m 35s)
* 18:09 logmsgbot: reedy Finished scap: testwiki to 1.24wmf20 and build l10n cache (duration: 41m 33s)
* 18:05 mutante: restarting service gitblit on antimony
* 17:48 RobH: correction, simply suppressing alerts for the host in icinga is the better move, as the host isn't reclaimed yet, so not removing holmium from puppetstoreddb
* 17:46 RobH: stopping puppet on holmium and removing it from puppetstoreddb so it doesn't show in icinga once updated
* 17:45 RobH: shutting down holmium, as blog has migrated for a month now.  Not yet wiping system, please leave for me (robh)
* 17:27 logmsgbot: reedy Started scap: testwiki to 1.24wmf20 and build l10n cache
* 16:44 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 17s)
* 16:36 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 15:53 logmsgbot: andrew Synchronized private/WikitechPrivateSettings.php: (no message) (duration: 00m 01s)
* 15:40 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 15:18 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: plwiki gets Cirrus (duration: 00m 06s)
* 14:56 bd808: ori updated scap to 773f95f (change deploy_dir to /srv/mediawiki) ~15 hours ago
* 08:16 _joe_: running sync-common on mw1017, trying to debug the hhvm bad state
* 06:37 godog: clear slowlog on elastic1004
* 05:25 jeremyb: temp hack fix deployed for morebots (here and labs, not the other instances)
* 04:47 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1035, warm up (duration: 00m 08s)
* 04:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep  4 04:31:28 UTC 2014 (duration 31m 27s)
* 03:43 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-04 03:42:34+00:00
* 03:13 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-04 03:11:58+00:00
* 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-04 02:40:45+00:00
* 01:08 mutante: production wants project name?
* 01:02 andrewbogott: the SAL still works, but the bot fails to acknowledge.  Something to do with a change on wikitech
* 00:59 andrewbogott: testing the log
* 00:43 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
 
== September 3 ==
* 23:52 logmsgbot: reedy Synchronized php-1.24wmf15/includes/EditPage.php: (no message) (duration: 00m 14s)
* 23:43 logmsgbot: ori Synchronized docroot and w: (no message) (duration: 00m 05s)
* 23:28 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 23:04 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/157855/ https://gerrit.wikimedia.org/r/#/c/158265/ (duration: 00m 04s)
* 21:09 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Disable GlobalUsage on labswiki (duration: 00m 15s)
* 20:59 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 20:47 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 15s)
* 20:45 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 15s)
* 20:38 logmsgbot: andrew Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 05s)
* 20:38 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 04s)
* 20:37 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 04s)
* 20:16 subbu: deployed Parsoid version 78e55c6b (deploy repo sha c0761179)
* 18:52 logmsgbot: yurik Synchronized wmf-config: enabling graph ext on zerowiki & collabwiki (duration: 01m 06s)
* 18:51 MaxSem: Running sync-common on mw1163
* 18:48 logmsgbot: yurik Synchronized php-1.24wmf18/extensions/Graph/: (no message) (duration: 01m 09s)
* 18:47 logmsgbot: yurik Synchronized php-1.24wmf19/extensions/Graph/: (no message) (duration: 01m 05s)
* 16:52 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 03s)
* 16:52 logmsgbot: andrew Synchronized private/WikitechPrivateLdapSettings.php: (no message) (duration: 00m 03s)
* 16:51 logmsgbot: andrew Synchronized private/WikitechPrivateSettings.php: (no message) (duration: 00m 05s)
* 16:51 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 03s)
* 16:19 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 03s)
* 16:18 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 04s)
* 16:16 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 05s)
* 15:41 _joe_: mw1020 correctly reimaged, putting it in the hhvm pool
* 15:27 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - Update another cirrus config - this time maybe it will work (duration: 00m 05s)
* 15:12 manybubbles: deployed throttling for Cirrus job named cirrusSearchLinksUpdate - it handles updating the index when a transcluded page changes - we'll have to check on the backlog over the next few hours/days to see if it stabilizes
* 15:11 logmsgbot: manybubbles Synchronized php-1.24wmf19/extensions/Wikidata/: (no message) (duration: 00m 07s)
* 15:07 manybubbles: mw1020 gets WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!  during sync-dir call
* 15:07 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy cirrus config changes - make sure to get mw1020 (duration: 00m 04s)
* 15:05 manybubbles: https://gerrit.wikimedia.org/r/#/c/157861/ didn't work as expected - dropped everything out of using the all field......
* 15:03 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy cirrus config changes (duration: 00m 06s)
* 14:53 cmjohnson1: running sync-common on mw1178
* 14:52 cmjohnson1: adding mw1178 back to pybal
* 12:42 _joe_: typo: mw1020, not mw1120
* 12:41 _joe_: mw1120: remove from pybal, schedule downtime, reimage to HAT
* 11:23 godog: run gmond on elastic1002 manually to debug ES collector issues
* 11:17 godog: run gmond on elastic1001 manually to debug ES collector issues
* 07:55 _joe_: re-enabling mw1192, what we were seeing was probably load and not anything else
* 06:56 ori: restarted memcached on virt1000 due to cache pollution from migration (different memc drivers w/different encoding)
* 04:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep  3 04:53:50 UTC 2014 (duration 53m 49s)
* 03:51 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-03 03:50:17+00:00
* 03:17 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-03 03:16:37+00:00
* 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-03 02:42:24+00:00
* 00:18 mutante: deleted PDF files older than 3d and a huge 1G one on ocg1001 in reaction to monitoring complaints
* 00:00 logmsgbot: ori Synchronized php-1.24wmf18/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 03s)
 
== September 2 ==
* 23:49 logmsgbot: ori Synchronized php-1.24wmf19/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 03s)
* 23:32 logmsgbot: reedy Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 13s)
* 23:22 logmsgbot: catrope Synchronized php-1.24wmf19/includes/OutputPage.php: 5094c0d9c (duration: 00m 05s)
* 23:14 Krinkle: Running extensions/GlobalCssJs/removeOldManualUserPages.php per [[m:GlobalCssJs]]
* 22:49 logmsgbot: reedy Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 14s)
* 22:37 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 13s)
* 22:26 logmsgbot: reedy Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 13s)
* 22:10 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
* 21:57 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 21s)
* 21:55 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 24s)
* 21:51 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 25s)
* 21:45 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 23s)
* 21:37 logmsgbot: reedy Synchronized wmf-config/db-eqiad.php: Wikitech db (duration: 00m 22s)
* 21:34 logmsgbot: bd808 Finished scap: no-op scap to build l10n for wikitech (duration: 55m 48s)
* 20:39 logmsgbot: bd808 Started scap: no-op scap to build l10n for wikitech
* 20:35 logmsgbot: bd808 Synchronized wmf-config/wikitech.php: eebc99a Require before instatiate (duration: 00m 04s)
* 20:31 logmsgbot: bd808 Synchronized private/PrivateSettings.php: Absolute path for WikitechPrivateSettings.php (duration: 00m 05s)
* 20:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
* 20:05 MaxSem: Running cleanupPageProps.php everywhere
* 19:51 MaxSem: Running cleanupPageProps.php on mw.org and meta
* 19:14 logmsgbot: reedy Synchronized wmf-config/Wikibase.php: Bump epoch (duration: 00m 14s)
* 19:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf19, added labswiki too
* 18:56 logmsgbot: bd808 Synchronized fishbowl.dblist: Add labswiki (wikitech) (duration: 00m 05s)
* 18:25 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 08s)
* 17:49 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 14s)
* 17:39 logmsgbot: andrew Synchronized multiversion/MWMultiVersion.php: (no message) (duration: 00m 04s)
* 17:14 logmsgbot: andrew Finished scap: Deploying wikitech config (duration: 33m 03s)
* 17:01 bd808: Fetched f711ea7 to /a/common on tin; not syncing because of in-process scap.
* 16:41 logmsgbot: andrew Started scap: Deploying wikitech config
* 16:21 logmsgbot: andrew scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.SCRILhxGxO" ' returned non-zero exit status 1 (duration: 01m 17s)
* 16:20 logmsgbot: andrew Started scap: Deploying wikitech config
* 16:17 ottomata: installing newer version of webstatscollector on oxygen and gadolinium, restarting filter process on oxygen
* 16:08 logmsgbot: andrew Synchronized /a/common/private/WikitechPrivateSettings.php: (no message) (duration: 00m 04s)
* 16:07 logmsgbot: andrew Synchronized /a/common/private/PrivateSettings.php: (no message) (duration: 00m 03s)
* 16:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Commons gets Cirrus as primary (duration: 00m 04s)
* 15:44 godog: bring mw1114 -> mw1131 to weight 15
* 15:21 logmsgbot: marktraceur Synchronized wmf-config/: [SCAP] SpecialCite is now CiteThisPage (duration: 00m 07s)
* 15:17 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: [SCAP] Enable the TemplateData GUI editor on Norwegian Wikipedia (duration: 00m 07s)
* 15:14 logmsgbot: marktraceur updated /a/common to {{Gerrit|Ia1758b21e}}: depool db1035 for upgrade, move s3 vslow/dump to db1019
* 15:06 logmsgbot: marktraceur Synchronized php-1.24wmf19/includes/EditPage.php: [SCAP] Revert "Toolbar: Only show on WikiText pages" (duration: 00m 07s)
* 15:05 logmsgbot: marktraceur Synchronized php-1.24wmf18/includes/EditPage.php: [SCAP] Revert "Toolbar: Only show on WikiText pages" (duration: 00m 08s)
* 12:36 godog: increase weight to 15 for mw1132 -> mw1148
* 10:00 _joe_: depooling mw1192, high CPU temperatures; we may need to check fan status
* 07:20 _joe_: powercycling mw1192, blank console, unresponsive
* 07:02 springle: removed all-but-latest large slow logs on elastic1004 and elastic1014
* 06:22 springle: removed txt files filling up db1047 /tmp, looked like analytics SELECT INTO OUTFILE, dated mid-August
* 05:58 springle: dump s3 db1035 to db1069:3313
* 05:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1067, warm up (duration: 00m 08s)
* 04:59 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1067 (duration: 00m 07s)
* 03:35 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1035 (duration: 00m 07s)
* 03:16 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep  2 03:15:22 UTC 2014 (duration 15m 21s)
* 02:53 springle: restarted dbstore1002 mysqld for upgrade
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-02 02:25:36+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-02 02:14:30+00:00
 
== September 1 ==
* 23:00 Krinkle: Running extensions/GlobalCssJs/removeOldManualUserPages.php per [[m:GlobalCssJs]]
* 21:50 ori: disabled gerrit account Caothu9669; spam
* 19:12 Reedy: Deleted php-1.24wmf[6-8] from apaches via dsh
* 19:01 logmsgbot: reedy Purged l10n cache for 1.24wmf13
* 19:01 logmsgbot: reedy Purged l10n cache for 1.24wmf14
* 19:00 logmsgbot: reedy Purged l10n cache for 1.24wmf15
* 18:59 logmsgbot: reedy Purged l10n cache for 1.24wmf16
* 18:58 logmsgbot: reedy Purged l10n cache for 1.24wmf17
* 16:44 ottomata: removed some large slow query logs from elastic* nodes, need to look into this...
* 12:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1044, take 2 (duration: 00m 06s)
* 12:04 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1044 (duration: 00m 06s)
* 11:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: pool db1044, warm up (duration: 00m 06s)
* 09:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1027 (duration: 00m 07s)
* 07:27 godog: deploy latest ring to swift eqiad-prod
* 07:11 godog: powercycle ms-be1010 "cpu soft lockup" on console
* 05:28 springle: xtrabackup clone db1027 to db1044
* 05:26 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1027 while cloning (duration: 00m 07s)
* 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep  1 03:14:22 UTC 2014 (duration 14m 21s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-01 02:27:39+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-01 02:16:18+00:00
 
== August 31 ==
* 19:56 hashar: Jenkins updated HHVM to (3.3-dev+20140728+wmf5) over (3.3-dev+20140728+wmf4)
* 14:38 bblack: restarted apache on strontium
* 14:34 bblack: restarted apache on tungsten, machine is overloaded
* 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 31 03:14:41 UTC 2014 (duration 14m 40s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-08-31 02:28:42+00:00
* 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-31 02:17:24+00:00
* 02:00 ori: Stopped HHVM jobrunner and disabled Puppet on mw1053 due to bug 70177.
 
== August 30 ==
* 08:18 godog: restart mailman on sodium, pending https://gerrit.wikimedia.org/r/#/c/156766/
* 08:01 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1071, warm up (duration: 00m 07s)
* 06:33 jgage: analytics1021 back in service after election
* 06:14 jgage: upgraded & rebooted analytics1021
* 06:01 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: return db1037 to normal load (duration: 00m 06s)
* 05:03 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1001 (duration: 00m 06s)
* 04:06 springle: upgrade es1001 to mariadb 10
* 03:56 springle: xtrabackup clone db1037 to db1071
* 03:55 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: reduce db1037 load while cloning (duration: 00m 06s)
* 03:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: pool db1073, depool db1071 (duration: 00m 07s)
* 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 30 03:17:57 UTC 2014 (duration 17m 56s)
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-08-30 02:32:06+00:00
* 02:21 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-30 02:20:39+00:00
 
== August 29 ==
* 21:15 logmsgbot: ori Synchronized wmf-config: I812c0bb6c: Scrap unused Twemproxy config files (duration: 00m 04s)
* 21:12 logmsgbot: ori updated /a/common to {{Gerrit|I812c0bb6c}}: Scrap unused Twemproxy config files
* 21:03 mutante: restarted uwsgi on tungsten
* 20:49 mutante: tungsten extremely busy, graphite down, logging in since 5 minutes :p
* 20:48 mutante: powercycling ms-be1006 - BUG: soft lockup - CPU#0 stuck ...
* 19:19 mutante: installing package upgrades on iron, bast1001
* 18:41 logmsgbot: aaron Synchronized php-1.24wmf18/maintenance/findMissingFiles.php: 994d4a556a070156fd04fb4951492f10696cc63c (duration: 00m 03s)
* 18:30 logmsgbot: ori Synchronized php-1.24wmf19/resources/src/mediawiki.action/mediawiki.action.view.redirect.js: I19221a25a: mediawiki.action.view.redirect: Work around a IE 10+ HTML5 history API bug (duration: 00m 06s)
* 18:30 logmsgbot: ori Synchronized php-1.24wmf18/resources/src/mediawiki.action/mediawiki.action.view.redirect.js: I19221a25a: mediawiki.action.view.redirect: Work around a IE 10+ HTML5 history API bug (duration: 00m 07s)
* 15:36 hashar_: Jenkins: pooled a new slave 10.68.16.162 as wikidata-jenkins3 on behalf of addshore / wmde
* 15:04 _joe_: shutting down mw1163, filed RT 8243 for repair.
* 14:54 _joe_: re-enabled mw1130
* 14:41 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable Wikibase badges css, follow up from last night deploy (duration: 00m 06s)
* 14:22 _joe_: syncing mw1130
* 14:06 _joe_: disable mw1130 from the api pool while it gets resynced
* 12:30 Krinkle: Running extensions/GlobalCssJs/removeOldManualUserPages.php per [[m:GlobalCssJs]]
* 11:06 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1070 (duration: 00m 09s)
* 08:04 hashar: Jenkins: in the jenkins-job-builder-config branch 'cloudbees' has been merged in 'master'. Unifying CI and browser tests jobs!  \O/
* 07:05 _joe_: re-enabling puppet on the jobrunner, to check if the luasandbox fix works
* 06:33 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: return db1056 to normal load (duration: 00m 06s)
* 04:14 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 29 04:13:03 UTC 2014 (duration 13m 2s)
* 03:31 springle: xtrabackup clone db1056 to db1070
* 03:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: reduce db1056 load while cloning (duration: 00m 06s)
* 03:15 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-08-29 03:10:26+00:00
* 02:48 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1070. pool db1072. (duration: 00m 07s)
* 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-29 02:37:20+00:00
* 01:49 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1070. pool db1072. (duration: 00m 06s)
* 01:27 godog: repool ms-fe1002
* 01:06 cmjohnson1: shutting down ms-fe1002 to relocate racks
* 01:04 godog: depool ms-fe1002
* 01:02 godog: repool ms-fe1001
* 00:57 logmsgbot: ori Synchronized php-1.24wmf18/extensions/WikimediaEvents: Ib44fe0898: Inject 'wgPoweredByHHVM' JS config var if powered by HHVM (duration: 00m 03s)
* 00:56 logmsgbot: ori Synchronized php-1.24wmf19/extensions/WikimediaEvents: Ib44fe0898: Inject 'wgPoweredByHHVM' JS config var if powered by HHVM (duration: 00m 04s)
* 00:38 cmjohnson1: shutting down ms-fe1001 for rack relocation
* 00:34 godog: depool ms-fe1001
* 00:32 godog: repool ms-fe1004
* 00:27 mutante: restarting gmetad on nickel
* 00:04 cmjohnson1: shutting down ms-fe1004 to relocate racks
 
== August 28 ==
* 23:58 godog: depool ms-fe1004
* 23:51 godog: repooling ms-fe1003
* 23:40 logmsgbot: maxsem Synchronized php-1.24wmf19/maintenance/: https://gerrit.wikimedia.org/r/#/c/156979/ (duration: 00m 04s)
* 23:39 logmsgbot: maxsem Synchronized php-1.24wmf19/includes/: https://gerrit.wikimedia.org/r/#/c/156979/ (duration: 00m 06s)
* 23:38 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/156994/ (duration: 00m 05s)
* 23:36 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/Echo: https://gerrit.wikimedia.org/r/#/c/157008/ (duration: 00m 04s)
* 23:35 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/Thanks/: https://gerrit.wikimedia.org/r/#/c/156898/ (duration: 00m 04s)
* 23:34 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/156968/ (duration: 00m 05s)
* 23:30 logmsgbot: maxsem Synchronized php-1.24wmf18/extensions/GlobalCssJs/: https://gerrit.wikimedia.org/r/#/c/157009/ (duration: 00m 04s)
* 23:27 K4-713: Updated fraud filters on payments
* 22:52 logmsgbot: aaron Synchronized php-1.24wmf18/maintenance/findMissingFiles.php: (no message) (duration: 00m 07s)
* 22:15 mutante: restarted tools.morebots production instance - can i log now?
* 22:13 cmjohnson1: ms-fe1003 down for relocation
* 22:13 mutante: test
* 21:15 robh: bast2001.wikimedia.org now online in codfw.
* 21:15 robh: i never admin logged when install2001.wikimedia.org went online the other day, oops.
* 21:15 ori: last sync was of Iac37a2369: resourceloader: Don't register raw modules client-side
* 21:14 logmsgbot: ori Synchronized php-1.24wmf18/includes/resourceloader/ResourceLoaderStartUpModule.php: (no message) (duration: 00m 03s)
* 20:57 logmsgbot: krinkle Synchronized php-1.24wmf19/includes/resourceloader/ResourceLoaderStartUpModule.php: fd5b963458c19 (duration: 00m 06s)
* 20:33 ottomata: shutting down elastic1016
* 20:16 ottomata: temporarily disable puppet on gadolinium
* 19:36 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 19:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf19
* 19:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf18
* 19:07 logmsgbot: reedy Finished scap: testwiki to 1.24wmf19 (duration: 43m 00s)
* 18:25 godog: install build-essential and fakeroot on tin
* 18:24 logmsgbot: reedy Started scap: testwiki to 1.24wmf19
* 17:26 logmsgbot: aaron Synchronized rpc: 9564e93ecd4953126d91b99d7728f63401a4dc86 (duration: 00m 07s)
* 17:13 ^d: elastic: excluded the elastic1016 node from shard allocation, shards draining so we can take it down for disk testing
* 16:01 ottomata: restarted webstats-collector on gadolinium
* 13:18 mark: Reactivated cr2-eqiad AS3257 transit link
* 10:44 springle: xtrabackup clone db1051 to db1073
* 10:18 godog: restarting mailman on sodium
* 08:52 godog: restarted apache on mw1134
* 08:03 godog: killed stray mailman processes on sodium (no pid file) and restarted mailman
* 06:11 springle: xtrabackup clone db1051 to db1072
* 06:09 springle: restarted morebots
 
== August 26 ==
* 21:04 hashar: Updating our Jenkins Job Builder fork 0268581..e5c0c61 . Will let us define variables in 'default' section and override them when invoking a job template ( https://review.openstack.org/#/c/100020/ )
* 19:58 bd808: Ran sync-common on mw1053.eqiad.wmnet to recover from failure during last scap
* 19:48 logmsgbot: aude Finished scap: Update new messages for Wikibase (duration: 07m 16s)
* 19:41 logmsgbot: aude Started scap: Update new messages for Wikibase
* 19:39 logmsgbot: aude Synchronized wmf-config/Wikibase.php: add Wikibase badges css setting (duration: 00m 10s)
* 19:25 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable new serialization format for wikidata (duration: 00m 08s)
* 19:10 logmsgbot: reedy Synchronized php-1.24wmf18/extensions/Echo/: (no message) (duration: 00m 14s)
* 19:05 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable otherprojects sidebar beta feature (duration: 00m 15s)
* 18:54 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf18
* 18:53 logmsgbot: reedy Synchronized php-1.24wmf18/extensions/MassMessage: (no message) (duration: 00m 14s)
* 18:52 logmsgbot: reedy Synchronized php-1.24wmf17/extensions/MassMessage: (no message) (duration: 00m 16s)
* 18:19 jgage: Failover from analytics1010-eqiad-wmnet to analytics1004-eqiad-wmnet successful
* 17:47 logmsgbot: bd808 Synchronized private/PrivateSettings.php: Syncing file rather than symlink (duration: 00m 04s)
* 17:36 bd808: mw1010.eqiad.wmnet was out of sync too. I suspect there is something wrong with the fanout update step in scap
* 17:26 bd808: /usr/local/apache/common-local out of date on mw1161.eqiad.wmnet; updated via sync-common
* 17:25 bd808: sync-* not updating terbium properly; sync-common from terbium manually got several config changes; maybe a problem with mw1161.eqiad.wmnet rsync mirror
* 17:14 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 04s)
* 17:11 logmsgbot: demon Synchronized wmf-config/PrivateSettings.php: adjust swift auth url for cirrus (duration: 00m 04s)
* 17:05 cmjohnson: swapping failed disk labsdb1003 slot 1
* 16:42 bd808: Ran sync-common on osmium to verify that it now rebuilds l10n cache by default (and it does!)
* 16:36 legoktm: running removeOldManualUserPages.php (GlobalCssJs) for users who requested it
* 16:29 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Again, with feeling (duration: 00m 04s)
* 16:26 logmsgbot: bd808 Finished scap: no-op scap to test scap code update (duration: 13m 31s)
* 16:20 bd808|DEPLOY: Rsync sloooow to fenari "16:18:52 fenari  INFO    - Finished rsync common (duration: 04m 38s)"
* 16:12 logmsgbot: bd808 Started scap: no-op scap to test scap code update
* 16:07 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 04s)
* 16:07 bd808|DEPLOY: Updated scap to 116027f (Make sync-common update l10n cdb files by default)
* 15:05 logmsgbot: anomie Synchronized wmf-config: SWAT: Enable GlobalCssJs on all CentralAuth wikis minus loginwiki [[gerrit:154432]] (duration: 00m 09s)
* 13:32 hashar: Jenkins mediawiki-core-qunit job has been switched to Zuul cloner and pass! :-D
* 13:29 _joe_: re-enabling puppet, change aborted as not all sites are served via hhvm on the hhvm appservers (true story). Will re-do once all configs are in their place
* 13:12 _joe_: disabling puppet on all appservers while deploying an apache change
* 12:48 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: db1054 to normal load (duration: 00m 06s)
* 12:33 hashar: Jenkins reverted mediawiki-core-qunit to use Zuul cloner {{gerrit|156268}}.  Gotta play with it on a new job name since it does not work out of the box as expected.
* 12:12 hashar: Jenkins migrating mediawiki-core-qunit to use Zuul cloner {{gerrit|156268}}
* 12:03 akosiaris: disable puppet on labsdb1006 for planet osm import
* 11:53 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: pool db1054, warm up (duration: 00m 08s)
* 09:04 godog: reboot ms-be1011, unresponsive on network and console
* 08:28 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1036 (duration: 00m 06s)
* 05:41 springle: xtrabackup clone db1036 to db1054
* 05:39 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1036 while cloning (duration: 00m 06s)
* 05:28 springle: upgrade & restart db1054, fs check
* 04:48 logmsgbot: demon Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 06s)
* 04:27 springle: labsdb1002 back up
* 04:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 26 04:06:34 UTC 2014 (duration 6m 33s)
* 03:23 ^d: restarting elasticsearch on elastic1001, elastic1003 and elastic1008. icinga may complain briefly.
* 03:11 springle: filesystem issues on labsdb1002. stopped mysqld
* 03:05 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-26 03:04:18+00:00
* 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-26 02:33:00+00:00
 
== August 25 ==
* 23:58 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Add 'movefile' to 'eliminator' user group on jawiki (duration: 00m 03s)
* 23:53 logmsgbot: maxsem Finished scap: SWAT: CentralNotice update (duration: 29m 58s)
* 23:23 logmsgbot: maxsem Started scap: SWAT: CentralNotice update
* 23:19 logmsgbot: maxsem Synchronized php-1.24wmf17/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/156188/ (duration: 00m 05s)
* 23:17 logmsgbot: maxsem Synchronized php-1.24wmf18/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/156188/ (duration: 00m 04s)
* 23:15 logmsgbot: maxsem Synchronized php-1.24wmf18/includes/htmlform/HTMLCheckField.php: https://gerrit.wikimedia.org/r/#/c/156015/ (duration: 00m 05s)
* 20:06 subbu: deployed parsoid version 5b5a5ed5
* 17:24 godog: reboot ms-be1004 to pick up kernel upgrade
* 17:13 godog: rebooting ms-be1002 to pick up updated kernel
* 16:54 ottomata: stopping puppet on cp3021.  Testing an increase of kafka.queue.buffering.max.ms in order to avoid dropping messages during broker metadata change (e.g. leader elections)
* 16:48 hashar: Jenkins pooled in a new slave [https://integration.wikimedia.org/ci/computer/wdjenkins-node1/ wdjenkins-node1] that will be used to run Wikidata jenkins jobs.  Work in progress with addshore.  It is not running jobs yet.
* 16:47 godog: reboot ms-be1011, xfsaild errors in dmesg
* 16:25 hashar: Jenkins: disconnecting and reconnecting Gearman plugin from https://integration.wikimedia.org/ci/configure
* 16:06 andrewbogott: wikitech deployment finished.  Note that the OpenStackManager submodule is off of the MediaWiki branch because… the whole submodule setup there is a bit broken on account of a git bug that uses absolute paths to manage submodules.
* 16:01 andrewbogott: deploying tiny OpenStackManager upgrade on wikitech
* 15:58 ottomata: enabled elasticsearch shard allocation row awareness (via rest api)
* 12:45 hashar: hard stopped/restarted Zuul (workflow config error)
* 12:27 hashar: restarting zuul
* 10:15 mark: setup cross-confederation BGP sessions from AS65001 (eqiad) to AS65002 (codfw)
* 05:35 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: {{gerrit|156076}} - Remove centralnotice-admin right assignments on 3 wikis - Basically a noop (duration: 00m 06s)
* 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 25 03:14:13 UTC 2014 (duration 14m 12s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-25 02:25:58+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-25 02:14:14+00:00
 
== August 24 ==
* 23:17 ^d: slow indexing log going pretty bonanzas on elastic101[35]. Probably others too? Filling /var/log.
* 12:02 mark: Removed IPv6 subnet 2620:0:860:2::/64 from cr2-pmtpa:irb.101
* 03:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 24 03:17:46 UTC 2014 (duration 17m 45s)
* 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-24 02:31:13+00:00
* 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-24 02:18:52+00:00
 
== August 23 ==
* 11:33 mark: Manually removed IPv6 addresses from fenari
* 11:23 mark: Deactivated IPv6 router-advertisement on cr2-pmtpa
* 11:21 mark: Manually removed IPv6 address from mchenry
* 10:36 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1004. pool db1053. (duration: 00m 07s)
* 03:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 23 03:06:13 UTC 2014 (duration 6m 12s)
* 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-23 02:21:59+00:00
* 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-23 02:17:28+00:00
* 01:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1056 (duration: 00m 06s)
* 00:29 ori: disabled puppet on osmium again to debug a leak; please don't re-enable
 
== August 22 ==
* 18:10 logmsgbot: ori updated /a/common to {{Gerrit|I338d72a47}}: Do not define MEDIAWIKI before loading WebStart.php
* 17:43 ottomata: moving sqstat udp2log filter from analytics1003 to analytics1026, reqstats might blip for a sec...
* 17:41 ori: nuking /srv/deployment/rcstream on rcs1002 to verify trebuchet package provider reprovisions it
* 15:54 springle: xtrabackup db1056 to db1053
* 15:53 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1056 while cloning (duration: 00m 07s)
* 15:33 ^d: elastic1008: fixed /etc/hosts to point to actual IP instead of loopback
* 15:18 springle: upgrade & restart db1053, fs check
* 15:08 bd808: Still no apache2.log on fluorine or in logstash. Log seems to be available on fenari.
* 14:51 springle: switched s1 sanitarium and labsdb replication to db1069:3311 mariadb 10
* 14:39 mark: Removed IPv6 subnets 2620:0:860:1::/64 (squid subnet) and 2620:0:860:3::/64 (sandbox subnet) from cr2-pmtpa configuration
* 04:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 22 04:10:47 UTC 2014 (duration 10m 46s)
* 03:19 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-22 03:18:44+00:00
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-22 02:32:11+00:00
* 00:03 logmsgbot: ori Finished scap: SWAT: d3de89777, 7abfe0d5e7, 8ec9853c32b, 476e9e90bd01 (duration: 06m 29s)
 
== August 21 ==
* 23:57 logmsgbot: ori Started scap: SWAT: d3de89777, 7abfe0d5e7, 8ec9853c32b, 476e9e90bd01
* 21:58 logmsgbot: ori Synchronized php-1.24wmf17/resources/src/mediawiki/mediawiki.js: I8d27442d1: Workaround for bug introduced by Icf6ede09b (duration: 00m 03s)
* 21:57 manybubbles: performing elasticsearch upgrade on elastic1015
* 21:02 logmsgbot: ori Synchronized php-1.24wmf17/resources/src/mediawiki/mediawiki.util.js: Touch resources/src/mediawiki/mediawiki.util.js (duration: 00m 06s)
* 20:44 godog: rolling restart of swift-proxy on ms-fe1*
* 20:11 godog: restarted swift-proxy on ms-fe1001
* 19:55 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
* 19:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf18
* 19:46 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf17
* 19:44 logmsgbot: reedy Finished scap: testwiki to 1.24wmf18 (duration: 34m 01s)
* 19:31 mutante: disabled mw1178 in pybal
* 19:27 godog: restarted memcached on ms-fe1004
* 19:23 reedy|webirc: mw1178 returned [255]: ssh: connect to host mw1178 port 22: Connection timed out
* 19:23 reedy|webirc: mw1019 returned [127]: bash: sync-common: command not found
* 19:09 logmsgbot: reedy Started scap: testwiki to 1.24wmf18
* 18:28 manybubbles: *victim*
* 18:27 manybubbles: trying to recover from weird Elasticsearch upgrade failure by redoing the upgrade on one node while also blowing away the data directory during the upgrade.  elastic1005, you are my first victem.
* 17:28 cmjohnson1: removing mw1130 from pybal
* 14:53 hashar: Jenkins: updated PHP CodeSniffer MediaWiki standard on all slaves.
* 14:36 hashar_: Jenkins: updating mediawiki code sniffer repo bf82117..bc4e590
* 10:02 hashar: Jenkins installed plugin [https://wiki.jenkins-ci.org/display/JENKINS/Throttle+Concurrent+Builds+Plugin Throttle Concurrent Builds].
* 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Aug 21 03:20:47 UTC 2014 (duration 20m 46s)
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-21 02:34:56+00:00
* 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-21 02:19:26+00:00
* 00:08 MatmaRex: (manybubbles contd.) …a single node going down but I expect the cluster to stay "yellow" during the process- no alerts.
* 00:07 manybubbles: bd808 needs to plan a logstash upgrade soon - let it be logged
* 00:05 manybubbles: if anyone is reading the SAL for fun or sees an error in Elasticsearch cluster in the next 24 hours - we're performing an elasticsearch upgrade.  We've set it up this time so it's super slow and boring.  So boring I'm going to sleep through it.  If you see more than transient complaining from icinga about elasticsearch you can call me/have someone with access to the contact list call me.  I expect icinga to complain about a
* 00:00 manybubbles: unattended rolling restart of Elasticsearch cluster is going just fine - adding the 30 minute sleep between servers and turning down the replication rate makes it pretty boring. 
 
== August 20 ==
* 23:07 awight: stopping the Thank You job
* 22:50 ori: disabled puppet on osmium to debug memory leak
* 21:46 logmsgbot: marktraceur Synchronized php-1.24wmf17/extensions/MultimediaViewer/: Add disable-by-default option to MultimediaViewer (duration: 00m 07s)
* 21:09 logmsgbot: marktraceur Synchronized wmf-config: Turn off Media Viewer for logged-in users at Commons. (duration: 00m 07s)
* 21:06 logmsgbot: marktraceur updated /a/common to {{Gerrit|I226bd1468}}: Add item-redirect to OAuth permissions
* 19:50 hashar: Restarting Zuul to prettify build results {{bug|66095}}
* 19:48 logmsgbot: awight Synchronized php-1.24wmf17/extensions/CentralNotice: push CentralNotice updates, including new hide cookie format (duration: 00m 05s)
* 19:47 logmsgbot: awight Synchronized php-1.24wmf16/extensions/CentralNotice: push CentralNotice updates, including new hide cookie format (duration: 00m 04s)
* 19:46 logmsgbot: awight Synchronized php-1.24wmf16/extensions/CentralNotice: push CentralNotice updates, including new hide cookie format (duration: 00m 07s)
* 16:11 manybubbles: elastic1001 upgrade went well - upgrading elastic1002 now
* 15:48 hashar: dns: Jenkins will now complain whenever you attempt to send tabs in any file of operations/dns.git {{bug|69478}}
* 15:17 manybubbles: manually lowered elasticsearch recovery speeds to stem off high load caused by healing the restart of elastic1001 - we were slowing down enough that we were filling the pool counter
* 15:05 logmsgbot: anomie Synchronized wmf-config/CommonSettings.php: SWAT: Add item-redirect to OAuth permissions [[gerrit:155257]] (duration: 00m 09s)
* 15:01 logmsgbot: anomie Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/: SWAT: Touch files on advice of Wikidata folks (duration: 00m 09s)
* 15:01 logmsgbot: anomie Synchronized wmf-config/Wikibase.php: SWAT: Fix config for specialSiteLinkGroups in Wikibase [[gerrit:155218]] (duration: 00m 09s)
* 14:49 manybubbles: installing elasticsearch 1.3.2 on elasticsearch1001 only right now as a test
* 14:47 manybubbles: upgrading elasticsearch plugins on all elasticsearch servers in preparation to upgrade to elasticsearch 1.3 - if we roll back we'll have to redeploy the plugins
* 14:10 ottomata: changing group ownership and permissions on raw webrequest data in hdfs.  Users now must be in the analytics-privatedata-users group to access.
* 13:47 manybubbles: experimenting with lowering merge factor on enwiki's Cirrus index - should improve query performance at the cost of more background tasks in the Elasticsearch cluster
* 13:36 ottomata: disabling puppet on analytics1027 temporarily
* 13:10 godog: reboot ms-be1003, xfs errors/panics
* 12:03 logmsgbot: ori updated /a/common to {{Gerrit|Ic3fe1ef83}}: Update all symlinks to /apache
* 11:36 hashar: Updating Jenkins Job Builder fork 666e953..0268581
* 11:06 hashar_: mw1019 is missing sync-common causing sync issues.
* 11:06 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: new domain www.veikkos-archiv.com to wgCopyUploadsDomains {{gerrit|155239}} {{bug|69777}} (duration: 00m 03s)
* 11:05 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: new domain www.veikkos-archiv.com to wgCopyUploadsDomains {{gerrit|155239}} {{bug|69777}} (duration: 00m 03s)
* 10:33 logmsgbot: ori Synchronized w/touch.php: Ic9d8837b1: Canonicalize some remaining references to /apache symlink (duration: 00m 05s)
* 10:33 logmsgbot: ori Synchronized w/mobilelanding.php: Ic9d8837b1: Canonicalize some remaining references to /apache symlink (duration: 00m 05s)
* 10:26 logmsgbot: ori updated /a/common to {{Gerrit|Ic9d8837b1}}: Canonicalize some remaining references to /apache symlink
* 10:16 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Id2d5cfa4c: Canonicalize path to $wgSiteMatrixFile (duration: 00m 06s)
* 09:40 godog: uploaded hhvm_3.3-dev+20140728+wmf5 to carbon
* 09:27 hashar: restarted Jenkins Gearman plugin.
* 04:06 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Aug 20 04:05:41 UTC 2014 (duration 5m 40s)
* 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-20 03:11:08+00:00
* 02:40 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-20 02:39:40+00:00
 
== August 19 ==
* 23:17 logmsgbot: catrope Synchronized php-1.24wmf17/extensions/MobileFrontend: (no message) (duration: 00m 04s)
* 23:14 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Set wmgWikibaseSiteGroup for wikinews (duration: 00m 05s)
* 22:59 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Fix badges css on Wikidata (duration: 00m 11s)
* 22:30 logmsgbot: aude Finished scap: Update Wikidata, WikimediaMessages and ZeroBanner (duration: 22m 02s)
* 22:08 logmsgbot: aude Started scap: Update Wikidata, WikimediaMessages and ZeroBanner
* 22:03 logmsgbot: aude Synchronized php-1.24wmf17/extensions/ZeroBanner: Update, per yurik (duration: 00m 18s)
* 21:22 logmsgbot: aude Synchronized wikidataclient.dblist: Enable Wikibase on Wikinews (duration: 00m 08s)
* 21:21 logmsgbot: aude Synchronized wmf-config: Config changes to enable Wikibase on Wikinews (duration: 00m 14s)
* 21:12 aude: added and populated sites table for wikinews
* 21:05 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Fix badges css and populateSitesTable script in Wikibase (duration: 00m 14s)
* 20:26 RoanKattouw: Restarting Jenkins, it seems to be stuck
* 19:58 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: enabling Wikidata to also be a client, e.g. use lua (duration: 00m 09s)
* 19:58 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enabling Wikidata to also be a client, e.g. use lua (duration: 00m 12s)
* 19:53 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/resources/: (no message) (duration: 00m 11s)
* 19:48 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/modules/SitesModule.php: (no message) (duration: 00m 09s)
* 19:48 logmsgbot: aude Synchronized wmf-config/Wikibase.php: fix config for special site links on Wikidata (duration: 00m 11s)
* 19:37 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/modules/SitesModule.php: (no message) (duration: 00m 11s)
* 19:26 logmsgbot: aude Synchronized wmf-config/Wikibase.php: allow adding site links to Wikidata (non-entity) pages on Wikidata (duration: 00m 08s)
* 19:21 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable item redirects on Wikidata (duration: 00m 08s)
* 19:16 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable badges on Wikidata (duration: 00m 08s)
* 19:06 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: group1 back to wmf17
* 19:05 logmsgbot: demon Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/changes/EntityChange.php: (no message) (duration: 00m 05s)
* 18:33 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: all group1 back to wmf16 until WB patch comes
* 18:22 andrewbogott: added virt1009 to the eqiad virt cluster
* 18:17 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki back to wmf16
* 18:13 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to wmf17
* 15:17 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: CopyUploadDomains for Commons [[gerrit:154718]] (duration: 00m 12s)
* 15:15 logmsgbot: anomie Synchronized commonsuploads.dblist: SWAT: Remove emlwiki from commonsuploads.dblist [[gerrit:154714]] (duration: 00m 09s)
* 15:13 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Affiliate namespace on chapcomwiki [[gerrit:154713]] (for real this time) (duration: 00m 09s)
* 15:12 mark: Completed network migration of BGP confederation renumbering: AS65002 -> AS65001, AS65003 -> AS65004, old AS65001 (pmtpa) is part of eqiad for its remaining lifetime
* 15:12 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Affiliate namespace on chapcomwiki [[gerrit:154713]] (duration: 00m 09s)
* 15:10 logmsgbot: anomie Synchronized php-1.24wmf16/includes/parser/Parser.php: SWAT: Fix URL protocol detection regex for file link= parameter [[gerrit:154844]] (duration: 00m 09s)
* 15:04 logmsgbot: anomie Synchronized php-1.24wmf17/includes/parser/Parser.php: SWAT: Fix URL protocol detection regex for file link= parameter [[gerrit:154845]] (duration: 00m 09s)
* 14:50 ottomata: starting stat1003 upgrade to trusty
* 14:37 logmsgbot: demon updated /a/common to {{Gerrit|I035cebe20}}: Configure swift-backed snapshots for Cirrus in beta
* 14:05 logmsgbot: demon Synchronized wmf-config/CirrusSearch-labs.php: beta swift config, no-op (duration: 00m 04s)
* 13:39 hashar_: Jenkins upgrading hhvm on the Trusty Jenkins slave integration-slave1006-trusty : Unpacking hhvm (3.3-dev+20140728+wmf4) over (3.3-dev+20140728+wmf3)
* 13:12 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: Fix $wgRestrictionLevels ordering {{bug|69640}} (duration: 00m 04s)
* 10:19 hashar: Jenkins: bringing back irc bot wmf-insecte in #wikimedia-qa . Will be used to notify failures/fixes of the beta cluster jenkins jobs
* 09:58 godog: depool mw1019 from appservers, testing trusty+hhvm reinstall RT #8153
* 07:39 bblack: strontium ok, icinga-wm back
* 07:17 hashar: Jenkins: manually cleared out a tmpfs partition on lanthanum.eqiad.wmnet which was causing all MediaWiki / extensions jobs to fail completely. {{bug|69731}}.  We need disk space monitoring which is {{bug|69733}}.
* 07:09 bblack: ... and strontium passenger is failing to start up correctly again.  icinga-wm disabled to avoid spam
* 07:07 bblack: restarted apache2 service on strontium/palladium, expect another small spike of puppet fail->ok
* 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 19 03:20:21 UTC 2014 (duration 20m 20s)
* 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-19 02:36:21+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-19 02:15:14+00:00
 
== August 18 ==
* 23:03 andrewbogott: isolated virt1006, re-enabling puppet on virt1000 and virt1006
* 22:36 andrewbogott: disabling puppet on virt1000 and virt1006 while I try to convince the scheduler to overlook virt1006
* 22:01 bblack: done futzing w/ puppetmasters+neon, all agents enabled and bot back online
* 21:28 hashar: Zuul processing again. Definitely need to write doc about how to unstick it
* 21:02 hashar: Zuul / Jenkins stalled again :-/
* 19:35 bblack: testing new passenger perf params on strontium/palladium.  agents on those two and icinga-wm still disabled
* 19:04 bblack: restarted service apache2 on strontium - passenger for puppet master was dead again
* 17:00 andrewbogott: added a (yuvi-built) python-txstatsd package to trusty on Carbon.
* 16:37 bd808: deployment-prep Restarted Apache and HHVM on deployment-mediawiki02 to pick up removal of /etc/php5/conf.d/mail.ini
* 16:26 logmsgbot: yurik Synchronized php-1.24wmf17/extensions: Syncing JsonConfig,ZeroPortal,ZeroBanner (duration: 01m 13s)
* 16:22 logmsgbot: yurik Synchronized php-1.24wmf16/extensions: Syncing JsonConfig,ZeroPortal,ZeroBanner (duration: 01m 22s)
* 16:18 legoktm: migrateAccount.php finished, 2014-08-18 15:42:12 processed 1528652 usernames (22.9/sec), 10 (0.0%) fully migrated, 7938 (0.5%) partially migrated
* 16:05 hashar: Jenkins tox based jobs are now runnable in parallel {{gerrit|154834}}
* 15:36 manybubbles: swat complete
* 15:29 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - enable cirrus optimization - weighted all fields - on group0 wikis (duration: 00m 07s)
* 15:29 logmsgbot: manybubbles Synchronized wmf-config/CirrusSearch-common.php: SWAT - drop unused Cirrus parameter (duration: 00m 05s)
* 15:25 logmsgbot: manybubbles Synchronized php-1.24wmf16/extensions/CentralAuth: SWAT - two centralauth fixes (duration: 00m 05s)
* 15:22 bblack: resuming slowly wiping varnish caches for mmap update (49 hosts to go), expect small 5xx spikes every ~1.5 hrs for the next few days
* 15:22 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - noop - sync files adding bouncehandler to betalabs (duration: 00m 04s)
* 15:19 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - create portal/portal talk namespaces on kowikisource (duration: 00m 04s)
* 15:18 logmsgbot: manybubbles Synchronized php-1.24wmf17/extensions/CentralAuth/: SWAT - two centralauth fixes (duration: 00m 04s)
* 15:13 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - create eliminator role on viwiki (duration: 00m 05s)
* 15:11 logmsgbot: manybubbles Synchronized php-1.24wmf17/extensions/Wikidata/: (no message) (duration: 00m 07s)
* 15:08 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - Add global-renamer group to metawiki (duration: 00m 04s)
* 14:50 hashar: Jenkins: reverting PHP CodeSniffer upgrade  {{gerrit|154825}}. We are back to 1.4.7. Previous patch had some issue.
* 14:42 hashar: Jenkins: upgrading PHP Codesniffer from 1.4.7 to 1.4.8 (thanks to addshore {{gerrit|154053}})
* 14:39 bd808: No apache2.log in fluorine:/a/mw-log; Last file in /a/mw-log/archive is apache2.log-20140816.gz
* 14:31 bd808: Restarted logstash on logstash1001; event volume was lower than expected
* 13:49 hashar: restarting zuul. Got stuck again.
* 13:29 hashar_: Restarted Zuul, some items were stuck in queue.  Retrigger your jobs (revote +2 / new patchset / 'recheck' comment)
* 13:23 logmsgbot: reedy Synchronized php-1.24wmf17/extensions/ExtensionDistributor: Unbreak ExtensionDistributor (duration: 00m 13s)
* 13:18 hashar: Zuul stuck, looking.
* 13:06 Reedy: Large amount of incoming traffic to bast1001 is me uploading files
* 12:11 godog: rebalanced swift object ring in eqiad
* 09:34 godog: reenabled puppet on neon and started ircecho
* 09:23 godog: stop ircecho again on neon, disable puppet on neon
* 09:11 godog: restarted apache2 on strontium
* 08:58 godog: stopped ircecho on neon while diagnosing puppet failure
* 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 18 03:12:27 UTC 2014 (duration 12m 26s)
* 03:06 hoo: Ran sync-common on mw1053 to stop "Unrecognized job type 'ChangeNotification'." exceptions
* 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-18 02:30:17+00:00
* 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-18 02:18:52+00:00
 
== August 17 ==
* 21:07 legoktm: running migrateAccount.php without --safe or --auto on terbium for bug 69291
* 18:45 hashar: Zuul upgraded
* 18:41 hashar: Upgrading Zuul to latest version (that is not a friday after all)
* 09:22 springle: ongoing schema change wikidatawiki & testwikidatawiki wb_entity_per_page.epp_redirect_target. osc_host.sh processes on terbium ok to kill in emergency
* 04:34 ottomata: restarted udp2log on oxygen
* 03:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 17 03:04:22 UTC 2014 (duration 4m 21s)
* 02:49 springle: killed stuff on labsdb1002 using all disk for temp tables. investigating
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-17 02:23:08+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-17 02:13:35+00:00
 
== August 16 ==
* 18:12 bblack: (amssq33: and yes, removing from fe/be cache pools)
* 18:11 bblack: powering off amssq33, it's clipping network traffic at peak times due to bad ethernet connection negotiated down to 100Mbps (see existing RT 7933 in esams queue)
* 18:02 bblack: ms-be1006: syslog indicates it started generating repeated "BUG: soft lockup" 10 minutes before dying, in XFS kernel code again...
* 17:55 bblack: rebooting ms-be1006, ping-dead in icinga for 23m, console was unresponsive
* 17:37 bblack: restarted apache2 on palladium... looks like something went horribly wrong with its puppet of itself that somehow killed off puppetmaster service?
* 03:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 16 03:06:29 UTC 2014 (duration 6m 28s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-16 02:26:02+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-16 02:16:00+00:00
 
== August 15 ==
* 20:59 logmsgbot: kaldari Synchronized php-1.24wmf16/extensions/MobileFrontend/less: fixing iOS search bug (duration: 00m 05s)
* 17:58 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Enable redirects on test.wikidata (duration: 00m 07s)
* 15:53 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Update test.wikidata (duration: 00m 07s)
* 15:50 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Fix database error and snak value display on test wikidata (duration: 00m 09s)
* 15:00 ori: re-enabled puppet on mw1017
* 13:33 ori: disabling puppet on mw1017 to test rsyslog config
* 03:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 15 03:50:23 UTC 2014 (duration 50m 22s)
* 03:04 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-15 03:03:49+00:00
* 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-15 02:33:21+00:00
* 00:24 logmsgbot: ori Finished scap: SWAT: cherry picks for TMH and Echo (duration: 14m 38s)
* 00:09 logmsgbot: ori Started scap: SWAT: cherry picks for TMH and Echo
 
== August 14 ==
* 23:24 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Bump cache epoch and add badges setting on test.wikidata (duration: 00m 32s)
* 23:13 logmsgbot: aude Finished scap: Update branch for test.wikidata (duration: 16m 48s)
* 22:57 logmsgbot: aude Started scap: Update branch for test.wikidata
* 22:26 logmsgbot: aaron Synchronized php-1.24wmf16/includes/DefaultSettings.php: 67bf481ce1644ff194d7565107d9b8ffe11bf4b7 (duration: 00m 07s)
* 22:23 logmsgbot: aaron Synchronized wmf-config/CommonSettings.php: Increased wgParsoidCacheUpdateTitlesPerJob to 12 to lower the backlog (duration: 00m 07s)
* 22:14 logmsgbot: aude scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="test2wiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.kFlVQdKnM2" ' returned non-zero exit status 255 (duration: 00m 40s)
* 22:13 logmsgbot: aude Started scap: Update branch for test.wikidata
* 21:49 logmsgbot: reedy Synchronized php-1.24wmf17/includes/context/RequestContext.php: (no message) (duration: 00m 15s)
* 21:10 godog: restarted hhvm on mw1053
* 20:47 _joe|away: stopping puppet, jobrunner on mw1053; HHVM is eating memory like godzilla
* 19:29 bblack: puppeting labmon1001, etc
* 18:57 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:55 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 14s)
* 18:26 mutante: stopped ircecho on neon temporarily
* 18:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf17
* 18:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf16
* 17:45 AaronSchulz: /srv/deployment/jobrunner updated to 795baf3ca4ce8308597dd74e5242aa5bfbbe961d
* 17:39 logmsgbot: aaron Synchronized rpc: 6c0ece687bb6ff3fec0ca7e80a587525ebf18a70 (duration: 00m 08s)
* 16:52 _joe_: uploaded new hhvm package 3.3-dev+20140728+wmf4
* 16:23 logmsgbot: reedy Synchronized php-1.24wmf17/extensions/CentralAuth/: (no message) (duration: 00m 13s)
* 16:23 logmsgbot: reedy Synchronized php-1.24wmf16/extensions/CentralAuth/: (no message) (duration: 00m 14s)
* 15:49 Reedy: Running sync-common on mw1053
* 15:48 logmsgbot: reedy Finished scap: testwiki to 1.24wmf17 (duration: 33m 13s)
* 15:47 Jeff_Green: adjust wiki-mail._domainkey DNS record to allow sending from 'wiki*@' addresses, instead of just wiki@
* 15:23 _joe_: powercycling mw1053, which looks like the victim of hhvm-induced ooms
* 15:15 logmsgbot: reedy Started scap: testwiki to 1.24wmf17
* 14:01 _joe_: puppet re-enabled on the appserver
* 12:38 _joe_: stopping puppet on appservers while deploying a delicate change.
* 12:12 manybubbles|away: cirrus index rebuilds are still proceeding without issue.  Going to continue to let them run and keep half an eye on them.  enwiki is nearly done.  Commons and wikidata are done.  Many of group1 are done - we're up to eswiktionary now - but there are many to go.
* 09:30 _joe_: the hhvm jobrunner is back in production, seems healthy, see https://logstash.wikimedia.org/#/dashboard/elasticsearch/hhvm_jobrunner
* 08:09 _joe_: reactivated the jobrunner on mw1053, with promising results. Puppetization pending (in ~ 1 hour)
* 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Aug 14 03:11:33 UTC 2014 (duration 11m 32s)
* 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-14 02:29:52+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-14 02:16:34+00:00
 
== August 13 ==
* 21:58 manybubbles: cirrus index rebuild is proceeding without trouble - I'm going to let it continue over night.
* 21:46 andrewbogott: re-enabled puppetmaster on virt1000; apache changes seem stable now.
* 21:18 _joe_: stopped puppet on virt1000, our fail
* 13:23 springle: killed a mass of SpecialWhatLinksHere queries on enwiki
* 12:51 manybubbles: restarting rebuilding Cirrus indexes to pick up weighted all field
* 10:35 godog: bump swift weights for ms-be1013 ms-be1014 ms-be1015 to 2500
* 08:38 hashar: gallium removing some sun-java6* packages coming from old lucid era
* 07:47 hashar: upgrading Java on contint servers gallium and lanthanum , restarting Jenkins related process
* 04:03 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Aug 13 04:02:23 UTC 2014 (duration 2m 22s)
* 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-13 03:11:38+00:00
* 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-13 02:40:36+00:00
 
== August 12 ==
* 23:34 logmsgbot: hoo Synchronized tests/multiversion/MWMultiVersionTest.php: (no message) (duration: 00m 11s)
* 23:32 logmsgbot: hoo Synchronized php-1.24wmf16/skins/Vector/skinStyles/mediawiki.special.preferences.less: Fix missing tab images on Special:Preferences (duration: 00m 10s)
* 23:26 hoo: Had to abort scap on mw1053 (which is depooled) manually
* 23:26 logmsgbot: hoo Finished scap: Update WikimediaMessages (superprotect messages for wmf16) (duration: 46m 16s)
* 22:40 logmsgbot: hoo Started scap: Update WikimediaMessages (superprotect messages for wmf16)
* 22:21 logmsgbot: hoo Synchronized php-1.24wmf16/extensions/ProofreadPage/: Fix JS error while editing (duration: 00m 10s)
* 19:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf16
* 19:06 logmsgbot: reedy Synchronized php-1.24wmf16/includes/specials/SpecialRecentchangeslinked.php: Fix FR bug (duration: 00m 14s)
* 17:55 AaronSchulz: populateBacklinkNamespace.php finished on all wikis
* 17:13 springle: restart mysqld on labsdb1002, upgrade to mariadb 10.0.13 for bugfix
* 16:57 Jeff_Green: removed aluminium.wikimedia.org from production
* 16:50 springle: restart mysqld on labsdb1001, upgrade to mariadb 10.0.13 for bugfix
* 15:08 bblack: flipping ulsfo traffic back to ulsfo
* 11:51 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: Set siteGroup for testwikidata (duration: 00m 11s)
* 11:21 hashar: Jenkins: clearing up some obsolete symbolic links under gallium.wikimedia.org:/var/lib/jenkins/jobs/*/builds/  Running in a screen as user jenkins
* 05:01 springle: rsync ~1TB labsdb1001 to labsdb1003, throttled ~25MB/s
* 04:25 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: s2: repool db1009. s3: repool db1035. (duration: 00m 06s)
* 03:45 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: s2: depool db1009. repool db1018. adjust db1036 load. (duration: 00m 07s)
* 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 12 03:14:34 UTC 2014 (duration 14m 33s)
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-12 02:32:09+00:00
* 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-12 02:18:36+00:00
 
== August 11 ==
* 22:33 awight: update CRM schema to wmf_civicrm:7021
* 21:47 andrewbogott: removed the old puppet-freshness check which should have no effect but may instead produce a torrent of alert spam  https://gerrit.wikimedia.org/r/#/c/142560/
* 04:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 11 03:59:17 UTC 2014 (duration 59m 16s)
* 03:05 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-11 03:04:15+00:00
* 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-11 02:33:02+00:00
 
== August 10 ==
* 23:47 logmsgbot: ori Synchronized php-1.24wmf16/extensions/MassMessage/includes: Revert MassMessage to 9884fbb50a (duration: 00m 06s)
* 23:36 logmsgbot: ori Synchronized php-1.24wmf16/extensions/MassMessage: Update MassMessage for I840c98dca: Fix MassMessage::getMessengerUser() after Password API changes (duration: 00m 06s)
* 22:59 logmsgbot: csteipp Finished scap: Deploy Ibe28a69c9fbab00b81c53b1643df722a3f1fbf19 at Eriks request (duration: 26m 13s)
* 22:33 logmsgbot: csteipp Started scap: Deploy Ibe28a69c9fbab00b81c53b1643df722a3f1fbf19 at Eriks request
* 16:25 logmsgbot: reedy Finished scap: Rebuild l10n cache for WikimediaMessages (duration: 22m 12s)
* 16:02 logmsgbot: reedy Started scap: Rebuild l10n cache for WikimediaMessages
* 15:01 logmsgbot: hoo Synchronized php-1.24wmf16/extensions/CentralAuth/: (no message) (duration: 00m 25s)
* 15:00 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/CentralAuth/: (no message) (duration: 00m 25s)
* 13:53 Reedy: Grant staff "superprotect" right per Robla/Erik request
* 13:02 logmsgbot: tstarling Synchronized wmf-config/InitialiseSettings.php: Idfa21125 (duration: 00m 05s)
* 13:02 logmsgbot: tstarling Synchronized wmf-config/CommonSettings.php: Idfa21125 (duration: 00m 06s)
* 12:08 mutante: re-enabling puppet and services on tarin
* 11:57 mutante: tarin - stopping poolcounterd, gmond,.. (Tampa, should really not be in use)
* 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 10 03:14:52 UTC 2014 (duration 14m 51s)
* 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-10 02:33:15+00:00
* 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-10 02:19:53+00:00
 
== August 9 ==
* 15:22 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I313b09ffc: Don't require native CDB support to load {interwiki,trustedxff}.cdb (duration: 00m 05s)
* 14:25 Reedy: Removed <= MediaWiki 1.24wmf5
* 13:29 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: move s4 api traffic back to db1042 (duration: 00m 06s)
* 11:32 mutante: added Ryan Lane to NDA LDAP group
* 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug  9 03:20:07 UTC 2014 (duration 20m 6s)
* 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-09 02:36:52+00:00
* 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-09 02:19:03+00:00
 
== August 8 ==
* 21:24 Reedy: mw1130 seems to be dead (unresponsive to ping)
* 21:21 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: (no message) (duration: 01m 04s)
* 21:06 awight: deployed crm default/settings.php
* 17:07 mutante: jenkins/puppet-compiler - granting new LDAP group "nda" the same rights already given to matanya (and wmde even has more)
* 16:06 bblack: datacenter traffic mapping back to normal, varnish fix/wipe/restart/etc work on pause for the weekend in a stable state
* 16:02 andrewbogott: merging https://gerrit.wikimedia.org/r/#/c/150273/ which affects every puppet log everywhere...
* 14:22 mutante: RT - reverted permission change for access requests requestors per robh
* 13:50 mutante: RT - granted permission to show ticket summary for role requestor in queue access-requests
* 12:49 akosiaris: uploaded ruby-jsduck 5.3.4-1wmftrusty1 and ruby-rkelly-remix 0.0.6-1trusty1 on apt.wikimedia.org
* 12:33 ori: testwiki up, judgement poor
* 12:28 hashar: Jenkins: somehow the ArtifactDeployer plugin got upgraded on Aug 7th 20:57 UTC despite it being broken {{bug|69197}}.  Attempting manual downgrade
* 12:13 hashar: reloading Jenkins
* 12:07 akosiaris: ifconfig br0 0.0.0.0 on platinum to get rid of the IP on that interface and have facter work more reliably. This does not matter right now as it is an evaluation machine but logging it for completeness
* 12:03 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
* 11:32 _joe_: rebooting mw1017
* 11:29 akosiaris: mw1130 has broken disk
* 11:09 ori: running rsync-common on mw1017
* 11:02 logmsgbot: hoo Synchronized php-1.24wmf16/extensions/CentralAuth/: Another shot towards bug 39996 (duration: 01m 04s)
* 11:01 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/CentralAuth/: Another shot towards bug 39996 (duration: 01m 04s)
* 09:29 _joe_: reimaging mw1017 aka testwiki.
* 06:03 springle: ongoing schema changes: rev_content_model, rev_content_format. on terbium, osc_host.sh processes ok to kill in emergency
* 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug  8 03:12:21 UTC 2014 (duration 12m 20s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-08 02:28:39+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-08 02:16:13+00:00
 
== August 7 ==
* 19:19 jgage: rebooting analytics1021 for kernel upgrade
* 18:55 bblack: starting the process of fixing upload cache sizes, there will be periodic slim 5xx spikes...
* 16:31 Jeff_Green: temporarily disabling icinga notifications for ocg100[123] ocg service check
* 16:09 logmsgbot: krinkle Synchronized php-1.24wmf16/extensions/GlobalCssJs/GlobalCssJs.hooks.php: 4bbf4e0ed92f9a09 (duration: 00m 05s)
* 15:48 mutante: zirconium - attempt to fix apache site setup manually
* 15:46 logmsgbot: reedy Synchronized wmf-config/extension-list-labs: (no message) (duration: 00m 13s)
* 15:38 logmsgbot: reedy Synchronized php-1.24wmf16/maintenance/findMissingFiles.php: (no message) (duration: 00m 20s)
* 15:37 logmsgbot: reedy Synchronized php-1.24wmf15/maintenance/findMissingFiles.php: (no message) (duration: 00m 17s)
* 15:12 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
* 14:43 akosiaris: uploaded varnish_3.0.5plus~x-wm7trusty1 on apt.wikimedia.org (for usage in trusty labs machines, notably cxserver)
* 14:23 mutante: shutting down elastic1018
* 14:12 ^d: elastic1018: blacklisted from shard allocation since it's dead
* 14:05 mutante: depooled elastic1018 - service wasn't running and signs of broken hardware (SSD)
* 13:57 mark: Temporarily set max connections to swift from cp1049 backend varnish from 1000 to 2000
* 13:56 mutante: starting elasticsearch on elastic1018
* 12:23 hashar: Zuul upgraded labs branch to match production (i.e. have same version of Zuul cloner)
* 12:20 hashar: restarting Zuul
* 11:25 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: I53f76a35ac - No longer allow voyage 'crats to usermerge (duration: 00m 15s)
* 11:13 akosiaris: removed laner@wikimedia.org entirely. It pointed to rlane@wikimedia.org which no longer exists
* 11:11 akosiaris: removed rlane from root@wikimedia.org and usability@wikimedia.org
* 10:45 mutante: iron, bast1001 - installed package upgrades
* 09:13 hashar: Jenkins: polling a new Jenkins slave using Trusty integration-slave1006-trusty [10.68.17.223] with 4 CPU. Copy pasted from 1004-trusty
* 08:32 hashar: Jenkins: switching [https://integration.wikimedia.org/ci/job/analytics-libcidr/ analytics-libcidr job] from https://github.com/wmf-analytics/libcidr/ to https://gerrit.wikimedia.org/r/analytics/libcidr
* 07:44 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: move s4 api traffic to db1056 (duration: 00m 07s)
* 07:39 mark: Set OSPF metric 1000 on cr2-eqiad:xe-5/2/2 (GTT link)
* 05:39 springle: labsdb1002 restart
* 03:48 springle: labsdb1001 restart
* 03:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Aug  7 03:08:49 UTC 2014 (duration 8m 48s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-07 02:27:52+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-07 02:15:45+00:00
 
== August 6 ==
* 21:33 hashar: Jenkins: moved mediawiki-core-regression-hhvm-master to run on Trusty instance
* 20:26 hashar: Jenkins: downgraded ansicolor plugin from 0.4 to 0.3.1  Some colors.js function emits ANSI codes to reset the color which are not properly understood
* 20:06 hashar: I have broken Zuul/Jenkins :-]
* 18:53 hashar: Jenkins slow startup is {{bug|69197}}
* 18:50 hashar: restarting jenkins
* 18:49 hashar: Stopping Jenkins.  Reverting upgrade of artifact deployer plugin
* 18:10 mutante: puppet-catalog-compiler says to "wait while Jenkins is getting ready to work"
* 17:20 hashar: Jenkins processes jobs again, the UI will take a bunch of hours to load though due to some issue when initializing
* 17:14 hashar: killed Jenkins
* 17:12 _joe_: stopped the jobrunner on mw1053, was running in fcgi mode unpuppetized and with a broken vhost. Fixed it, it started spawning exceptions. DO NOT enable puppet again
* 17:02 ^d: jenkins restarted, was stuck
* 15:52 hashar: Restarted Zuul and Zuul-merger on gallium to tweak logging settings {{gerrit|152118}}
* 11:30 logmsgbot: hoo Synchronized wmf-config/CommonSettings.php: Grant 'centralauth-rename' to 'steward' (duration: 00m 24s)
* 11:26 logmsgbot: demon Synchronized wmf-config/abusefilter.php: (no message) (duration: 00m 19s)
* 10:10 hashar: Jenkins web interface is back up
* 09:54 logmsgbot: demon Synchronized wmf-config/abusefilter.php: abuse filter settings for fawiki (duration: 00m 21s)
* 07:33 hashar: restarting Jenkins. It apparently likes to parse the whole history on reload, so aborting that.
* 07:13 hashar: Upgrading Jenkins plugin and restarting.
* 07:04 hashar: upgrading Jenkins to latest LTS
* 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Aug  6 03:10:06 UTC 2014 (duration 10m 5s)
* 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-06 02:29:00+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-06 02:16:38+00:00
 
== August 5 ==
* 15:07 logmsgbot: root gracefulled all apaches
* 15:03 logmsgbot: root gracefulled all apaches
* 12:30 hasharEat: Upgrading python-gear on gallium and restarting zuul and zuul-merger
* 12:26 akosiaris: uploaded python-gear_0.5.5-1 on apt.wikimedia.org
* 03:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug  5 03:08:30 UTC 2014 (duration 8m 29s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-05 02:27:58+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-05 02:16:52+00:00
* 00:41 springle: ongoing schema changes: ar_content_model, ar_content_format. on terbium, osc_host.sh processes ok to kill in emergency
 
== August 4 ==
* 22:58 bblack: rebooting ms-be1012
* 21:47 ottomata: reenabling puppet on analytics1027
* 21:46 jgage: all kafka brokers upgraded to 0.8.1.1 and data replicated: done
* 20:37 ottomata: stopping puppet on analytics1027 to temporarily disable camus cron job
* 19:07 ottomata: starting upgrade of kafka cluster
* 19:02 logmsgbot: maxsem Synchronized php-1.24wmf16/includes/User.php: https://gerrit.wikimedia.org/r/#/c/151691/ (duration: 00m 06s)
* 18:57 jgage: beginning kafka upgrade: disabling puppet on brokers
* 13:17 apergos: stopped labs rsync job from dataset1001, mount of labstore1003 was borked, removed 90GB of stuff on /mnt/data (= /) filesystem, restarted nfsd on dataset1001, dumps back to going
* 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug  4 03:11:03 UTC 2014 (duration 11m 2s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-04 02:27:58+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-04 02:16:46+00:00
 
== August 3 ==
* 03:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug  3 03:28:56 UTC 2014 (duration 28m 55s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-03 02:27:44+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-03 02:16:39+00:00
 
== August 2 ==
* 15:28 godog: reboot ms-be1008, stuck on xfs errors and most processes in D state
* 14:10 Krinkle: Restarting Zuul
* 14:08 hashar: Jenkins / Zuul stuck {{bug|69045}}
* 14:00 Krinkle: Restarting Jenkins in attempt to unstuck the clogged Zuul pipeline for gallium
* 04:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug  2 04:20:45 UTC 2014 (duration 20m 44s)
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-02 02:32:36+00:00
* 02:21 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-02 02:20:02+00:00
* 01:49 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1018, replag (duration: 00m 06s)
* 00:43 Krinkle: Restarting Jenkins on gallium because the pipeline is clogged
 
== August 1 ==
* 20:25 andrewbogott: shortened the logrotate interval on vanadium; disk space critical should resolve soon
* 18:10 logmsgbot: csteipp Synchronized php-1.24wmf16/extensions/CentralAuth: Fix for bug 69007 - logins failing for old style hashes (duration: 00m 06s)
* 17:32 AaronSchulz: Restarted maintenance/populateBacklinkNamespace.php on enwiki
* 17:31 logmsgbot: aaron Synchronized php-1.24wmf15/maintenance/populateBacklinkNamespace.php: e1cea29342f964cd9a720310185b09ca41eb1a4a (duration: 00m 04s)
* 17:16 akosiaris: upgraded etherpad-lite on zirconium to 1.4.0-2. Uploaded etherpad-lite_1.4.0-2 on apt.wikimedia.org
* 17:11 logmsgbot: aaron Synchronized php-1.24wmf15/includes: d218d86dff90a5f0110353c492bd2e8ddaf35497 (duration: 00m 08s)
* 17:09 logmsgbot: aaron Synchronized php-1.24wmf16/includes: f1a8ff7f802b57cc9f452d47c4c762a185ed93c2 (duration: 00m 06s)
* 15:48 logmsgbot: reedy Synchronized php-1.24wmf16/includes/specials/SpecialRecentchangeslinked.php: (no message) (duration: 00m 14s)
* 12:07 apergos: powercycled dataset1001, inaccessible via mgmt console, only visible message was 'mnt.nfs failed'
* 09:10 _joe_: apache mediawiki::web train finished its run. re-enabling puppet on all appservers
* 07:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug  1 07:46:04 UTC 2014 (duration 46m 3s)
* 07:24 _joe_: stopping puppet on appservers to deploy a potentially dangerous case
* 05:16 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: Move enwiki api traffic away from lagging slaves (duration: 00m 07s)
* 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-01 03:11:14+00:00
* 02:40 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-01 02:38:56+00:00
* 00:52 logmsgbot: catrope Synchronized php-1.24wmf16/extensions/VisualEditor/lib/ve/modules/ve/ui/inspectors/ve.ui.CommentInspector.js: Fix typo in class name (duration: 00m 10s)
 
== July 31 ==
* 23:23 logmsgbot: mwalker Synchronized php-1.24wmf16: Updating core and Flow for SWAT (duration: 00m 53s)
* 23:05 logmsgbot: mwalker Synchronized wmf-config: Updating configuration for {{gerrit|150145}} (duration: 00m 05s)
* 21:17 RobH: blog.wikimedia.org cname changed to migrate over to wp servers
* 20:22 AaronSchulz: Started populateBacklinkNamespace.php on s1-s3,s5-s7 (commons already running)
* 20:13 cscott: updated OCG to version d2919c59eb09e09fc87777696411a070620aef45
* 19:40 hashar: Jenkins built its first hhvm extension \O/ https://integration.wikimedia.org/ci/job/php-FastStringSearch-hhvm-build/2/console
* 19:24 Coren_away: labsdb1005 had to blow away the postgres slave: was using all the space on / because DB at wrong spot (should have been /srv/postgres)
* 18:40 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 18:27 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 18:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf16
* 18:02 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf15
* 17:47 logmsgbot: aaron Synchronized wmf-config/CommonSettings.php: Increased "htmlCacheUpdate" throttle limit (duration: 00m 07s)
* 17:46 logmsgbot: reedy Finished scap: testwiki to 1.24wmf16 and build l10n cache (duration: 22m 35s)
* 17:23 logmsgbot: reedy Started scap: testwiki to 1.24wmf16 and build l10n cache
* 14:57 bblack: added labstore1003 to filter labs-in4 terms allow-labstore-(udp|tcp)4 on cr[12]-eqiad
* 14:33 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Allow sysops and 'crats on wikimania2014wiki to grant confirmed (duration: 00m 15s)
* 14:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages to 1.24wmf15
* 14:12 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
* 14:10 logmsgbot: reedy Finished scap: Rebuild 1.24wmf15 l10n cache for WikimediaMessages updates (duration: 22m 40s)
* 14:05 bblack: removed labs-in4 and labs-in6 filters on vlan 1117 (labs-hosts1-a-eqiad) on cr[12]-eqiad
* 13:47 logmsgbot: reedy Started scap: Rebuild 1.24wmf15 l10n cache for WikimediaMessages updates
* 13:44 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/RelatedSites/: (no message) (duration: 00m 15s)
* 13:44 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/WikimediaMessages: (no message) (duration: 00m 14s)
* 12:10 hashar: stopping Jenkins and restarting it
* 12:04 hashar: reloading Jenkins configuration
* 11:37 hashar: Jenkins: upgrading almost all jobs to use a new label 'UbuntuPrecise' {{bug|68340}} {{gerrit|150785}}
* 10:49 hashar: Jenkins: attempting to poll a Trusty slave (integration-slave1004-trusty [10.68.17.148] with label <tt>UbuntuTrusty</tt>).
* 10:32 hashar: Jenkins: tweaking jobs labels, that might eventually screw up Zuul/Jenkins entirely.
* 08:43 _joe_: start rolling reload of nginx to catch up with the new ssl config
* 06:50 springle: labsdb1001 migration complete, should be all systems go
* 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 31 03:18:07 UTC 2014 (duration 18m 6s)
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-31 02:35:29+00:00
* 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-31 02:19:17+00:00
* 02:06 springle: labsdb1001 migrating to mariadb 10, expect read-only and downtime, see labs-l
 
== July 30 ==
* 23:27 logmsgbot: maxsem Synchronized php-1.24wmf15/extensions/MwEmbedSupport/: (no message) (duration: 00m 03s)
* 23:27 logmsgbot: maxsem Synchronized php-1.24wmf15/extensions/Wikidata/: (no message) (duration: 00m 08s)
* 23:26 logmsgbot: maxsem Synchronized php-1.24wmf15/extensions/SyntaxHighlight_GeSHi/: (no message) (duration: 00m 05s)
* 23:23 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/Wikidata: (no message) (duration: 00m 11s)
* 23:13 logmsgbot: maxsem Synchronized wmf-config: (no message) (duration: 00m 05s)
* 21:04 AaronSchulz: Started populateBacklinkNamespace.php on wikidata and commons
* 21:02 bblack: turned icinga email/sms back on
* 20:24 bblack: icinga back online again
* 19:57 bblack: shutting off icinga to make some optimizations
* 19:20 bblack: icinga is now substantially back online.  email/sms still disabled for now, and downtimes/acks need to be re-added for known issues
* 19:06 logmsgbot: csteipp Synchronized php-1.24wmf14/includes/: (no message) (duration: 00m 05s)
* 19:04 logmsgbot: csteipp Synchronized php-1.24wmf15/includes/: (no message) (duration: 00m 07s)
* 18:59 bblack: icinga coming back up again for the first time, expect random strangeness to be ignored
* 18:46 bblack: temporarily hard-disabling email/sms from icinga via 'mv /usr/bin/mail /usr/bin/mail-disabled' on neon to prevent icinga spam on next startup attempt
* 17:55 bblack: stopping icinga service for now while working out other details
* 17:25 tacotuesday: repooled elastic1018 and elastic1019 as well
* 17:21 Coren: labmon1001 rebooting (final check for proper raid+lvm autodetection)
* 17:08 bblack: working on bringing up new neon install (first puppet run, etc)
* 17:01 Coren: labmon1001 rebooting (partitioning changes on primary disks)
* 16:53 tacotuesday: elastic1017 repooled, shards allocating
* 16:13 bd808: scap and dologmsg from tin won't work until neon is back up and running tcpircbot
* 16:07 bd808|deploy: Synchronized touch: no-op sync to test scap update (duration: 00m 05s)
* 16:06 bd808|deploy: scap announce failed -- timeout connecting to tcpircbot on neon.wikimedia.org
* 16:04 bd808|deploy: Updated scap to 4871208 (rely on $PATH for scap scripts)
* 15:21 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/Wikidata/extensions/Wikibase/lib/resources/wikibase.js: touch (duration: 00m 20s)
* 15:17 hashar: upgrading php5 on jenkins slaves
* 15:07 cmjohnson1: shutting down neon
* 14:46 logmsgbot: demon Synchronized wmf-config/CirrusSearch-production.php: (no message) (duration: 00m 04s)
* 14:35 logmsgbot: demon Synchronized wmf-config/PrivateSettings.php: Swift config for Cirrus (duration: 00m 08s)
* 14:30 godog: rolling restart of ms-fe* to pick up search backup user
* 14:17 bblack: rebooting neon again, trying to fix the disk situation
* 14:11 Coren: reinstalling labmon1001 -> change disk partitioning scheme
* 13:50 springle: neon read-only fs. fsck + reboot
* 13:16 manybubbles: rebuilding Cirrus index for commons to pick up weighted all field
* 11:17 _joe_: enabling puppet on all mw* servers
* 11:15 _joe_: re-enabling puppet on mw1019, last bunch of tests, then re-enabling globally
* 10:58 _joe_: re-enabling puppet on mw1018, testwiki upgraded to the new config and looks fine
* 09:25 godog: set weight for ms-be1014 and ms-be1015 to 2300
* 08:58 _joe_: stopping puppet on the appservers, in preparation for releasing change 148099
* 08:30 _joe_: powercycling neon, doesn't respond to requests, ssh hangs, console dark
* 06:41 springle: labsdb1001 work in progress; it may misbehave. see labs-l for updates
* 04:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 30 04:27:56 UTC 2014 (duration 27m 55s)
* 03:39 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-30 03:38:28+00:00
* 02:51 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-30 02:50:14+00:00
* 01:47 bblack: ip addr del for cp4017's ip6_mapped addr on cp4018 (no idea why it was there...)
 
== July 29 ==
* 23:37 logmsgbot: catrope Finished scap: SWAT updates for wmf15, I'm lazy (duration: 07m 02s)
* 23:30 AaronSchulz: Updated /srv/jobrunner to d2298139ea22bf8e48de066a73f28024b140ea33
* 23:30 logmsgbot: catrope Started scap: SWAT updates for wmf15, I'm lazy
* 23:28 logmsgbot: catrope Synchronized php-1.24wmf14/extensions/VisualEditor: (no message) (duration: 00m 05s)
* 23:28 logmsgbot: catrope Synchronized php-1.24wmf14/extensions/MobileFrontend: (no message) (duration: 00m 05s)
* 23:18 logmsgbot: catrope Synchronized wmf-config/: Do not put OCG in sidebar (duration: 00m 04s)
* 23:11 logmsgbot: catrope Synchronized wmf-config/: Enable TemplateData GUI on nlwiki (duration: 00m 05s)
* 23:10 bblack: took OCG service IP out of downtime in icinga, it's live
* 23:06 logmsgbot: mwalker Synchronized wmf-config: Enabling OCG in production (duration: 00m 04s)
* 23:05 logmsgbot: aaron Synchronized rpc: 0df032d957155aa475d99e2b887ba98b9a4c32fd (duration: 00m 07s)
* 23:04 logmsgbot: cscott Synchronized wmf-config: (no message) (duration: 00m 12s)
* 23:03 logmsgbot: cscott updated /a/common to {{Gerrit|Iae1ac79d5}}: Enable OCG in production
* 22:55 cscott: updated OCG to version aeb8623d6ebe41ae7c7e36c57844bd9ea8e6d595
* 22:50 RoanKattouw: Fixed ownership of slot0/cache on wikitech (virt1000), was root:root but should have been www-data:www-data
* 22:24 RoanKattouw: Updated lib/ve submodule inside extensions/VisualEditor on virt1000; wikitechwiki was running a Frankenstein version of VE that was part yesterday's code, part code from April
* 21:47 logmsgbot: ori Synchronized rpc/RunJobs.php: Ia62e9158f: Added a streamlined RunJobs that can be used by redisJobService (2/2) (duration: 00m 03s)
* 21:47 logmsgbot: ori Synchronized multiversion: Ia62e9158f: Added a streamlined RunJobs that can be used by redisJobService (1/2) (duration: 00m 03s)
* 21:44 Reedy: cleared bottuzzu@itwiki watchlist
* 21:32 spagewmf: spage ran `mwscript namespaceDupes.php --wiki=enwiki --prefix Topic`, 5 pages renamed
* 21:22 logmsgbot: spage Synchronized wmf-config/InitialiseSettings.php: Enable Flow on Wikimania testing page (duration: 00m 13s)
* 21:22 logmsgbot: ori updated /a/common to {{Gerrit|Ia62e9158f}}: Added a streamlined RunJobs that can be used by redisJobService
* 21:18 logmsgbot: spage updated /a/common to {{Gerrit|I3b4622e27}}: Wikivoyages back to 1.24wmf14
* 20:54 logmsgbot: aaron Synchronized php-1.24wmf14/includes/media: b45248509c07acb8146d6e735ef68dff193ac290 (duration: 00m 07s)
* 19:46 Krinkle: Reloading Zuul to deploy I7f80ee0b85d29791b7
* 19:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages back to 1.24wmf14
* 19:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages back to 1.24wmf15...
* 19:14 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
* 19:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages back to 1.24wmf14
* 18:43 cmjohnson1: power cycling virt1009
* 18:29 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf15
* 18:28 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/Wikidata/extensions/Wikibase/lib/config/WikibaseLib.default.php: touch (duration: 00m 16s)
* 18:26 bblack: removed "filter { input labs6-in; }" from ae3.1119 (labs-support1-c-eqiad) on cr[12]-eqiad
* 17:52 logmsgbot: aaron Synchronized php-1.24wmf15/includes/media: 76459cebd9cfbb33e9845f7acd8b8c1382cdae61 (duration: 00m 08s)
* 16:56 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Bump $wgCacheEpoch for testwikidata (duration: 00m 08s)
* 16:52 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/Wikidata/: Touch JS (duration: 00m 10s)
* 16:52 logmsgbot: hoo Synchronized php-1.24wmf14/extensions/Wikidata/: Touch JS (duration: 00m 11s)
* 16:50 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Only declare "special" sitegroups for testwikidata (duration: 00m 07s)
* 16:48 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Only declare "special" sitegroups for testwikidata (duration: 00m 08s)
* 16:47 logmsgbot: hoo Finished scap: Updating Wikidata with various changes for testwikidata and a client bug fix. (duration: 27m 27s)
* 16:37 cmjohnson1: replacing defective disk virt1009
* 16:20 logmsgbot: hoo Started scap: Updating Wikidata with various changes for testwikidata and a client bug fix.
* 16:10 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Make testwikidata use the "special" sitelink group. Preparations for submodule updates. (duration: 00m 08s)
* 16:10 bd808: logstash log event volume up after restart
* 16:09 bd808: restarted logstash on logstash1001.eqiad.wmnet; log volume looked to be down from expected levels
* 16:08 _joe_: reenabled puppet on mw1053
* 16:03 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase other projects links per default for ruwiki (duration: 00m 07s)
* 15:13 manybubbles: building cirrus indexes for group0 wikis in place to turn on the weighted all field we'll use for performance improvements later
* 15:06 logmsgbot: manybubbles Synchronized wmf-config: SWAT - deploy cirrussearch all field stage 2 part 2 (duration: 00m 04s)
* 15:06 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - deploy cirrussearch all field stage 2 part 1 (duration: 00m 04s)
* 13:54 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: added Universiteits Museum Utrecht to the wgCopyUploadsDomains array {{gerrit|150163}} (duration: 00m 04s)
* 13:38 ottomata: restarted gmetad on nickel, seems to have brought ganglia back up
* 11:30 _joe_: upgrading packages on mw1053, for testing hhvm with pcre-jit enabled
* 10:35 _joe_: puppet re-enabled on the appservers
* 10:29 _joe_: temporarily stopping puppet on appservers, releasing a potentially dangerous puppet change
* 09:10 _joe_: stopping jobrunner on mw1053, disabling puppet as well - running tests
* 09:02 hashar: restarted zuul-server and zuul-merger on gallium (new version though that is a noop)
* 09:00 hashar_: Zuul bumping Zuul cloner from patchset 21 to patchset 23. Deploying with tag wmf-deploy-2014-07-29-1
* 07:51 akosiaris: uploaded PHP 5.3.10-1ubuntu3.13+wmf1 on apt.wikimedia.org. Puppet will upgrade it across the fleet within 20 mins
* 03:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 29 03:47:39 UTC 2014 (duration 47m 38s)
* 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-29 03:10:31+00:00
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-29 02:35:18+00:00
* 00:44 logmsgbot: aaron Synchronized php-1.24wmf15/maintenance/runJobs.php: fcfa3153e53dc70e6cd190a087e7bd577fe380fb (duration: 00m 03s)
* 00:27 logmsgbot: aaron Synchronized php-1.24wmf15/maintenance: f754c239ce93fc5f2db19e93f4fe8a1d1ba7bc27 (duration: 00m 04s)
* 00:27 logmsgbot: aaron Synchronized php-1.24wmf15/includes: f754c239ce93fc5f2db19e93f4fe8a1d1ba7bc27 (duration: 00m 06s)
 
== July 28 ==
* 23:58 logmsgbot: ori Finished scap: I42c07b64: Update MobileFrontend (duration: 17m 37s)
* 23:41 logmsgbot: ori Started scap: I42c07b64: Update MobileFrontend
* 23:33 logmsgbot: ori Synchronized php-1.24wmf15/extensions/VisualEditor: Update VisualEditor to I944f8fbfa (duration: 00m 04s)
* 23:25 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I369dbad6e: Allow crats to add/remove petitiondata group on foundationWiki (duration: 00m 04s)
* 23:21 AaronS: Updated /srv/jobrunner to 0bb0ad62dd9240e0f67b2ded4519f125de13dfbc
* 23:12 mutante: temp. disabled puppet on neon and ircecho
* 23:06 mutante: graceful apache on palladium
* 21:12 hashar: Gerrit: allowed JenkinsBot to submit patches on wikimedia/bots (and thus on all child repositories)
* 20:50 hashar: operations/puppet.git manifests should no longer have leading tabs {{gerrit|I69ddc72f5a072ac7dc4f67622b65f36a70d3c021}}
* 20:08 bblack: intermittent 5xx are most likely varnish restarts off and on rest of today
* 19:51 hashar: Zuul: stopped / started process to clear up obsolete changes stuck in queue
* 19:47 hashar: Jenkins/Zuul lost connection somehow. Disabled/Reenabled gearman client in Jenkins
* 19:44 hashar: Jenkins: updated qunit jobs to roam on both gallium and lanthanum (were previously tied to run only on gallium)
* 19:42 ottomata: restarted varnishkafka on some esams hosts that have old misconfigured vk processes
* 19:13 ottomata: restarting varnishkafka on amssq31
* 19:08 ottomata: restarting varnishkafka on cp3013
* 17:46 logmsgbot: aaron Synchronized php-1.24wmf14/includes/jobqueue/JobQueueFederated.php: 87e7bfceb795d065d6157ac8ce3381a7814000b5 (duration: 00m 03s)
* 17:38 logmsgbot: aaron Synchronized php-1.24wmf15/includes/jobqueue/JobQueueFederated.php: 12ce1dc1ec46b06d1160e142ddfaf8dcb1c9f131 (duration: 00m 04s)
* 16:30 andrewbogott: updated wikitech to 1.24wmf15; turned on OAuth
* 16:05 Nemo_bis: andrewbogott> Nikerabbit: I'm upgrading it [wikitech wiki], it'll be flaky for a bit
* 16:00 manybubbles: done with SWAT
* 15:57 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/VisualEditor/: SWAT - fix visual editor bug - Changes made after reviewing changes are not sent (when caching is enabled) (duration: 00m 07s)
* 15:46 logmsgbot: manybubbles Synchronized php-1.24wmf15/extensions/VisualEditor/: SWAT - fix visual editor bug - Changes made after reviewing changes are not sent (when caching is enabled) (duration: 00m 08s)
* 15:41 hoo: Removed all right holders from closed and inaccessible ukwikimedia (bug 68737)
* 15:39 logmsgbot: manybubbles Synchronized php-1.24wmf15/includes/specials/SpecialRevisiondelete.php: SWAT - fix fatal on revision delete (duration: 00m 08s)
* 15:33 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: SWAT load Mantle before MobileFrontend (duration: 00m 07s)
* 15:31 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/Echo/: SWAT fix bad variable name in echo (duration: 00m 08s)
* 15:23 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - update some permissions on eswiki (duration: 00m 08s)
* 15:17 logmsgbot: manybubbles Synchronized php-1.24wmf15/extensions/Echo/: SWAT - fix incorrect variable name (duration: 00m 08s)
* 15:14 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - add import sources to bhwiki (duration: 00m 08s)
* 15:10 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow/: SWAT update fundraising to fix botched deploy
* 12:28 hashar: Upgrading our Jenkins Job Builder fork ( d833015..666e953 )
* 03:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 28 02:59:35 UTC 2014 (duration 59m 34s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-28 02:25:34+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-28 02:15:00+00:00
 
== July 27 ==
* 05:24 springle: mysqldump s6 dbstore1002 to dbstore1001
* 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 27 02:58:15 UTC 2014 (duration 58m 14s)
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-27 02:24:10+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-27 02:13:44+00:00
 
== July 26 ==
* 21:29 hashar: restarting Zuul to clear up some stalled changes.
* 02:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 26 02:57:52 UTC 2014 (duration 57m 50s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-26 02:25:46+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-26 02:15:31+00:00
 
== July 25 ==
* 22:52 mutante: Bugzilla - upgraded to 4.4.5
* 22:41 mutante: ocg - deleted old log dirs
* 19:28 hashar: Jenkins : disabling gearman plugin and reenabling it (just uncheck/save/check a box in  https://integration.wikimedia.org/ci/configure )
* 19:25 hashar: zuul@gallium:/etc/zuul/wikimedia$ echo status|nc -q 3 localhost 4730|wc -l  ... Yields: 0 .  Which means jobs are no longer registered for some reason.
* 19:24 hashar: Jenkins stalled again yeahhhhh
* 16:59 mutante: powercycled ms-be1010 - unresponsive to ssh, nothing on mgmt
* 16:28 MaxSem: Updating PageImages data for mainspace on Commons from terbium
* 13:09 _joe_: re-enabling puppet, test run on the test host was fine.
* 13:03 _joe_: stopping puppet on all appservers - will reactivate after testing
* 11:26 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 13s)
* 10:50 hashar: contint: manually cleared /tmp on the 3 labs jenkins slaves.
* 10:46 hashar: integration-slave1001.eqiad.wmflabs  is out of disk space ( / /dev/vda1)
* 07:29 springle: shutdown tantalum per mwalker request
* 04:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 25 04:18:45 UTC 2014 (duration 18m 44s)
* 03:31 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-25 03:30:33+00:00
* 02:48 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-25 02:47:17+00:00
* 01:21 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ic29ae11fa: On Labs, disable LuaSandbox's profiling feature to isolate bug 68413 (duration: 00m 04s)
* 00:15 mutante: imported jouncebot from github - https://gerrit.wikimedia.org/r/#/q/project:wikimedia/bots/jouncebot,n,z
* 00:03 K4-713: updated fundraising civicrm to 0639c11636d9
 
== July 24 ==
* 23:26 mutante: created gerrit project for jouncebot
* 23:06 logmsgbot: maxsem Synchronized wmf-config: https://gerrit.wikimedia.org/r/149180 (duration: 00m 05s)
* 21:53 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 20:41 mutante: rebooted wm-bot instance
* 20:21 bblack: restarted backend varnish for parsoid on cp1058
* 20:20 bblack: restarted backend varnish for parsoid on cp1045
* 20:08 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/Translate: (no message) (duration: 00m 15s)
* 20:00 logmsgbot: reedy Synchronized php-1.24wmf15: (no message) (duration: 00m 59s)
* 19:58 logmsgbot: reedy Synchronized php-1.24wmf14: (no message) (duration: 01m 11s)
* 19:24 hashar: restarted Zuul
* 18:44 ori: restarted jobrunners for 01c70b1a892ac3944655f84449e89e4508894101
* 18:41 AaronSchulz: Updated jobrunners to 01c70b1a892ac3944655f84449e89e4508894101
* 18:39 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:34 logmsgbot: aaron Synchronized php-1.24wmf14/includes/jobqueue/aggregator/JobQueueAggregatorRedis.php: ca031131396ee1830e239d0b6a314bb571840c11 (duration: 00m 06s)
* 18:26 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
* 18:24 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf15
* 18:23 ori: Purged apache from SSL cluster; provisioned as a side-effect of I0b02a46f3 + I76a0d237f
* 18:21 godog: updated swift ring to bring ms-be1013 weight to 2300
* 18:17 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf14
* 18:03 logmsgbot: reedy Finished scap: testwiki to 1.24wmf15 and build l10n cache (duration: 31m 12s)
* 17:32 logmsgbot: reedy Started scap: testwiki to 1.24wmf15 and build l10n cache
* 16:38 hashar: restarting Jenkins it is broken again
* 16:10 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Swap wikimania2013wiki to wikimania2014wiki in wmgCentralAuthLoginIcon (duration: 00m 14s)
* 15:55 bd808|deploy: Fetched de8022b to /a/common on tin; prod no-op change needed for beta
* 15:40 bd808|deploy: Fetched c7ae85e to /a/common on tin; prod no-op needed for beta
* 15:39 ottomata: temporarily stopping puppet on analytics1027
* 15:14 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings-labs.php: Fix reference thumbnail settings syntax (duration: 00m 13s)
* 15:13 cmjohnson1: swapping disk 8 es1001
* 15:10 hashar: Clearing out old Zuul references on operations/puppet.git  might cause merge errors
* 15:10 logmsgbot: yurik Synchronized php-1.24wmf14/extensions/ZeroBanner: (no message) (duration: 01m 07s)
* 15:08 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/ZeroBanner: (no message) (duration: 01m 11s)
* 14:30 logmsgbot: yurik Synchronized wmf-config/mobile.php: Font for zero banner (duration: 01m 10s)
* 13:38 hashar: Deleting old Zuul references in the Zuul maintained repository /srv/ssd/zuul/git/mediawiki/core/ on gallium {{bug|68481}} . Should speed up merge operations on that repository.
* 10:10 hashar: Zuul code being installed on lanthanum.eqiad.wmnet  Will let us use a merger daemon there and the Zuul cloner client. {{gerrit|141758}}
* 05:44 springle: labsdb1002 work in progress; it may misbehave. see labs-l for updates
* 03:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 24 03:56:32 UTC 2014 (duration 56m 31s)
* 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-24 03:08:05+00:00
* 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-24 02:36:35+00:00
* 00:44 ori: installing linux-tools on mw1053 to run perf on jobrunner
 
== July 23 ==
* 23:59 logmsgbot: maxsem Finished scap: Pick up messages forgotten during Zero deployment (duration: 26m 42s)
* 23:39 ori: running sync-common on mw1053.eqiad.wmnet
* 23:32 logmsgbot: maxsem Started scap: Pick up messages forgotten during Zero deployment
* 23:26 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/MultimediaViewer/: (no message) (duration: 00m 03s)
* 23:26 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/VisualEditor/: (no message) (duration: 00m 04s)
* 23:19 logmsgbot: maxsem Synchronized php-1.24wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
* 23:18 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
* 22:39 mutante: removed platinum from icinga
* 22:36 _joe_: installed mw1053 as the first hhvm jobrunner, currently stopped. Puppet disabled so that it won't restart the jobrunner automatically
* 21:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I2f366fa93: Use luastandalone on HHVM (duration: 00m 03s)
* 21:17 hashar: Zuul is all good. It just receives too many patches :-]
* 20:31 bd808|deploy: Updated /a/common to 07834a9 (beta cluster: use luastandalone); no sync needed
* 20:30 subbu: deployed parsoid version 47d4bc83
* 20:27 hashar: Having no idea how to fix zuul. Restarting it and killing the whole queue :-/
* 20:14 mutante: contacts.wm - set $base_url in default/settings.php to https URL, and $is_https='on' in bootstrap.inc (unpuppetized?)
* 19:49 logmsgbot: awight Synchronized php-1.24wmf14/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 05s)
* 19:49 logmsgbot: awight Synchronized php-1.24wmf13/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 05s)
* 19:28 logmsgbot: awight Synchronized php-1.24wmf14/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 04s)
* 19:27 logmsgbot: awight Synchronized php-1.24wmf13/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 04s)
* 18:57 hashar: reenabled Gearman plugin in Jenkins.  Jobs have been reregistered and seem to be proceeding again
* 18:55 hashar: back. attempting to fix jenkins
* 18:38 logmsgbot: yurik Synchronized php-1.24wmf14/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 24s)
* 18:36 hashar: can't fix jenkins / zuul right now.  Will be stalled for at least half an hour
* 18:35 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 27s)
* 18:33 hashar: Jenkins disabled and reenabled Gearman plugin. The jobs were no longer registered in Zuul gearman server :-(
* 18:32 hashar: Jenkins stalled
* 17:45 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 07s)
* 17:38 godog: launched a script on ms-fe1001 to collect thumb stats, no impact expected
* 17:11 logmsgbot: awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: automatic translate workflow fix for Fundraising/ pages on meta.wmo (duration: 00m 04s)
* 15:38 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/Wikidata: touch (duration: 00m 15s)
* 15:34 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/Wikidata: Fix css issue in entity suggester on Wikidata (duration: 00m 17s)
* 15:19 logmsgbot: reedy Synchronized php-1.24wmf14/resources/Resources.php: Fixing forgotten OOUI messages (duration: 00m 15s)
* 15:11 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Remove flickrApiUrl from (duration: 00m 15s)
* 14:19 akosiaris: upgraded php5 on mw1017 (test.wikipedia.org) deployment-apache0{1,2} (beta) to 5.3.10-1ubuntu3.13+wmf1
* 12:42 hashar: upgraded gdnsd on gallium (used to lint operations/dns.git changes)
* 09:57 hashar: Zuul migrated to zuul user :)
* 09:43 hashar: zuul changing file ownership on gallium for /srv/ssd/zuul/git from jenkins:root to zuul:zuul
* 09:42 hashar: breaking zuul
* 05:29 springle: clone mariadb 10 labsdb1002 to labsdb100[13]
* 04:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 23 04:09:54 UTC 2014 (duration 9m 53s)
* 03:21 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-23 03:20:45+00:00
* 02:50 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-23 02:49:34+00:00
 
== July 22 ==
* 23:37 logmsgbot: ebernhardson Finished scap: Update flow in wmf/1.24wmf14 (duration: 17m 08s)
* 23:20 logmsgbot: ebernhardson Started scap: Update flow in wmf/1.24wmf14
* 21:57 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/WikimediaMessages/: Fix fatal for dumps (duration: 00m 15s)
* 21:52 logmsgbot: reedy Synchronized php-1.24wmf13/extensions/WikimediaMessages/: Fix fatal for dumps (duration: 00m 13s)
* 18:46 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf14 again
* 18:45 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/CirrusSearch/: Fix fatal (duration: 00m 15s)
* 18:13 Reedy: Running sync-common on mw1081
* 18:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias back to 1.24wmf13 due to Wikidata and Cirrus fatals
* 18:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf14
* 17:48 logmsgbot: mwalker Finished scap: Deploying Petition extension to the cluster (duration: 28m 27s)
* 17:19 logmsgbot: mwalker Started scap: Deploying Petition extension to the cluster
* 17:12 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata/extensions/ValueView/lib/jquery.ui/jquery.ui.suggester.js: touch jquery.ui.suggester.js for Wikidata (duration: 00m 05s)
* 17:06 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata: Update Wikidata submodule for test wikidata, for real! (duration: 00m 06s)
* 17:02 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata/extensions/Wikibase/lib/resources/wikibase.js: touch wikibase.js for test wikidata only, fix caching issues (duration: 00m 05s)
* 16:55 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata/extensions/ValueView/lib/jquery.ui/jquery.ui.suggester.js: touch jquery.ui.suggester.js for Wikidata (duration: 00m 05s)
* 16:48 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata: Update Wikidata: js and json dump fixes (duration: 00m 11s)
* 16:26 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: enable WikibaseClient on test wikidata (duration: 00m 07s)
* 16:25 logmsgbot: aude Synchronized wmf-config/Wikibase.php: add settings for enabling WikibaseClient on test wikidata (duration: 00m 04s)
* 16:18 logmsgbot: reedy Purged l10n cache for 1.24wmf12
* 16:18 logmsgbot: reedy Purged l10n cache for 1.24wmf11
* 16:17 logmsgbot: reedy Purged l10n cache for 1.24wmf10
* 15:12 manybubbles: done with SWAT
* 15:11 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - touching InitialiseSettings.php to make dblist change go (duration: 00m 06s)
* 15:10 logmsgbot: manybubbles Synchronized commonsuploads.dblist: SWAT add mrwiki to commonsuploads list (duration: 00m 08s)
* 15:06 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/CirrusSearch/: SWAT small cirrus fixes (duration: 00m 08s)
* 14:48 _joe_: removed old, unused puppet 2.7 packages from reprepro for trusty
* 14:00 _joe_: reinstalling mw1053 in 5 minutes, downtime on icinga, puppet disabled, setting to 'false' everywhere in pybal
* 05:31 bblack: authdns servers (mexia, rubidium, eeden) updated to gdnsd-1.11.4~precise1
* 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 22 03:12:10 UTC 2014 (duration 12m 9s)
* 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-22 02:36:27+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-22 02:13:40+00:00
* 00:56 mutante: tungsten,fluorine, search1001-1006 - upgraded libssl
 
== July 21 ==
* 23:44 mutante: graceful apache on magnesium
* 23:42 legoktm: cleaned up stalled global rename of Felipegaspars --> L'editeur
* 23:39 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/148270/ - revert previous change (duration: 00m 04s)
* 23:36 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/TimedMediaHandler/: https://gerrit.wikimedia.org/r/#/c/148241/ (duration: 00m 04s)
* 23:32 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/148249/ (duration: 00m 04s)
* 23:30 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/148128 (duration: 00m 06s)
* 23:29 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/AbuseFilter/: https://gerrit.wikimedia.org/r/#/c/148027/ (duration: 00m 06s)
* 23:27 logmsgbot: maxsem Synchronized php-1.24wmf14/resources/: https://gerrit.wikimedia.org/r/#/c/147854/ (duration: 00m 05s)
* 22:21 mutante: installing package upgrades on bast1001
* 21:50 RobH: shutting down ms1002 for reclaim into labstore1003
* 21:37 ottomata: running kafka preferred-replica-election to rebalance topics
* 21:27 hashar: beta: removed build timeout from beta-update-databases-eqiad  Jenkins jobs. There is a huge schema change being processed by update.php
* 20:09 subbu: deployed parsoid version 1c9277d6
* 17:24 mutante: elastic1009,analytics1004,silver, various misc. boxes - upgrading libssl
* 17:16 mutante: installing package upgrades on iron
* 16:19 godog: restarted uwsgi on tungsten
* 16:02 andrewbogott: updated OpenStackManager on wikitech
* 15:48 logmsgbot: demon Synchronized wmf-config: Undeploying CommunityVoice/ClientSide extensions (duration: 00m 08s)
* 15:30 logmsgbot: demon Synchronized wmf-config/flaggedrevs.php: ukwiki gets FR for NS_MODULE (duration: 00m 04s)
* 15:25 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Set $wgUploadNavigationUrl for plwikisource (duration: 00m 04s)
* 15:25 logmsgbot: demon Synchronized php-1.24wmf14/extensions/CirrusSearch: CirrusSearch to master for 1.24wmf14 (duration: 00m 07s)
* 15:22 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Set $wgForceUIMsgAsContentMsg for zhwikivoyage (duration: 00m 05s)
* 15:17 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: TemplateData for fiwiki (duration: 00m 06s)
* 13:41 _joe_: restarted apache on palladium
* 12:01 apergos: started /usr/local/bin/dumpwikidatajson.sh in root screen session on snapshot1003
* 03:11 springle: restarted apache on strontium
* 02:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 21 02:57:03 UTC 2014 (duration 57m 2s)
* 02:50 logmsgbot: krinkle Synchronized wmf-config/InitialiseSettings.php: I27c6f82af5e9b (duration: 00m 06s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-21 02:25:08+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-21 02:13:42+00:00
 
== July 20 ==
* 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 20 02:57:59 UTC 2014 (duration 57m 58s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-20 02:25:54+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-20 02:14:47+00:00
* 01:40 ori: synced docroot/default/index.html (I005f43b96: Add width/height attributes to img to fix reflow)
 
== July 19 ==
* 23:33 logmsgbot: aaron Synchronized php-1.24wmf4/maintenance: 926e1997b53f563a4e7f3c540e32b45ddb24b3c5 & 017891ba41cc72987bf3cb441004a847d20105b4 (duration: 00m 08s)
* 23:33 logmsgbot: aaron Synchronized php-1.24wmf4/includes: 926e1997b53f563a4e7f3c540e32b45ddb24b3c5 & 017891ba41cc72987bf3cb441004a847d20105b4 (duration: 00m 09s)
* 15:43 bblack: restarted gitblit service on antimony
* 05:02 Krinkle: Ungracefully restarting Zuul to clear the items stuck in the queue (picked a moment with no real items waiting in the queue).
* 03:25 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool es1001 (duration: 00m 06s)
* 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 19 02:56:30 UTC 2014 (duration 56m 29s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-19 02:26:26+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-19 02:15:20+00:00
* 01:09 logmsgbot: awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 4) (duration: 00m 04s)
* 01:08 logmsgbot: awight Synchronized php-1.24wmf13/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 4) (duration: 00m 04s)
* 01:07 logmsgbot: awight Synchronized php-1.24wmf12/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 4) (duration: 00m 04s)
* 00:36 logmsgbot: awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 3) (duration: 00m 04s)
* 00:36 logmsgbot: awight Synchronized php-1.24wmf13/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 3) (duration: 00m 04s)
* 00:36 logmsgbot: awight Synchronized php-1.24wmf12/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 3) (duration: 00m 04s)
 
== July 18 ==
* 22:04 logmsgbot: awight Synchronized php-1.24wmf14: update FundraisingTranslateWorkflow submodule (take 2) (duration: 00m 58s)
* 22:03 logmsgbot: awight updated /a/common/php-1.24wmf14 to {{Gerrit|I1036dae02}}: Update mediawiki/core/vendor to head to 1.24wmf14
* 21:00 logmsgbot: awight Synchronized php-1.24wmf13: update FundraisingTranslateWorkflow submodule (duration: 01m 04s)
* 20:58 awight: for the record, I actually updated to ade90e0e22492d87e6069db3a359b22ef56401a6
* 20:57 logmsgbot: awight updated /a/common/php-1.24wmf13 to {{Gerrit|Id3462554b}}: Made --maxtime a soft limit again
* 20:50 logmsgbot: awight Synchronized php-1.24wmf12: update FundraisingTranslateWorkflow submodule (duration: 00m 49s)
* 20:48 logmsgbot: awight Synchronized php-1.24wmf12: update FundraisingTranslateWorkflow submodule (duration: 00m 21s)
* 20:48 logmsgbot: awight updated /a/common/php-1.24wmf12 to {{Gerrit|Idf3f49941}}: Updating ZeroBanner
* 20:41 MaxSem: Load testing GeoData
* 19:11 mutante: restarted apache on strontium.. sigh
* 18:17 logmsgbot: aaron Synchronized php-1.24wmf13/maintenance/runJobs.php: ae053860dc36a07f05ab9e31299f2da0d2f66e85 (duration: 00m 03s)
* 18:16 logmsgbot: aaron Synchronized php-1.24wmf14/maintenance/runJobs.php: 684c21c325370aa3baac631ae9a006fc8861b952 (duration: 00m 03s)
* 18:05 logmsgbot: aaron Synchronized wmf-config/jobqueue-eqiad.php: Set "daemonized" flag for the redis job queue (duration: 00m 04s)
* 17:33 cmjohnson: replacing disk 2 es1005
* 17:25 mutante: temp. stopped icinga-wm to avoid channel spam
* 17:24 mutante: puppetmaster on strontium had "Unexpected error in mod_passenger" causing puppet fails all over the place with error 500 on master, resumed normal after graceful
* 17:21 mutante: graceful'ed apache on strontium
* 14:37 godog: rolling reload of proxy-server on swift ms-fe1* to pick up changes
* 13:19 _joe_: re-enabling puppet, applying on a sample of hosts created no change according to my tests.
* 13:13 _joe_: temporarily disabling puppet on mw servers, will re-enable when I'm done with testing (again) the change
* 11:20 godog: restart proxy-server on ms-fe1003, as suspected it wasn't running the latest version
* 11:14 godog: restart proxy-server on ms-fe1003, double checking for a change in numbers reported to graphite
* 10:04 godog: stagger reload swift {account,object,container} server in ms-be.eqiad to pick up recon changes
* 06:01 AaronSchulz: Updated /srv/deployment/jobrunner to 4cddd5033efadf431e138c399b5d86542e32f196
* 03:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 18 03:53:55 UTC 2014 (duration 53m 54s)
* 03:22 ori: Updated jobrunner to d9520c9 and restarted service on all jobrunners
* 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-18 03:08:02+00:00
* 02:45 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: Repool db1021, context RT 7916, warm up (duration: 00m 08s)
* 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-18 02:36:54+00:00
 
== July 17 ==
* 23:49 logmsgbot: mwalker Finished scap: SWAT for {{gerrit|146651}}, {{gerrit|147102}}, {{gerrit|146925}}, {{gerrit|147331}}, {{gerrit|147332}}, and {{gerrit|147206}}
* 23:19 logmsgbot: mwalker Started scap: SWAT for {{gerrit|146651}}, {{gerrit|147102}}, {{gerrit|146925}}, {{gerrit|147331}}, {{gerrit|147332}}, and {{gerrit|147206}}
* 21:02 csteipp: deployed fix for bug68187
* 20:29 ori: updated jobrunner to 71d84ea18d and restarted service
* 18:36 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf14
* 18:30 springle: db1021 raid write-cache failure, BBU at 9%
* 18:14 springle: db1021 disabled sync_binlog, thread tied up on fsync
* 18:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf13
* 18:09 logmsgbot: reedy Synchronized wmf-config/: De-pool db1021 due to increasing replag (duration: 00m 14s)
* 17:40 logmsgbot: reedy Finished scap: testwiki to 1.24wmf14 take 2 (duration: 33m 02s)
* 17:30 Jeff_Green: payments1002 dist upgrade & reboot
* 17:21 mutante: nickel (ganglia) apt-get upgrading packages
* 17:13 Jeff_Green: dist-upgrade and reboot payments1003
* 17:07 logmsgbot: reedy Started scap: testwiki to 1.24wmf14 take 2
* 17:04 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.VsoJsYY6Q2" ' returned non-zero exit status 1 (duration: 02m 46s)
* 17:03 RobH: payments4 is kernel updating (per jgreen)
* 17:01 logmsgbot: reedy Started scap: testwiki to 1.24wmf14
* 15:05 logmsgbot: manybubbles Synchronized php-1.24wmf13/extensions/MultimediaViewer/: SWAT - Moving repo icon back to the right-hand side in Media Viewer (duration: 00m 05s)
* 15:03 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings-labs.php: SWAT deploy to keep us synced, but this is a noop in prod.  only does anything in beta. (duration: 00m 05s)
* 07:27 springle: mariadb 10 on labsdb1002:3309 cloning s5 from sanitarium db1054:3308
* 03:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 17 03:32:25 UTC 2014 (duration 32m 24s)
* 02:47 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-17 02:46:24+00:00
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-17 02:23:08+00:00
 
== July 16 ==
* 23:55 logmsgbot: maxsem Synchronized private: Clean up old mobile cruft (duration: 00m 05s)
* 23:17 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 03s)
* 23:13 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 14s)
* 22:34 andrewbogott: temporarily fixed puppet on tin by restarting salt-master and salt-minion.  A proper fix would involve upgrading to a salt version that fixes https://github.com/saltstack/salt/issues/6306
* 22:29 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/ZeroBanner: (no message) (duration: 03m 55s)
* 22:27 ori: restarted jobrunner service on all job runners
* 22:18 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/ZeroBanner: (no message) (duration: 04m 31s)
* 21:50 AaronSchulz: Updated job runners to 186b9b33
* 21:08 legoktm: clearing Magog the Ogre's watchlist on enwp per request (173668 entries)
* 21:01 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 53s)
* 20:56 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 54s)
* 20:22 subbu: deploy parsoid 060dcb54
* 19:56 ottomata: reenabling puppet on analytics1027
* 19:21 ottomata: temp disabling puppet on analytics1027
* 17:57 akosiaris: clean puppet stored config database for osm-db100{1,2}.eqiad.wmnet, updating icinga
* 16:49 Reedy: Restarted jenkins again
* 16:12 Reedy: Restarted jenkins
* 16:11 Reedy: Killed jenkins
* 14:34 _joe_: moving the stale conf-enabled directory away on jobrunners, or when we upgrade to trusty all hell will break loose
* 13:06 logmsgbot: oblivian gracefulled all apaches
* 12:14 logmsgbot: oblivian gracefulled all apaches
* 12:01 _joe_: removed stale files from /etc/apache2/conf-enabled on all mw hosts
* 11:25 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: Take Cirrus as default from more wikis while we figure out load issues (duration: 00m 06s)
* 10:32 _joe_: releasing a new apache config to all mediawikis
* 08:54 godog: repool ms-fe1004
* 08:51 godog: repool ms-fe1003 and depool ms-fe1004
* 08:46 godog: repool ms-fe1002 and depool ms-fe1003
* 08:39 godog: depool ms-fe1002 for swift upgrade
* 05:54 springle: resuming page content model schema changes, osc_host.sh processes on terbium ok to kill in emergency
* 04:22 springle: restarted gitblit on antimony
* 03:04 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 16 03:03:41 UTC 2014 (duration 3m 40s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-16 02:26:12+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-16 02:14:32+00:00
* 01:34 manybubbles: moving shards off of elastic101[789]
 
== July 15 ==
* 23:20 logmsgbot: maxsem Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/146615/ (duration: 00m 04s)
* 23:16 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,146471,n,z (duration: 00m 05s)
* 23:14 logmsgbot: maxsem Synchronized php-1.24wmf13/includes/specials/SpecialVersion.php: (no message) (duration: 00m 04s)
* 23:13 logmsgbot: maxsem Synchronized php-1.24wmf13/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,146471,n,z (duration: 00m 04s)
* 22:35 K4-713: synchronized payments to afa12be34769000bf8
* 21:34 _joe_: disabling puppet on mw1001, tests
* 21:26 logmsgbot: aude Synchronized php-1.24wmf13/extensions/Wikidata: Update submodule to fix entity search issue on Wikidata (duration: 00m 21s)
* 21:15 ori: to test r146607, locally modified upstart conf for jobrunner on mw1001 to log to /var/log/mediawiki, and restarted service
* 20:24 ori: restarted jobrunner on all jobrunners
* 20:23 AaronSchulz: Deployed /srv/jobrunner to 31e54c564d369e89613db48977eec0a5891b6498
* 20:21 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 21s)
* 20:18 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf13
* 20:12 Krinkle: Reloading Zuul to deploy If2312bcf18bdbe8dee
* 20:12 bd808: log volume up after logstash restart
* 20:10 bd808: restarted logstash on logstash1001; log volume looked to be down from "normal"
* 19:55 Reedy: Applied extensions/UploadWizard/UploadWizard.sql to rowiki (re bug 59242)
* 18:53 manybubbles: bouncing elastic1018 to pick up new merge policy.  hopefully that'll help with io thrashing
* 17:58 ori: _joe_ deployed jobrunner to all job runners
* 17:40 manybubbles: my last attempt to lower the concurrent traffic for recovery was a failure - tried again and succeeded.  that seems to have fixed the echo service disruption from taking elastic1017 out of service
* 17:37 ori: updated jobrunner to bef32b9120
* 17:29 manybubbles: elastic1017 went nuts again.  just shutting elasticsearch off on it for now
* 17:17 manybubbles: lowered Elasticsearch concurrent recovery streams to 2 (from 3) and total write rate across those streams to 20MB/sec (from 4MB/sec).  This should prevent io thrash on recovery which looked to cause echo disruptions in service while recovering from some other disruption.
* 16:25 _joe_: all mw servers updated
* 16:10 _joe_: mw1100 and onwards updated
* 16:00 _joe_: mw1060-mw1099 updated
* 15:57 manybubbles: restarting Elasticsearch on elastic1017 - it's thrashing the disk again.  I'm still not 100% sure why
* 15:56 _joe_: mw1020-mw1059 updated
* 15:53 _joe_: mw101[0-9] updated
* 15:51 manybubbles: elasticsearch1017 is freaking out again - maybe there is something wrong with it.  odds aren't good it picked up the same shard again after restart and that shard is somehow poison just for it and not the other two nodes with the same shard....
* 15:47 _joe_: starting rolling update of all appservers to apache2 2.2.22-1ubuntu1.6, half of them are on 2.2.22-1ubuntu1.5 now
* 15:42 manybubbles: setting the filter cache on one node in the cluster set it on all.  yay, I guess.  Anyway, I'm going to let it soak for a while.
* 15:32 manybubbles: setting filter cache size to 20% on elastic1001 to see if it takes/helps us
* 15:19 logmsgbot: anomie Synchronized wmf-config/: SWAT: Remove dead ULS variable [[gerrit:145861]] (duration: 00m 10s)
* 15:18 anomie: anomie actually committed a live hack someone left on tin (removing db1035)
* 15:16 logmsgbot: anomie updated /a/common to {{Gerrit|I7ca6a16d5}}: Switch jawiki back to lsearchd
* 13:52 manybubbles: after switching jawiki back to lsearchd by default load is mostly recovered.  the cluster is still healing from bouncing elastic1017 and that'll take a while.  the load will be a bit high during that but searches are coming back in a reasonable amount of time again
* 13:42 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: jawiki back to lsearchd (duration: 00m 05s)
* 13:38 manybubbles: elastic1017 had a load average of 60 - was thrashing in io.  bounced Elasticsearch.  let's see if it recovers on its own
* 09:09 _joe_: restarting mailman on sodium, again, for testing
* 08:50 godog: restart mailman on sodium after inodes freed
* 07:27 _joe_: restarted mailman on sodium
* 07:22 _joe_: stopping mailman on sodium for repairing
* 06:54 _joe_: killed jenkins stale process on gallium, stuck in a futex while shutting down
* 04:48 springle: db1035 crash cycle. down for memtest and stuff
* 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 15 03:33:38 UTC 2014 (duration 33m 37s)
* 03:01 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-15 03:00:03+00:00
* 02:34 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1035, crashed (duration: 00m 13s)
* 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-15 02:29:02+00:00
* 02:27 springle: powercycle db1035 unresponsive
 
== July 14 ==
* 23:52 logmsgbot: mwalker Finished scap: Updating for SWAT {{gerrit|146304}}, {{gerrit|146306}}, {{gerrit|146149}}, {{gerrit|146165}}, {{gerrit|146166}}, {{gerrit|146282}}, and {{gerrit|146281}}. Also finishing awight's deploy of FundraisingTranslateWorkflow. (duration: 19m 42s)
* 23:32 logmsgbot: mwalker Started scap: Updating for SWAT {{gerrit|146304}}, {{gerrit|146306}}, {{gerrit|146149}}, {{gerrit|146165}}, {{gerrit|146166}}, {{gerrit|146282}}, and {{gerrit|146281}}. Also finishing awight's deploy of FundraisingTranslateWorkflow.
* 20:22 cscott: updated Parsoid to version d51e64097bb1b18e356584d4f3ddcfd90a6071ba
* 19:57 ori: postponing jobrunner deployment to tomorrow; ran over time
* 19:45 _joe_: doing the same on mw1064, segfaulted for the same reason
* 19:44 _joe_: killed a lone apache2 child on mw1152, stuck in a futex, after a segfault of another apache process. Restarted apache, now working correctly
* 19:03 godog: re-enabling mailman on sodium, missing list config restored
* 18:49 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow on metawiki (t
* 18:45 logmsgbot: awight Synchronized php-1.24wmf13/extensions/FundraisingTranslateWorkflow: Update FundraisingTranslateWorkflow extension (wmf13) (duration: 00m 05s)
* 18:43 logmsgbot: awight Synchronized php-1.24wmf12/extensions/FundraisingTranslateWorkflow: Update FundraisingTranslateWorkflow extension (duration: 00m 05s)
* 18:15 logmsgbot: awight Synchronized wmf-config: Revert: Deploying FundraisingTranslateWorkflow on metawiki (duration: 00m 04s)
* 18:03 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow on metawiki (duration: 00m 05s)
* 18:03 logmsgbot: awight updated /a/common to {{Gerrit|Ie7599fb6e}}: jawiki gets Cirrus as primary search
* 17:43 Krinkle: npm-cache for integration slaves got corrupted again. Depooling/Repooling integration-slave100{1,2,3} one by one to clear cache and let it warm up again.
* 17:35 Krinkle: Jenkins slaves in labs are unable to reach zuul.eqiad.wmnet
* 17:10 andrewbogott: purging old local-* service group entries from labs ldap (via purgeOldServiceGroups.php)
* 17:05 godog: started mailman on sodium post-reboot
* 17:04 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: nlwiki getting cirrus as primary (duration: 00m 04s)
* 15:11 logmsgbot: manybubbles Synchronized wmf-config: SWAT update cirrus settings for commons (duration: 00m 04s)
* 15:04 logmsgbot: manybubbles Synchronized wmf-config: SWAT update cirrus settings for commons (duration: 00m 04s)
* 15:01 logmsgbot: manybubbles Synchronized wmf-config: SWAT update cirrus settings for commons (duration: 00m 05s)
* 14:39 _joe_: rebooted nescio, stuck and with console showing just a truncated log (timestamp only)
* 14:33 mutante: powercycling sodium
* 14:02 mutante: stat1002 - "Could not find declared class ::oozie"
* 09:36 legoktm: ran initSiteStats.php on all wikivoyages for bug 64370
* 09:02 godog: repool ms-fe1001 after upgrade, basic testing successful
* 08:33 godog: depool ms-fe1001 for swift icehouse upgrade
* 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 14 02:56:22 UTC 2014 (duration 56m 21s)
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-14 02:23:39+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-14 02:12:54+00:00
 
== July 13 ==
* 22:12 ori: stopping puppet on rcs1001 to debug nginx issue
* 21:03 Krinkle: git-deploy: Deploying integration/slave-scripts I7f2b476807465
* 02:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 13 02:53:33 UTC 2014 (duration 53m 32s)
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-13 02:23:56+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-13 02:13:32+00:00
* 02:12 legoktm: migratePass0.php finished a while back
 
== July 12 ==
* 22:21 legoktm: running foreachwiki extensions/CentralAuth/maintenance/migratePass0.php (bug 67350)
* 22:04 legoktm: checkLocalNames/checkLocalUser finished a few hours ago, I don't have a timestamp (bug 67350)
* 13:51 godog: reboot ms-be1007, xfs problems on sdn, load at 300+
* 07:39 legoktm: started running checkLocalUser.php --delete=1 on all CentralAuth wikis for bug 67350
* 07:37 legoktm: started running checkLocalNames.php --delete=1 on all CentralAuth wikis for bug 67350
* 02:52 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 12 02:51:47 UTC 2014 (duration 51m 46s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-12 02:25:47+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-12 02:15:33+00:00
 
== July 11 ==
* 23:44 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow to labs (take 2) (duration: 00m 04s)
* 23:36 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow to labs (duration: 00m 05s)
* 23:34 logmsgbot: awight updated /a/common to {{Gerrit|I862a4afed}}: Fixup highlightTest.php
* 22:44 mutante: upgraded libssl on wtp*
* 22:33 Krinkle: Restarting Jenkins
* 22:33 Krinkle: Pooled/depooled Jenkins slave on gallium
* 22:31 Krinkle: jenkins/gallium's weekly w(h)ine hour is here.
* 21:31 Krinkle: Reloading Zuul to deploy config change I993eba5ab7b70f924a2b925fea7c196db27c4cc3
* 20:57 ottomata: disabling puppet on analytics1004 (AGH!)
* 20:51 ottomata: bringing up some hadoop journalnodes (and datanodes)
* 20:33 mutante: wikitech - graceful apache for ssl cipher list change
* 18:19 mutante: OTRS - enabled STS, updated SSL cipher list, restarted Apache on iodine
* 15:15 logmsgbot: hoo Synchronized php-1.24wmf13/extensions/Wikidata/: Fix the wbsearchentities API (duration: 00m 13s)
* 15:14 logmsgbot: hoo Synchronized php-1.24wmf12/extensions/Wikidata/: Fix the wbsearchentities API (duration: 00m 16s)
* 13:52 hashar: Jenkins: mediawiki/core change being queued while Jenkins is busy processing some history. That is normal, will resume soon ™
* 12:07 hashar: Jenkins: dropping history of mwext-Wikibase-testextensions-master as well
* 12:05 hashar_: Jenkins: manually removing history of mwext-Wikibase-client-tests and mwext-Wikibase-repo-tests. They have not been used since January
* 08:54 hoo: Started rebuildItemsPerSite for wikidatawiki on terbium
* 03:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 11 03:30:11 UTC 2014 (duration 30m 10s)
* 03:01 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-11 03:00:20+00:00
* 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-11 02:30:24+00:00
 
== July 10 ==
* 23:38 logmsgbot: mwalker Finished scap: Updating Core, VE, and GuidedTour for scap, {{gerrit|145400}}, {{gerrit|145401}}, {{gerrit|145431}}, and {{gerrit|145460}} (duration: 16m 26s)
* 23:22 logmsgbot: mwalker Started scap: Updating Core, VE, and GuidedTour for scap, {{gerrit|145400}}, {{gerrit|145401}}, {{gerrit|145431}}, and {{gerrit|145460}}
* 20:00 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 19:46 logmsgbot: reedy Synchronized private: (no message) (duration: 00m 14s)
* 19:45 csteipp: deployed patch for bug65778
* 19:43 hashar: Jenkins upgrading Gearman plugin from 0.0.6 to 0.0.7. That fixes the way job labels are registered with Gearman
* 19:16 hashar: Killed jenkins :-(
* 18:37 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
* 18:36 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 14s)
* 18:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf13
* 18:02 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf12
* 17:25 logmsgbot: hoo Synchronized php-1.24wmf13/extensions/Wikidata/: Fix a UI issue and two API related flaws (same version as for wmf12) (duration: 00m 09s)
* 17:21 logmsgbot: hoo Synchronized php-1.24wmf12/extensions/Wikidata/: Fix a UI issue and two API related flaws (duration: 00m 14s)
* 16:04 godog: restarted pdns in turn on virt1000 and virt0 after opendj ulimit change
* 15:56 hashar: gallium running a rather long du command in a screen. Need a good figure of how much disk space each job consumes
* 15:50 logmsgbot: reedy Finished scap: testwiki to 1.24wmf13 and build l10n cache (duration: 32m 09s)
* 15:18 logmsgbot: reedy Started scap: testwiki to 1.24wmf13 and build l10n cache
* 15:15 ottomata: reinstalling analytics1026 and analytics1027
* 14:10 godog: ran swift-dispersion-populate on eqiad and esams swift clusters
* 14:04 godog: cycle-restarting swift proxy-server on ms-fe to apply config updates
* 13:09 godog: restart pdns on virt1000
* 12:48 springle: ongoing schema changes: pl_from_namespace gerrit 117373. on terbium, osc_host.sh processes ok to kill in emergency
* 12:43 godog: restart opendj on virt1000 with higher ulimit -n
* 12:29 godog: restarted opendj on virt1000, ran out of fd
* 10:29 godog: restart profiler-to-carbon on tungsten, seemingly cpu spinning
* 09:48 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 13s)
* 09:30 logmsgbot: oblivian gracefulled all apaches
* 09:23 _joe_: doing a tagged run to sync apache config
* 09:07 hashar: gallium err was July 5th and file was from a minute ago ... ignore me
* 09:06 hashar: gallium deleted /var/lib/puppet/state/agent_catalog_run.lock from July 5th. Was preventing me from running <tt>puppet agent -tv</tt>
* 08:02 logmsgbot: oblivian gracefulled all apaches
* 07:52 _joe_: doing a tagged run of puppet on all appservers to sync apache config
* 06:40 bblack: all normally-ulsfo traffic is back on ulsfo
* 05:53 awight: edit CRM Drupal permissions
* 03:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 10 03:46:36 UTC 2014 (duration 46m 35s)
* 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-10 03:11:48+00:00
* 02:49 mutante: argon,netmon1001, graceful'led apaches
* 02:48 mutante: netmon1001 - DocumentRoot [/etc/apache2/undef] does not exist
* 02:42 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-10 02:41:29+00:00
* 02:38 mutante: argon,calcium,iron,rhenium,bast1001,oxygen,netmon1001 - upgraded SSL
* 01:47 mutante: argon - Ignoring file 'puppet_base_2.7' in directory '/etc/apt/preferences.d/
* 01:41 awight: update crm schema to wmf_civicrm 7020
* 01:40 awight: update civicrm from 108802336e4d5f4aab9a6dbfa0ea434bddae0060 to 15cf86cb109a448f1982da9c91215eec73f28499
* 01:38 mutante: potassium,hydrogen,search1016,nitrogen,analytics1024,chromium - upgrade SSL
* 01:06 bblack: cleared icinga downtimes for ulsfo (we now have some traffic back there)
* 00:50 logmsgbot: mattflaschen Synchronized php-1.24wmf11/extensions/GuidedTour/: GuidedTour cherry-pick to 1.24wmf11 in support of GettingStarted anonymous editor acquisition test (duration: 00m 09s)
* 00:05 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/144938 (duration: 00m 04s)
 
== July 9 ==
* 23:57 logmsgbot: maxsem Finished scap: SWAT, GettingStarted introduced a new message (duration: 26m 31s)
* 23:44 mutante: deleted systemusers group on neon & mw1077 (to check it doesn't break anything)
* 23:31 logmsgbot: maxsem Started scap: SWAT, GettingStarted introduced a new message
* 23:22 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/GettingStarted/: (no message) (duration: 00m 03s)
* 23:17 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/144857/ (duration: 00m 04s)
* 22:42 mark: Enabling PAIX BGP sessions on cr2-ulsfo
* 22:40 mark: Enabling WMF HQ BGP sessions on cr1-ulsfo
* 22:38 mark: Enabling TiNet transit links on cr1-ulsfo
* 22:35 mark: Enabling WMF HQ BGP sessions on cr2-ulsfo
* 22:34 mark: Enabling NTT and HE transit links on cr2-ulsfo
* 22:05 mutante: restarted apache on zirconium for config change
* 20:07 subbu: deployed parsoid 1632288d
* 18:36 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 22s)
* 18:29 logmsgbot: yurik Synchronized php-1.24wmf11/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 24s)
* 17:17 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: eswiki cirrus (duration: 00m 04s)
* 16:51 logmsgbot: csteipp Finished scap: Update CentralAuth for Global Rename (duration: 28m 46s)
* 16:22 logmsgbot: csteipp Started scap: Update CentralAuth for Global Rename
* 16:17 mark: ulsfo is now offline
* 16:16 mark: Shutdown NTT BGP sessions on cr2-ulsfo
* 16:13 mark: Shutdown TiNet BGP sessions on cr1-ulsfo
* 16:10 mark: Shutdown IXP BGP sessions on cr2-ulsfo
* 16:10 mark: Shutdown WMF HQ BGP sessions on cr2-ulsfo
* 16:09 mark: Shutdown WMF HQ BGP sessions on cr1-ulsfo
* 16:02 logmsgbot: hoo Synchronized php-1.24wmf12/extensions/Wikidata/: Update Wikibase to fix a fatal and various JS things (duration: 00m 14s)
* 14:13 hashar: Jenkins: bringing back puppet-compiler02.eqiad.wmflabs node online. /tmp gets filled when running huge catalog compilations which causes Jenkins to unpool the node :/
* 13:30 godog: reboot ms-be1005, raid controller confused (?) after disk replacement
* 12:52 godog: umounted sdg1 on ms-be1005, device disappeared, errors in dmesg
* 12:35 bblack: enabled amssq47 text frontend cache in pybal for esams
* 09:39 hashar: Jenkins had a bit of failure earlier due to the massive configuration update of mediawiki-core and mwext jobs.  If that fails again the best thing is to stop Jenkins on gallium , wait for it to be killed or force kill -9 the java process then restart Jenkins.  Should sort it out
* 09:30 hashar: restarted Zuul to clear out stalled items in queue
* 09:12 hashar: Jenkins being slow because the mediawiki-core* jobs history cache has been wiped out while updating their configuration. Jenkins is busy processing the history :(
* 09:02 hashar: Jenkins killing slave process on lanthanum. Some job is stalled and unrecoverable.
* 08:53 godog: upgrade ms-be1013/1014/1015 (zone5) to icehouse swift
* 08:51 hashar: Jenkins migrating jobs to use $ZUUL_URL instead of git://zuul.eqiad.wmnet  Preparing to scale out Zuul merger to several nodes
* 08:19 godog: upgrade ms-be1009/1010/1011 (zone4)  to swift icehouse
* 08:04 hashar: Jenkins: granted matanya the ability to manually trigger builds. Use case: the puppet compiler!
* 08:02 godog: upgrade ms-be1005/1006/1007 (zone3) to swift icehouse
* 03:37 mutante: ran puppet on neon - false puppet failure alarms
* 02:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul  9 02:54:37 UTC 2014 (duration 54m 36s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-09 02:25:33+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-09 02:14:38+00:00
* 01:26 mutante: Bugzilla - enabled https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security
* 00:50 mutante: restarted gitblit service
 
== July 8 ==
* 22:50 mutante: radon (phab)- package and kernel upgrades, rebooting
* 20:22 legoktm: finished running migrateAccount.php --attachbroken --attachmissing (bug 61876)
* 20:07 legoktm: finished migrateAccount.php --safe, now starting migrateAccount.php --attachbroken
* 20:05 mutante: restarted apache on ytterbium
* 19:47 K4-713: updated payments fraud filters again
* 19:47 legoktm: running migrateAccount.php --safe for accounts only existing on one wiki (bug 39817)
* 19:27 mutante: this should have fixed all the services behind misc. varnish now getting an actual "A" rating on ssllabs
* 19:20 mutante: arr, i meant "nginx", not varnish
* 19:15 mutante: restarting varnish on cp1043/cp1044 (misc cluster)
* 18:55 cmjohnson1: disconnecting serial cable from psw1-c2-eqiad
* 18:50 csteipp: patch for bug66608 deployed to wmf11/12
* 18:50 K4-713: updated fraud filters on payments cluster
* 18:28 logmsgbot: reedy Synchronized robots-private.txt: (no message) (duration: 00m 14s)
* 18:27 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
* 18:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf12
* 15:22 logmsgbot: reedy Purged l10n cache for 1.24wmf10
* 15:21 logmsgbot: reedy Purged l10n cache for 1.24wmf9
* 15:15 logmsgbot: anomie Synchronized php-1.24wmf11/extensions/Scribunto/: SWAT: Fix regression in os.date and os.time at module scope [[gerrit:144559]] (duration: 00m 10s)
* 15:14 logmsgbot: anomie Synchronized php-1.24wmf12/extensions/Scribunto/: SWAT: Fix regression in os.date and os.time at module scope [[gerrit:144511]] (duration: 00m 11s)
* 15:10 logmsgbot: anomie Synchronized php-1.24wmf11/extensions/UploadWizard/UploadWizard.config.php: SWAT: Flickr API is https-only now [[gerrit:144584]] (duration: 00m 10s)
* 15:04 logmsgbot: anomie Synchronized php-1.24wmf12/extensions/UploadWizard/UploadWizard.config.php: SWAT: Flickr API is https-only now [[gerrit:144583]] (duration: 00m 10s)
* 13:34 springle: slow transaction rollback in progress on db1001 librenms. other databases not affected, but librenms writes are timing out
* 13:32 cmjohnson1: replacing disk 6 ms-be1005
* 13:30 cmjohnson1: replacing disk 4 ms-be1007
* 12:38 YuviPanda: disregard previous log message, was meant for labs
* 12:37 YuviPanda: graphite reduced metrics count from 65k to 25k, monitoring io performance
* 06:57 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: raise db traffic samplers to normal load (duration: 00m 06s)
* 05:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1010, warm up (duration: 00m 06s)
* 04:51 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1010 for upgrade (duration: 00m 06s)
* 03:26 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul  8 03:25:51 UTC 2014 (duration 25m 50s)
* 03:00 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-08 02:59:33+00:00
* 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-08 02:29:00+00:00
* 01:17 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1061, warm up (duration: 00m 06s)
* 00:57 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1061 for upgrade (duration: 00m 07s)
* 00:02 ^d: gerrit upgraded to 2.8.1-4-ga1048ce from 2.8.1-2-g724b796, back up. Might be slow for a bit while caches warm.
 
== July 7 ==
* 23:38 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/ParserFunctions/: https://gerrit.wikimedia.org/r/#q,144510,n,z (duration: 00m 03s)
* 23:38 logmsgbot: maxsem Synchronized php-1.24wmf12/includes/StubObject.php: https://gerrit.wikimedia.org/r/#/c/144509/ (duration: 00m 03s)
* 23:22 logmsgbot: maxsem Synchronized visualeditor-default.dblist: (no message) (duration: 00m 03s)
* 23:19 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/GWToolset: (no message) (duration: 00m 03s)
* 23:18 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/GWToolset: (no message) (duration: 00m 03s)
* 23:17 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/GWToolset: (no message) (duration: 00m 04s)
* 23:12 logmsgbot: maxsem Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#q,144476,n,z & https://gerrit.wikimedia.org/r/#q,139569,n,z (duration: 00m 05s)
* 23:04 logmsgbot: maxsem Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#q,144476,n,z & https://gerrit.wikimedia.org/r/#q,139569,n,z (duration: 00m 03s)
* 22:41 logmsgbot: ori Synchronized wmf-config/mc.php: I8b66e9339: Make app servers connect to nutcracker on port 11212 (duration: 00m 03s)
* 20:31 logmsgbot: ori Synchronized wmf-config/mc.php: Iea24b092b: Make mw1041 connect to nutcracker on port 11212 (duration: 00m 09s)
* 20:03 subbu: deployed Parsoid 8ef7b6fe
* 17:52 legoktm: deleted rows in centralauth's localnames and localuser tables for bug 67548
* 17:02 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Cirrus on commons as primary (duration: 00m 04s)
* 16:34 logmsgbot: aude Finished scap: Update Wikidata to mw1.24-wmf12 branch for group0 wikis (duration: 22m 33s)
* 16:22 manybubbles: (Cirrus) load tested commons and eswiki over the last hour - both look fine.
* 16:11 logmsgbot: aude Started scap: Update Wikidata to mw1.24-wmf12 branch for group0 wikis
* 15:49 bd808: Logstash event volume looks better after restart. Probably related to bug 63490.
* 15:32 bd808: Restarted logstash on logstash1001 because log volume looked lower than I thought it should be.
* 15:16 cmjohnson1: reseating PEM2 cr1-eqiad
* 15:08 godog: powercycled ms-be1007, unresponsive on console and remnants of a stack trace
* 14:49 manybubbles: (Cirrus) Applying cache warmer configuration that went out last Thursday to all wikipedias.
* 12:11 hashar: Jenkins job builder e1ddd23 fails for us :/  Moving back to parent commit
* 12:09 hashar: Updated our Jenkins job builder fork  0972985..e1ddd23
* 09:40 godog: upgrade ms-be1003/1004/1012 (zone2) to swift icehouse
* 09:16 _joe_: restarting rhenium, pings but no ssh for 2 days, serial console is blank and unresponsive
* 09:15 godog: upgrade ms-be1002/1008 (zone1) to swift icehouse
* 02:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul  7 02:52:10 UTC 2014 (duration 52m 9s)
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-07 02:23:48+00:00
* 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-07 02:12:44+00:00
 
== July 6 ==
* 02:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul  6 02:49:21 UTC 2014 (duration 49m 20s)
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-06 02:24:08+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-06 02:13:07+00:00
 
== July 5 ==
* 02:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul  5 02:52:04 UTC 2014 (duration 52m 3s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-05 02:26:08+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-05 02:15:05+00:00
* 01:22 springle: ongoing osc_host.sh schema change jobs on terbium. fine to kill in an emergency
 
== July 4 ==
* 20:05 hoo: Ran sync-common on fenari to update the docs on noc.wikimedia.org
* 15:40 _joe_: restarting salt-minion, killing io hungry job on fenari running since jun 30, 00 AM
* 12:28 akosiaris: executed dist-upgrade on virt1000. Keystone configure phase failed in keystone-manage db_sync and hence dpkg configure failed. It was trying to create an already existing index in the database. Dropped the index, ran dpkg --configure -a to recreate the index (and whatever else keystone-manage db_sync does). All is back to normal.
* 03:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul  4 03:28:29 UTC 2014 (duration 28m 28s)
* 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-04 03:02:49+00:00
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-04 02:32:29+00:00
* 00:28 gwicke: deployed parsoid config change e21a534 to support VE on the OTRS wiki
 
== July 3 ==
* 23:40 mutante: osmium - libboost-dev : Depends: libboost1.54-dev but it is not going to be installed
* 23:33 mutante: rhenium (pmacct / flow) Out of memory: Kill process 3123 (pmacctd) score 1 or sacrifice child
* 23:22 K4-713: updated payments to c5689f385b2f0a7bdc55c5752010e9eb
* 23:17 logmsgbot: mwalker Synchronized php-1.24wmf12/extensions/VisualEditor/: Updating VisualEditor for {{gerrit|144081}} (duration: 00m 12s)
* 21:07 logmsgbot: oblivian gracefulled all apaches
* 20:45 mutante: deleted analytics/kraken branch from ops/puppet via gerrit ui, ack'ed by ottomata
* 20:12 bd808|deploy: Updated scap to ff04431 (restart-nutcracker script)
* 19:53 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 14s)
* 19:48 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 19:48 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 30s)
* 19:47 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 20s)
* 19:36 jgage: rebooting analytics1012 for bios change: cpufreq governor
* 19:27 ottomata: disabling puppet on hadoop related analytics nodes, preparing for reinstall
* 19:21 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 19:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf12
* 19:12 logmsgbot: reedy Synchronized php-1.24wmf11/languages/Language.php: I039547b867b2eab47692dcc018c95b89975bc65d (duration: 00m 40s)
* 18:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedia to 1.24wmf11
* 18:41 logmsgbot: reedy Finished scap: testwiki to 1.24wmf12 and build l10n cache (duration: 30m 47s)
* 18:18 ottomata: doing rolling restarts of zookeeper servers and kafka brokers to load up new zk timeout changes
* 18:10 logmsgbot: reedy Started scap: testwiki to 1.24wmf12 and build l10n cache
* 18:10 logmsgbot: reedy scap aborted: testwiki to 1.24wmf12 and build l10n cache (duration: 27m 26s)
* 17:53 godog: reloading librenms, semi-broke it with a syslog search (again)
* 17:46 godog: reloading librenms, semi-broke it with a syslog search
* 17:42 logmsgbot: reedy Started scap: testwiki to 1.24wmf12 and build l10n cache
* 16:38 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/EventLogging/: bug 67420 (duration: 00m 35s)
* 16:34 paravoid: apt: uploading nutcracker backport for precise
* 08:07 hashar: Jenkins restarted
* 08:00 hashar: upgrading Jenkins (minor version bump 1.554.2 -> 1.554.3)
* 03:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul  3 03:38:46 UTC 2014 (duration 38m 45s)
* 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-03 03:02:15+00:00
* 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-07-03 02:31:35+00:00
 
== July 2 ==
* 23:06 logmsgbot: maxsem Synchronized wmf-config/: (no message) (duration: 00m 07s)
* 22:37 jgage: rebooting analytics1021 to change bios "system profile" from PPW (OS) to PPW (DAPC)
* 22:19 logmsgbot: ebernhardson Finished scap: (no message) (duration: 36m 25s)
* 22:16 jgage: rebooting analytics1022 to check bios cpufreq setting
* 21:43 logmsgbot: ebernhardson Started scap: (no message)
* 21:42 logmsgbot: ebernhardson Synchronized php-1.24wmf10/extensions/Mantle/: Sync new Mantle extension in 1.24wmf10 (duration: 00m 20s)
* 21:40 robh: blog updated to newest release, no downtime
* 21:38 jgage: rebooting analytics1021 to check bios cpufreq setting
* 20:56 paravoid: pfw1-eqiad: s/mchenry/lead/; all smtp_out rules have [ polonium lead ] as destination-address now
* 20:49 paravoid: switching non-wikimedia.org MX to polonium/lead (from polonium/mchenry)
* 20:16 cscott: updated Parsoid to version 6afcb8df
* 19:08 logmsgbot: yurik Synchronized php-1.24wmf11/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal - bug fix (duration: 01m 40s)
* 19:00 logmsgbot: yurik Synchronized php-1.24wmf10/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal - bug fix (duration: 01m 15s)
* 18:42 logmsgbot: yurik Synchronized php-1.24wmf10/extensions/: Reverting previous update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 20s)
* 18:37 logmsgbot: yurik Synchronized php-1.24wmf10/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 55s)
* 18:33 logmsgbot: yurik Synchronized php-1.24wmf11/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 38s)
* 18:33 paravoid: reprepro include, trusty-wikimedia (main/universe): nutcracker, libicu 4.8, libzip 0.11, hhvm, {php,hhvm}-wikidiff2, {php,hhvm}-fss, {php,hhvm}-luasandbox, ffmpeg2theora
* 18:28 yurikR2: yurik ^ was a noop - comment fix
* 18:28 logmsgbot: yurik Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 04s)
* 18:26 logmsgbot: yurik Synchronized docroot/bits/WikipediaMobileFirefoxOS/: (no message) (duration: 01m 03s)
* 16:30 mutante: upgrading jenkins to jenkins_1.554.3_all.deb on the apt repo
* 15:19 manybubbles: done with SWAT for real this time
* 15:17 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 28s)
* 15:16 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 18s)
* 15:08 manybubbles: *SWAT* complete
* 15:07 manybubbles: swap complete - logged off of tin
* 15:04 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 05s)
* 15:04 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
* 15:00 logmsgbot: manybubbles Synchronized wmf-config: SWAT Remove two permissions from some editors on ruwiki (duration: 00m 07s)
* 13:10 hashar: Jenkins being busy deleting history files
* 13:02 hashar: Jenkins: dropping history of puppet related jobs after 90 days. {{gerrit|136992}}
* 12:18 akosiaris: upgraded PHP5 to 5.3.10-1ubuntu3.12+wmf1 on deployment-apache01 and deployment-apache02 (beta)
* 12:09 akosiaris: upgraded PHP5 to 5.3.10-1ubuntu3.12+wmf1 on test.wikipedia.org
* 11:05 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: additional upload domain for Erasmus University {{gerrit|143593}} {{bug|67355}} (duration: 00m 06s)
* 08:00 godog: upgrading ms-be1001 to swift icehouse
* 07:45 godog: umounted (empty and broken) sdk1 from ms-be3003 and wipe its first sectors, no more remounts
* 03:00 paravoid: rebooting lead
* 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul  2 02:56:33 UTC 2014 (duration 56m 32s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-02 02:26:24+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-07-02 02:14:53+00:00
 
== July 1 ==
* 23:49 K4-713: updated payments cluster to c5689f385b2f0a7
* 23:43 robh: any francium errors can be ignored, as the software doesn't fully deploy from puppet and it's not in service
* 23:36 logmsgbot: maxsem Synchronized wmf-config/FeaturedFeedsWMF.php: https://gerrit.wikimedia.org/r/#/c/136316/ now for realz (duration: 00m 04s)
* 23:34 logmsgbot: maxsem Synchronized wmf-config/FeaturedFeedsWMF.php: https://gerrit.wikimedia.org/r/#/c/136316/ (duration: 00m 04s)
* 23:09 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/143473/ (duration: 00m 05s)
* 23:05 logmsgbot: maxsem Synchronized php-1.24wmf11/resources/: https://gerrit.wikimedia.org/r/#/c/142975/ (duration: 00m 05s)
* 23:04 logmsgbot: maxsem Synchronized php-1.24wmf10/resources/: https://gerrit.wikimedia.org/r/#/c/142975/ (duration: 00m 19s)
* 21:39 hoo: Set email for re-renamed dewiki account "Kolimak". Email and password got lost during a screwed rename.
* 20:36 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/WikimediaMessages: bug 67387 (duration: 00m 15s)
* 20:31 mutante: restarting apache on mw1217
* 20:27 manybubbles: Adding cache warmers to all Cirrus indexes for group1 wikis with more than one shard except commons (commons is busy, it'll have to wait:)
* 19:53 logmsgbot: aude Synchronized wmf-config/Wikibase.php: adjust property suggester setting for wikidata (duration: 00m 11s)
* 19:14 logmsgbot: ori Synchronized php-1.24wmf10/resources/src/jquery.ui-themes/vector/jquery.ui.core.css: Ib09928248: vector/jquery.ui.core.css: Update rule for .ui-helper-hidden-accessible (bug 67243) (duration: 00m 05s)
* 19:14 logmsgbot: ori Synchronized php-1.24wmf11/resources/src/jquery.ui-themes/vector/jquery.ui.core.css: Ib09928248: vector/jquery.ui.core.css: Update rule for .ui-helper-hidden-accessible (bug 67243) (duration: 00m 06s)
* 18:42 andrewbogott: adding virt1008 to labs compute pool
* 18:41 andrewbogott: switching puppet canary from virt1008 to virt1009
* 18:38 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable property suggester on Wikidata (duration: 00m 10s)
* 18:38 logmsgbot: aude Synchronized wmf-config/Wikibase.php: (no message) (duration: 00m 15s)
* 18:30 logmsgbot: aaron Synchronized wmf-config/PrivateSettings.php: removed obsolete swift tampa config (duration: 00m 07s)
* 18:15 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 18s)
* 18:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf11
* 17:54 logmsgbot: demon Synchronized php-1.24wmf10/extensions/Elastica: Updating to master, fixes fatal error (duration: 00m 07s)
* 17:45 manybubbles: rebuilding cirrus index for commons to put it into fewer shards - it should be faster this way
* 17:24 mutante: antimony: git.wikimedia.org]: Ensure set to :present but file type is link so no content will be synced
* 17:24 logmsgbot: hoo Synchronized wmf-config/: Typos typos typso (duration: 00m 08s)
* 17:21 mutante: restarting apache on antimony
* 17:21 mutante: fixing svn.wikimedia.org apache site manually
* 17:08 springle: restarted mysqld on db1046 m2 slave
* 17:03 logmsgbot: demon Synchronized cirrus.dblist: Move remaining pool 4 lsearchd wikis (except commons) to Cirrus (duration: 00m 07s)
* 15:09 manybubbles: done with SWAT deploy
* 15:06 logmsgbot: manybubbles Synchronized php-1.24wmf11/extensions/CirrusSearch/: SWAT code to set up cache warmers (duration: 00m 05s)
* 15:04 logmsgbot: manybubbles Synchronized wmf-config: SWAT - cirrus settings - cache warmers and shard counts (duration: 00m 06s)
* 15:04 ottomata: temporarily disabling puppet on hafnium to test an eventlogging alert
* 14:27 hashar: Stopping Jenkins it has some corrupted threads
* 13:16 Jeff_Green: dist-upgrade and reboot tellurium
* 13:08 Jeff_Green: dist-upgrade and reboot boron
* 12:23 logmsgbot: reedy Synchronized multiversion/: (no message) (duration: 00m 23s)
* 12:22 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 18s)
* 12:16 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 20s)
* 12:03 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: (no message) (duration: 00m 13s)
* 12:00 Reedy: Manually created Echo tables on extension1
* 11:55 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 13s)
* 11:55 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 28s)
* 11:53 Reedy: Manually created wikimania2015wiki database on 10.64.16.18
* 11:48 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
* 11:48 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 14s)
* 11:47 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 16s)
* 10:49 _joe_: nginx restarted on all ulsfo hosts as well, we should be PFS-enabled now
* 10:38 _joe_: esams restart finished, moving to ulsfo
* 10:30 _joe_: all eqiad SSL terminators are now PFS enabled. Moving to rolling restarting esams
* 10:09 _joe_: restarting nginx on ssl100* servers in sequence, to activate PFS
* 08:47 godog: ms-be3003 sdk1 disk to 0 weight
* 07:22 legoktm: finished running checkLocalNames.php and checkLocalUser.php for some wikivoyages to clean up bug 66535
* 07:16 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1067 (duration: 00m 12s)
* 07:06 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1067 during schema changes (duration: 00m 06s)
* 06:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1063 (duration: 00m 06s)
* 06:40 legoktm: starting to run checkLocalNames.php and checkLocalUser.php for some wikivoyages to clean up bug 66535
* 06:34 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1063 during schema changes (duration: 00m 06s)
* 06:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1060 (duration: 00m 06s)
* 06:29 legoktm: ran fixInvalidStudent.php --wiki=enwiki --courseId=359 for bug 66624
* 06:13 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1060 during schema changes (duration: 00m 07s)
* 02:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul  1 02:50:05 UTC 2014 (duration 50m 4s)
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-01 02:23:49+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-07-01 02:14:11+00:00
 
== June 30 ==
* 23:03 awight: update tools from e894f1f77674b6b101ae0e1644e363ca52e319d9 to d605bdc2aaaef2d4b296a4d9567ed2831db86756
* 23:02 logmsgbot: ori Synchronized wmf-config: Iba41a37a1: Keep thumbnail guessing enabled (duration: 00m 05s)
* 22:14 mutante: re-enabled puppet on caesium
* 21:43 mutante: disabling puppet on caesium
* 21:35 Reedy: running mwscript updateSpecialPages.php --wiki=enwiki --only=Mostlinkedtemplates --override on terbium
* 21:25 mutante: fixing releases.wikimedia.org Apache site, delete sites-enabled file broken by puppet, add symlink, graceful
* 21:00 subbu: deployed parsoid 0b365d516
* 19:44 _joe_: restarting pybal on lvs1005
* 19:16 awight: updated payments from a04e536b6923f2228bb7f5fbf2caeed64a888742 to 2b6c527617dcde154cc298dd9697c9d57c9f3620
* 18:41 awight: updated payments from a8138fefd940ba41812e5c07ca6bc74b63cb9bcf to a04e536b6923f2228bb7f5fbf2caeed64a888742
* 17:38 manybubbles: Cirrus reindex update!  all wikipedias finished their in place reindex except ruwiki - that one is running now.  all group1 wikis finished their from mediawiki reindex except commons and mgwiktionary which are running now.  started from mediawiki reindex of all wikipedias except for enwiki, itwiki, and cawiki which are already long done.
* 17:12 logmsgbot: manybubbles Synchronized cirrus.dblist: Enabled CirrusSearch as the default search backend on 30 more wikis - take five (duration: 00m 04s)
* 17:08 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis - take four (duration: 00m 04s)
* 17:08 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis - for real for real (duration: 00m 04s)
* 17:07 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis - for real (duration: 00m 04s)
* 17:05 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis (duration: 00m 05s)
* 15:43 logmsgbot: manybubbles Synchronized php-1.24wmf11/extensions/Wikidata/: (no message) (duration: 00m 09s)
* 15:35 logmsgbot: manybubbles Synchronized php-1.24wmf11/extensions/VisualEditor/: SWAT Correctly VisualEditor - update full size in MediaSizeWidget (duration: 00m 07s)
* 15:26 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - disable local uploads on Malay Wiktionary (duration: 00m 04s)
* 15:23 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - remove completed mediaviewer surveys (duration: 00m 04s)
* 15:19 _joe_: restarted profiler-to-carbon, stuck since _9_ days, will see that my patch gets deployed.
* 15:15 logmsgbot: manybubbles Synchronized php-1.24wmf10/extensions/ProofreadPage: SWAT - fix ProofreadPage number of pages (duration: 00m 09s)
* 14:48 godog: installed new swift ring on esams, decrease ms-be3003/sdk1 weight
* 14:41 hoo: Cleared out a watchlist with 126652 entries on warwiki to resolve https://bugzilla.wikimedia.org/show_bug.cgi?id=67123
* 13:31 godog: upgrade ms-fe300[12] to swift icehouse
* 10:20 hashar: restarting zuul after a puppet change for /etc/zuul/zuul.conf
* 07:53 godog: upgrading ms-be300[2-4] to swift icehouse
* 02:49 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 30 02:48:28 UTC 2014 (duration 48m 27s)
* 02:28 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1061, warm up (duration: 00m 07s)
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-30 02:23:56+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-30 02:14:23+00:00
* 01:41 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1061 during schema changes (duration: 00m 07s)
 
== June 29 ==
* 22:26 hoo: Manually cleared a watchlist on shwikt with 819846 entries, see https://bugzilla.wikimedia.org/show_bug.cgi?id=67123#c7
* 22:10 hoo: Manually cleared a watchlist with 289436 entries, see https://bugzilla.wikimedia.org/show_bug.cgi?id=67123#c5
* 16:44 hoo: Jenkins/ Zuul not reacting for at least half an hour now
* 16:43 awight: update tools from 3a35482ab1fede2ccfcc49a64ec661b0cb013b81 to e894f1f77674b6b101ae0e1644e363ca52e319d9
* 16:09 awight: updated payments from 6d74002f2634f41f7038daa7357ff6de55ee4880 to a8138fefd940ba41812e5c07ca6bc74b63cb9bcf
* 02:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 29 02:44:35 UTC 2014 (duration 44m 34s)
* 02:22 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-29 02:21:01+00:00
 
== June 28 ==
* 17:16 ori: restarted lucene on search1016 per _joe_
* 12:58 manybubbles: Cirrus reindex status: enwiki has almost finished its in place reindex, alphabetical wikipedias are at frwiki, all group1 wikis have finished their in place reindex.  all group1 wikis are running from mediawiki reindex.  itwiki and cawiki both finished both the in place and from mediawiki reindex.  Haven't started alphabetical from mediawiki reindex yet for wikipedias.  that is the only
* 10:40 _joe_: restarting lucene on search1015, stuck. again.
* 02:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 28 02:46:49 UTC 2014 (duration 46m 48s)
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-28 02:24:12+00:00
* 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-28 02:15:38+00:00
 
== June 27 ==
* 23:15 awight: deployed payments config
* 22:57 logmsgbot: csteipp Synchronized php-1.24wmf11/extensions/OAuth/frontend/specialpages/SpecialMWOAuth.php: Fix OAuth Logins for wmf11 (duration: 00m 18s)
* 20:57 awight: updated crm from 340c43a15a84a9392ad5ef9fc2782243ff140deb to 17439326ca4488ece843a263fc14859b38cff0e9
* 19:33 hashar: puppet-compiler: removed modules/varnish at root@puppet-compiler02:/opt/wmf/software/compare-puppet-catalogs/external/puppet and reset repo.
* 19:07 awight: update crm from e2fe03a9cd51e30206d9a1114d62dfbd6960816b to 340c43a15a84a9392ad5ef9fc2782243ff140deb
* 18:57 logmsgbot: aaron Synchronized wmf-config/PoolCounterSettings-eqiad.php: Pre-set FileRenderExpensive config
* 18:34 bblack: updated puppet repo on virt0
* 18:11 mutante: osmium -  hhvm : Depends: libdouble-conversion1 but it is not going to be installed
* 16:49 bblack: updated carbon repo varnish pkg to 3.0.5plus~x-wm6
* 14:18 hashar: Updated our Jenkins Job Builder fork: e9db73d..0972985
* 03:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 27 03:30:00 UTC 2014 (duration 29m 59s)
* 03:06 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-27 03:05:31+00:00
* 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-27 02:35:20+00:00
 
== June 26 ==
* 23:32 manybubbles: Cirrus rebuild progress - started large/high cirrus visibility wikis in group2 - enwiki, cawiki, and itwiki.
* 23:31 manybubbles: Cirrus rebuild progress - alphabetical wikis in group2 are 2/3 of the way done with reindex - from mediawiki rebuild is maybe 20% done there
* 23:31 manybubbles: Cirrus rebuild progress - big wikis in group1 are finished with in place reindex and well into from mediawiki rebuild.
* 23:27 ori: Previous scap included I2cfcfaf06 as well
* 23:23 logmsgbot: ori Finished scap: CirrusSearch updates: Iefe340729, Ie12418e54, Ie21fb352 (duration: 04m 59s)
* 23:18 logmsgbot: ori Started scap: CirrusSearch updates: Iefe340729, Ie12418e54, Ie21fb352
* 23:07 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Ie96265c4f: Add an Erasmus University domain to whitelist (duration: 00m 05s)
* 23:07 logmsgbot: ori updated /a/common to {{Gerrit|Ie96265c4f}}: Add an Erasmus University domain to whitelist
* 21:51 hashar: Zuul/Jenkins back up and operational.
* 21:43 hashar: hardkilled Zuul :-(  6 events lost.
* 21:38 hashar: restarting Zuul it has a bunch of stalled changes
* 21:32 bblack: enabled cp301[78] frontends in pybal
* 21:27 hashar: restarting Jenkins
* 21:26 hashar: Zuul/Jenkins stalled apparently
* 20:59 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable property suggester on testwikidata (duration: 00m 07s)
* 20:58 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/WikimediaMessages/: (no message) (duration: 00m 15s)
* 20:57 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/OAuth/: (no message) (duration: 00m 15s)
* 20:48 logmsgbot: aude Finished scap: Update Wikidata, for enabling property suggester on testwikidata (duration: 31m 57s)
* 20:16 logmsgbot: aude Started scap: Update Wikidata, for enabling property suggester on testwikidata
* 19:18 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 19:14 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/OAuth/: (no message) (duration: 00m 14s)
* 19:06 RobH: blog is back online after a number of reboots due to raid rebuild issues
* 18:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf11
* 18:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf10
* 18:15 logmsgbot: reedy Synchronized php-1.24wmf10/includes/api/ApiQueryRecentChanges.php: Id9c316733896a27ce3f6c3e0e5efdf62f7d1ff1b (duration: 00m 14s)
* 18:08 ottomata: starting new elasticsearch nodes 1017,1018,1019
* 18:04 RobH: aware of holmium issue (old varnish), in process of repair, blog is down
* 17:05 logmsgbot: reedy Synchronized php-1.24wmf11/resources/Resources.php: I1237909d7e058137d55e5de9fa4d64fe1f7f9472 (duration: 00m 14s)
* 17:04 logmsgbot: reedy Finished scap: l10nupdate for 1.24wmf11 for Skins I9395b0e1983122b12bedf003d6398da5ddfd5651 (duration: 16m 35s)
* 16:48 logmsgbot: reedy Started scap: l10nupdate for 1.24wmf11 for Skins I9395b0e1983122b12bedf003d6398da5ddfd5651
* 16:46 logmsgbot: reedy Purged l10n cache for 1.24wmf4
* 16:45 logmsgbot: reedy Purged l10n cache for 1.24wmf5
* 16:45 logmsgbot: reedy Purged l10n cache for 1.24wmf6
* 16:44 logmsgbot: reedy Purged l10n cache for 1.24wmf7
* 16:44 logmsgbot: reedy Purged l10n cache for 1.24wmf8
* 16:32 logmsgbot: reedy Finished scap: testwiki to 1.24wmf11 and build l10n cache (duration: 27m 20s)
* 16:05 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
* 16:01 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.XetXfk5RPi" ' returned non-zero exit status 1 (duration: 00m 18s)
* 16:00 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
* 15:56 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.9SaYNRzegr" ' returned non-zero exit status 1 (duration: 00m 24s)
* 15:55 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
* 15:55 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.EjEynr9oww" ' returned non-zero exit status 1 (duration: 00m 55s)
* 15:54 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
* 15:24 cmjohnson1: shutting down holmium to replace disk
* 14:35 bblack: restarted nova-network on labnet1001
* 14:26 hashar: updated zuul cloner in git repo and deployed zuul ( tag is wmf-deploy-20140626-1 )
* 13:54 godog: remounted (broken) sdk1 on ms-be3003
* 13:32 cmjohnson1: powering down dataset1001 -relocating to 10G rack
* 13:26 logmsgbot: reedy Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/142142/ No-op for labs (duration: 00m 16s)
* 12:55 hashar: Jenkins: updates jobs for extensions (phpunit and qunit) to use the mw-run-update-script.sh instead of update.php . That runs update.php twice, the first time logging sql to a file that can be archived.  {{gerrit|141851}}
* 12:48 mark: Deactivated BGP session to AS13030
* 11:01 hashar: Replacing operations-puppet-validate job with operations-puppet-pplint-HEAD which is faster and can run concurrently on multiple boxes. {{gerrit|142223}}
* 10:52 godog: stopping swift on ms-be3003
* 10:12 godog: upgrading ms-be3001 to swift icehouse
* 06:26 springle: ran operations/software maintain-replicas.pl and fedtables.pl on labsdbs for bug 59683
* 05:54 Tim: on mw1014: reformatted the /tmp partition
* 05:50 Tim: on mw1014: stopped job runner due to bad /tmp
* 04:44 ori: mw1014 is sad, has filesystem issues: "Attempt to read block from filesystem resulted in short read while trying to open /tmp". Puppet can't run. Should be depooled.
* 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 26 03:33:19 UTC 2014 (duration 33m 18s)
* 03:02 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-26 03:01:43+00:00
* 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-26 02:31:50+00:00
 
== June 25 ==
* 23:43 awight: updated crm from f3389daa94e9ad924175bdf0d5bc09c4a26aeb8c to e2fe03a9cd51e30206d9a1114d62dfbd6960816b
* 23:27 logmsgbot: catrope Finished scap: Updating Wikidata and TimedMediaHandler (duration: 04m 24s)
* 23:23 logmsgbot: catrope Started scap: Updating Wikidata and TimedMediaHandler
* 21:22 hashar: puppet fixed on gallium / lanthanum . It was missing a group definition. All fixed! Thanks Chase.
* 20:53 hashar: puppet broken on gallium.wikimedia.org and lanthanum.eqiad.wmnet . That is being looked at.
* 20:34 subbu: deployed parsoid 4ef9d6be
* 19:38 manybubbles: restarted Cirrus scripts after incident - the index rebuilds had to be completely restarted - sanity checking was simply paused
* 18:54 logmsgbot: yurik Synchronized wmf-config/PrivateSettings.php: Removed obsolete ZRMA user/pswd (duration: 01m 06s)
* 18:46 logmsgbot: yurik Finished scap: Removing ZeroRatedMobileAccess ext settings, depl latest JsonConfig/ZeroBanner/Portal (duration: 29m 09s)
* 18:17 logmsgbot: yurik Started scap: Removing ZeroRatedMobileAccess ext settings, depl latest JsonConfig/ZeroBanner/Portal
* 17:41 logmsgbot: demon Synchronized wmf-config/: Cirrus back on for wikis that had it before. Back to square 1 (duration: 00m 04s)
* 17:29 mwalker: updating fundraising tools from 5f3a7316b636c0723ce3fa353186d4041b662872 to cdc4b73bd59d27c8d386b6df629b1c574cfed85f
* 17:06 manybubbles: success!
* 17:06 logmsgbot: manybubbles Synchronized wmf-config/: try to fix cirrus (duration: 00m 04s)
* 16:51 andrewbogott: restarted apache on palladium -- _that_ helped
* 16:49 andrewbogott: it didn't help
* 16:49 andrewbogott: restarting puppetmaster on palladium
* 16:42 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Disable Cirrus everywhere but testwiki (duration: 00m 04s)
* 16:23 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Roll back previous Cirrus deploy (duration: 00m 05s)
* 16:23 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: Roll back previous Cirrus deploy (duration: 00m 04s)
* 16:16 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: I4c54357a: most remaining wikis getting Cirrus as primary (duration: 00m 04s)
* 16:16 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: I4c54357a: most remaining wikis getting Cirrus as primary (duration: 00m 04s)
* 15:27 logmsgbot: manybubbles Synchronized php-1.24wmf9/extensions/Wikidata/: SWAT - fix rtl issue (duration: 00m 10s)
* 15:23 ottomata: reinstalling elastic1017,1018,1019
* 15:20 logmsgbot: manybubbles Synchronized php-1.24wmf10/extensions/Wikidata/: SWAT - fix rtl issue (duration: 00m 12s)
* 14:10 Krinkle: Upgrade npm from v1.4.5 to v1.4.16 on integration-slave1001 and integration-slave1002
* 14:10 Krinkle: Upgraded npm from v1.4.13 to v1.4.16 on integration-slave1003 to fix https://github.com/npm/npm/issues/5472 and repooling
* 13:30 Krinkle: Depooling integration-slave1003 as almost every other -npm build on this node fails due to corrupted ~/.npm cache
* 12:52 manybubbles: cirrus rebuild update: starting from mediawiki reindex step for all alphabetical wikis that have finished so far
* 12:48 manybubbles: cirrus rebuild update: started rebuilding group1's indexes yesterday.  commons and wikidata finished their in place pass and started their from mediawiki pass.  The remaining wikis are running their in place pass in alphabetical order and currently on frwiktionary.
* 12:25 hashar: Upgraded Zuul 9839edb..b7fc126  Brings patchset 20 of Zuul cloner ( https://review.openstack.org/#/c/70373/ )
* 12:02 akosiaris: upgraded etherpad.wikimedia.org to etherpad-lite 1.4.0
* 11:12 paravoid: switching inbound email for wikimedia.org to polonium/mchenry
* 10:35 _joe_: restarted lucene on search1016 as it was stuck there as well, once search1015 is up and running
* 10:06 _joe_: restarted lucene on search1015, it was stuck
* 07:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: incremental LB bump on db1009 and db1021 traffic samplers (duration: 00m 07s)
* 06:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1021 with traffic sampling (duration: 00m 09s)
* 06:01 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1021, db1049 to normal load (duration: 00m 07s)
* 05:05 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1049, warm up (duration: 00m 08s)
* 02:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 25 02:48:53 UTC 2014 (duration 48m 52s)
* 02:39 springle: xtrabackup clone db1005 to db1049
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-25 02:25:57+00:00
* 02:19 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1049 (duration: 00m 11s)
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-25 02:13:27+00:00
* 00:57 chasemp: added dns for wikimania 2015 (gerrit 140186)
 
== June 24 ==
* 23:28 ori: apache-graceful-all was for Ifc9596cc7
* 23:28 logmsgbot: ori gracefulled all apaches
* 23:12 logmsgbot: maxsem Synchronized visualeditor.dblist: https://gerrit.wikimedia.org/r/141702 (duration: 00m 03s)
* 23:11 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MultimediaViewer: (no message) (duration: 00m 05s)
* 23:02 ori: apache graceful done by me for I543efda24, I29b34689e, and I1c269433e
* 23:00 logmsgbot: root gracefulled all apaches
* 20:53 hashar: Jenkins / Zuul deploying experimental pipeline {{gerrit|141827}}
* 20:29 RoanKattouw: Restarting Apache on mw1220, getting lots of "Unable to allocate memory for pool" errors
* 20:29 ottomata: rebooting analytics1021
* 20:25 ottomata: reinitializing varnish topics with replication factor of 3
* 20:02 hashar: updated our Jenkins Job Builder copy  416ee7d..e9db73d
* 19:58 hashar: Upgraded Zuul on gallium.wikimedia.org to install the zuul-cloner of doom. 4f9fd51..9839edb Tagged  wmf-deploy-20140624-1 in our repo.
* 19:39 manybubbles: rebuilding search index for group1 wikis after upgrade today
* 18:27 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
* 18:25 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf10
* 17:52 logmsgbot: manybubbles Synchronized wmf-config: Drop Cirrus indexes to five shards on rebuild and switch all wikis to new highlighter (duration: 00m 04s)
* 17:44 logmsgbot: aaron Synchronized wmf-config/InitialiseSettings.php: Maintenance reports limit incremental increase (duration: 00m 08s)
* 17:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: reduce db1021 load (duration: 00m 10s)
* 17:06 akosiaris: restarted hadoop yarn on analytics1013
* 15:36 bblack: VCL compilation is now in-sync everywhere but bits caches...
* 15:21 logmsgbot: manybubbles Synchronized php-1.24wmf9/extensions/CirrusSearch/: SWAT Stop Cirrus from breaking RandomRootPage (duration: 00m 06s)
* 15:15 logmsgbot: manybubbles Synchronized php-1.24wmf10/extensions/CirrusSearch/: SWAT Stop Cirrus from breaking RandomRootPage (duration: 00m 04s)
* 15:06 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - visual editor config changes and retire some beta features (duration: 00m 04s)
* 15:05 logmsgbot: manybubbles Synchronized visualeditor-default.dblist: SWAT - Enable VisualEditor by default on Wikimania 2014 wiki (duration: 00m 04s)
* 15:05 logmsgbot: manybubbles Synchronized visualeditor.dblist: SWAT - Enable VisualEditor by default on Wikimania 2014 wiki (duration: 00m 06s)
* 15:03 logmsgbot: manybubbles Synchronized php-1.24wmf10/includes/config/GlobalVarConfig.php: SWAT -  GlobalVarConfig should not throw exceptions for null-valued config settings (duration: 00m 05s)
* 14:53 logmsgbot: hoo Synchronized wmf-config/CommonSettings.php: Enable Wikibase property suggester on beta (duration: 00m 07s)
* 14:15 hashar: Jenkins set SMTP server to wiki-mail.wikimedia.org  smtp.pmtpa.wmnet got deleted
* 14:07 hashar: Jenkins is back
* 13:59 Krinkle: Build logs in Jenkins incorrectly render ansi color codes since it was upgraded to 0.4.0. Downgrading to 0.3.1 and restarting Jenkins.
* 09:55 godog: removing old salt master cache on palladium, moved yesterday out of the way
* 06:59 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1049 (duration: 00m 08s)
* 06:23 Nemo_bis: FYI no gerrit mail since yesterday 15 UTC, https://bugzilla.wikimedia.org/show_bug.cgi?id=67018
* 02:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 24 02:47:14 UTC 2014 (duration 47m 13s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-24 02:25:43+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-24 02:13:38+00:00
 
== June 23 ==
* 23:12 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/141102/ (duration: 00m 06s)
* 23:12 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/141102/ (duration: 00m 05s)
* 23:03 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/140897/ (duration: 00m 04s)
* 20:05 subbu: deployed parsoid 392435a2 (deploy sha db94f88c)
* 19:22 hashar: gallium / zuul : deleting /var/lib/zuul/git old Zuul repositories. They have been migrated to /srv/ssd/zuul/git/ ages ago
* 19:20 jgage: ms-be3003 full root partition fixed, swift had written to /srv/swift-storage/sdk1 on the root partition because sdk1 was unmounted
* 17:38 bblack: lvs1005:eth3 was negotiated to 100mbps (???) - disable -> enable on switch fixed it
* 17:36 godog: restarted salt-master on palladium, suspected job cleanup stuck
* 17:04 bd808: Fixed dangling symlink for /etc/apache2/sites-enabled/logstash.wikimedia.org on logstash1001 by deleting symlink and forcing puppet run
* 16:49 godog: added mw1149-52 back to pybal apache
* 16:33 paravoid: switched inbound mail for all non-wikimedia.org domains from mchenry/sodium to polonium/mchenry (~16:00 + <= 1h TTL UTC)
* 15:13 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add a Library of Congress domain to wgCopyUploadsDomains [[gerrit:141308]] (duration: 00m 14s)
* 15:11 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Adjust group rights on ruwiki [[gerrit:140910]] (duration: 00m 14s)
* 15:10 logmsgbot: anomie Synchronized php-1.24wmf9/includes/api/ApiExpandTemplates.php: SWAT: Fix fatal in API action=expandtemplates with Scribunto [[gerrit:141416]] (duration: 00m 15s)
* 15:04 logmsgbot: anomie Synchronized php-1.24wmf10/includes/api/ApiExpandTemplates.php: SWAT: Fix fatal in API action=expandtemplates with Scribunto [[gerrit:141417]] (duration: 00m 14s)
* 14:55 andrewbogott: reenabling puppet on labstore1001, hoping it doesn't break labs
* 14:38 hashar: Further upgraded Zuul up to upstream b8c24ce + our local hacks. Git tag is wmf-deploy-20140623-4
* 14:14 hashar: upgraded Zuul by one commit (it introduces swift support, which is disabled on our setup via a custom hack)
* 13:20 paravoid: switching outbound email to polonium
* 12:17 manybubbles: rebuilding Cirrus index on group0 wikis to pick up changes like results boosting from categories and wikitext search
* 10:37 godog: powering down maerlant, decom-med
* 10:05 godog: hardreset maerlant, stuck on console and no ssh
* 09:40 paravoid: killing sodium's lighttpd compress cache
* 07:21 _joe_: powercycled cp4018, stuck with a blank console
* 02:59 springle: moving lighttpd compressed archives on sodium off / to regain inodes
* 02:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 23 02:45:24 UTC 2014 (duration 45m 23s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-23 02:25:53+00:00
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-23 02:13:53+00:00
* 00:38 legoktm: mail is stuck, lots of mails queued in exim
 
== June 22 ==
* 22:25 _joe_: restarted apache on strontium, passenger crashed (again).
* 21:06 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings-labs.php: For cluster consistency... (duration: 00m 08s)
* 19:24 godog: silenced LVS healthcheck on rendering.svc until 23:23 UTC
* 02:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 22 02:41:30 UTC 2014 (duration 41m 29s)
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-22 02:23:50+00:00
* 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-22 02:12:50+00:00
 
== June 21 ==
* 16:12 _joe_: restarted ms-be1012, see http://paste.debian.net/106247/ for console output
* 02:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 21 02:45:17 UTC 2014 (duration 45m 16s)
* 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-21 02:28:59+00:00
* 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-21 02:17:18+00:00
 
== June 20 ==
* 22:58 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: Load nostalgia from skins rather than extensions when it exists (duration: 00m 04s)
* 20:23 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: wfGetIP removal, code cleanup (duration: 00m 04s)
* 20:22 logmsgbot: demon Synchronized wmf-config/throttle.php: wfGetIP removal, code cleanup (duration: 00m 05s)
* 17:11 godog: expanded palladium's root to avoid filling up, suspected salt-master (RT #7721)
* 16:53 bd808: Ran /usr/local/bin/sync-common on fenari to verify fix for bug 66844. It works!
* 15:16 logmsgbot: reedy Synchronized wmf-config/extension-list-labs: (no message) (duration: 00m 16s)
* 11:00 _joe_: restarted apache on palladium, passenger was dead and filling error logs
* 03:35 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 20 03:34:06 UTC 2014 (duration 34m 5s)
* 03:19 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-20 03:18:36+00:00
* 02:35 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-20 02:34:14+00:00
* 00:06 MaxSem: Running clearMessageBlobs.php
 
== June 19 ==
* 23:52 MaxSem: that was a touch
* 23:51 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MultimediaViewer/: (no message) (duration: 00m 04s)
* 23:38 logmsgbot: maxsem Finished scap: Mark Traceur made me do it! (duration: 15m 14s)
* 23:23 logmsgbot: maxsem Started scap: Mark Traceur made me do it!
* 23:20 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MobileFrontend/: (no message) (duration: 00m 03s)
* 23:19 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
* 23:18 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/CirrusSearch/: (no message) (duration: 00m 03s)
* 23:18 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/VisualEditor/: (no message) (duration: 00m 04s)
* 23:16 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/VisualEditor/: (no message) (duration: 00m 04s)
* 23:14 bd808: Restarted logstash service on logstash1001
* 23:06 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MultimediaViewer/: (no message) (duration: 00m 05s)
* 23:06 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MultimediaViewer/: (no message) (duration: 00m 05s)
* 22:52 bd808: Updated scap to 792a572
* 21:21 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Set wmgMediaViewerBeta to false everywhere (duration: 00m 15s)
* 21:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf10
* 21:07 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf9 take 2
* 19:31 mutante: started mysql on pc1002
* 19:17 MatmaRex: <RobH> powercycled pc1002
* 19:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias back to 1.24wmf8
* 19:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf9
* 19:00 logmsgbot: reedy Finished scap: scap 1.24wmf10 take 2... (duration: 22m 59s)
* 18:37 ori: neon, logstash100x, zirconium, stat1001, netmon1001: replaced sites-enabled symlinks with their targets and forced puppet-run to clean up after Iddc778a28
* 18:37 logmsgbot: reedy Started scap: scap 1.24wmf10 take 2...
* 18:08 logmsgbot: reedy Started scap: testwiki to 1.24wmf10 and build l10n cache
* 17:29 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix the entity selector (duration: 00m 09s)
* 15:51 mutante: powercycling elastic1017 (went down and no console output)
* 15:13 godog: removed old pmtpa swift stats from graphite
* 15:04 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Put testwiki namespaces in the right place [[gerrit:140261]] (duration: 00m 14s)
* 15:04 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Put testwiki namespaces in the right place [[gerrit:140261]] (duration: 00m 15s)
* 15:02 logmsgbot: anomie Synchronized wmf-config/throttle.php: SWAT: Raise account creation limit for Telugu Wikipedia workshop on June 23 [[gerrit:140669]] (duration: 00m 15s)
* 14:30 cmjohnson1: replacing failed disk slot3 es1006
* 13:01 _joe_: re-enable puppet on lvs1003
* 11:26 logmsgbot: reedy Synchronized wmf-config/: touch (duration: 00m 15s)
* 11:25 logmsgbot: reedy Synchronized commonsuploads.dblist: (no message) (duration: 00m 15s)
* 11:00 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
* 10:53 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 15s)
* 10:52 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 15s)
* 10:28 Reedy: manually ran sync-common tin on fenari
* 10:09 logmsgbot: reedy Synchronized docroot/noc: (no message) (duration: 00m 15s)
* 10:07 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
* 10:04 logmsgbot: reedy Synchronized wmf-config/: I248fa7b98a8a0eea943c6643d1bf9c2ed36296b8 (duration: 00m 15s)
* 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 19 03:33:36 UTC 2014 (duration 33m 35s)
* 02:46 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-19 02:45:51+00:00
* 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-19 02:23:42+00:00
 
== June 18 ==
* 23:09 awight: update crm from 26460d6eaec26861661322df8e9f07a8b0519677 to f3389daa94e9ad924175bdf0d5bc09c4a26aeb8c
* 23:05 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/VisualEditor/: https://gerrit.wikimedia.org/r/#/c/140563/ (duration: 00m 03s)
* 23:03 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/140250/ (duration: 00m 04s)
* 22:30 bblack: rebooting lvs1004 + lvs1005
* 22:10 bblack: turning lvs1003 pybal back on
* 21:52 bblack: disable pybal on lvs1003, since 1006 seems to have all its interfaces :P
* 21:34 bblack: rebooting lvs1003 for kernel/bios stuff
* 21:00 bblack: rebooting lvs1006 for kernel/bios stuff
* 20:23 subbu: deployed Parsoid 88a61f81 (deploy repo sha 470a5ef2)
* 17:39 logmsgbot: yurik Synchronized docroot/bits/WikipediaMobileFirefoxOS/: (no message) (duration: 01m 09s)
* 17:35 logmsgbot: yurik Synchronized php-1.24wmf9/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 14s)
* 17:32 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 15s)
* 17:26 logmsgbot: yurik Synchronized docroot/bits/WikipediaMobileFirefoxOS/: (no message) (duration: 01m 04s)
* 17:10 RobH: magnesium back to proper function
* 17:09 RobH: apache2ctl restart on magnesium, racktables wasn't working
* 16:24 bblack: rebooting lvs4001 for kernel + num_queues
* 16:19 bblack: rebooting lvs4002 for kernel + num_queues
* 15:20 bblack: rebooting lvs4003 for kernel / num_queues updates
* 15:17 bblack: rebooting lvs4004 for kernel / num_queues updates
* 15:10 logmsgbot: anomie Synchronized php-1.24wmf9/extensions/Scribunto/engines/LuaCommon/SiteLibrary.php: SWAT: Fix Scribunto-related exceptions on testwiki [[gerrit:140370]] (duration: 00m 14s)
* 13:40 _joe_: restarted profiler-to-carbon, stuck (again) waiting for mwprof
* 13:25 springle: script rt-7708.pl hitting m2-master eventlogging from terbium for RT #7708. fine to kill if necessary
* 10:01 hashar: Updated our Jenkins job builder fork: 8cbc93a..416ee7d
* 08:26 _joe_: disk is gone, powering down ms-be1007, opening ticket for disk replacement
* 08:24 _joe_: stopped swift on ms-be1007, unmounting volume to check for repair
* 06:01 springle: restarted gmetad on nickel while unbreaking the mysql graphs I broke on ganglia
* 04:30 ori: enabled puppet on polonium (was disabled but nothing in SAL)
* 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 18 02:58:22 UTC 2014 (duration 58m 21s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-18 02:25:03+00:00
* 02:23 MaxSem: searchidx1001 outta sync - running sync-common
* 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-18 02:13:34+00:00
* 02:05 Krinkle: Nevermind, graphite.wikimedia.org going down is due to overload which recovers eventually (it just has). Has become SNAFU/FIXME.
* 02:02 Krinkle: graphite.wikimedia.org is down with HTTP 502 Bad Gateway errors
* 01:49 ori: puppet freshness on tungsten and stat1001 can be fixed with https://gerrit.wikimedia.org/r/#/c/140269/
 
== June 17 ==
* 20:19 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/140178/ (duration: 00m 04s)
* 20:17 logmsgbot: maxsem Synchronized php-1.24wmf8/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/140178/ (duration: 00m 05s)
* 20:01 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix editing site links (duration: 00m 24s)
* 18:23 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 16s)
* 18:22 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf9
* 18:05 logmsgbot: demon Synchronized wmf-config/PoolCounterSettings-eqiad.php: Limit regex searches before they start landing on wikis (duration: 00m 04s)
* 16:32 bblack: enabled amssq31-46 esams text frontend varnishes in pybal (were misconfigured; wrong domainname)
* 15:18 logmsgbot: manybubbles Synchronized php-1.24wmf8/extensions/CirrusSearch/: SWAT - Fix Cirrus Special:Random (duration: 00m 04s)
* 15:13 logmsgbot: manybubbles Synchronized php-1.24wmf9/extensions/CirrusSearch/: SWAT - Fix Cirrus Special:Random (duration: 00m 04s)
* 15:02 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - lower event logging rate for mediaviewer (duration: 00m 05s)
* 13:51 _joe_: production puppet masters upgraded to puppet 3
* 07:12 springle: starting updateCollation on s3 frwikinews from tin
* 07:07 logmsgbot: springle Synchronized wmf-config/InitialiseSettings.php: $wgCategoryCollation to uca-fr on frwikinews (duration: 00m 07s)
* 03:20 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 17 03:19:12 UTC 2014 (duration 19m 11s)
* 02:35 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-17 02:34:09+00:00
* 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-17 02:22:46+00:00
 
== June 16 ==
* 23:12 logmsgbot: maxsem Synchronized php-1.24wmf8/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/139562/ (duration: 00m 05s)
* 23:11 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/139562/ (duration: 00m 06s)
* 23:05 logmsgbot: maxsem Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/139888/ (duration: 00m 08s)
* 21:30 ori: upgraded eventlogging to 3012aad
* 20:45 ori: updated eventlogging to b4b42effc6
* 17:36 logmsgbot: csteipp Synchronized php-1.24wmf8/extensions/EducationProgram/includes/api/ApiAddStudents.php: Bug66631 (duration: 00m 05s)
* 17:34 logmsgbot: csteipp Synchronized php-1.24wmf9/extensions/EducationProgram/includes/api/ApiAddStudents.php: (no message) (duration: 00m 05s)
* 15:59 godog: manually ran update-ubuntu-mirror on carbon, successful
* 15:57 awight: updated crm from e52a4eb1bfab622f612dc84f687678fff1fdbc04 to 26460d6eaec26861661322df8e9f07a8b0519677
* 15:30 ottomata: reinstalling analytics1018
* 13:38 twkozlowski: _joe_ also working on recovering the list which was deleted by mistake
* 13:37 _joe_: closed wikimedia-de-by list
* 13:13 _joe_: removing chip-l mailing list as per bug #63877
* 13:03 godog: restarting swift-proxy-server on ms-fe1001 to test statsd metrics
* 10:47 godog: restarting swift-proxy-server on ms-fe3002 to test statsd metrics
* 10:23 hoo: Touched all 1.24wmf8 extension/wikidata files and ran sync-common after that on mw1070
* 10:18 logmsgbot: hoo Synchronized php-1.24wmf8/extensions/Wikidata/: Update Wikidata to fix a suggester bug (duration: 00m 09s)
* 10:16 godog: restarting swift-proxy-server on ms-fe3001 to test statsd metrics
* 10:12 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix a suggester bug (duration: 00m 13s)
* 09:29 apergos: restarted search1015 about 15 mins ago, it's now recovered afaict; restarted search1016, it's doing index setup now
* 03:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 16 02:59:43 UTC 2014 (duration 59m 42s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-16 02:26:05+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-16 02:14:38+00:00
 
== June 15 ==
* 17:44 logmsgbot: hoo Synchronized php-1.24wmf8/extensions/Wikidata/: Touched various JavaScripts (duration: 00m 09s)
* 14:26 Reedy: Job runners were restarted on tmh100[12] and are now processing jobs
* 14:15 godog: extended palladium root partition by +20G
* 13:50 _joe|away: restarted mw-job-runner on tmh1001
* 10:02 paravoid: nuked ms-be1001 sdj with zeros, reformatting and placing into production again
* 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 15 02:58:21 UTC 2014 (duration 58m 20s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-15 02:26:03+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-15 02:14:46+00:00
 
== June 14 ==
* 22:27 bawolff: video scalers seem to have stopped doing webVideoTranscode jobs
* 20:24 legoktm: ran "delete from ep_students where student_user_id =0 limit 1;" on enwiki for bug 66624
* 20:10 legoktm: ran "delete from ep_users_per_course where upc_user_id=0 limit 1" on enwiki for bug 66624
* 19:19 paravoid: unmounting ms-be1001's sdj1, corrupted filesystem
* 18:46 paravoid: rebooting ms-be1001, XFS: Internal error XFS_WANT_CORRUPTED_RETURN, lots of processes in D
* 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 14 03:07:14 UTC 2014 (duration 7m 13s)
* 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-14 02:36:35+00:00
* 02:36 bblack: enabled amssq43-46 frontends (esams text varnish) in pybal
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-14 02:16:38+00:00
* 00:46 bblack: enabled amssq39-42 frontends (esams text varnish) in pybal
 
== June 13 ==
* 22:01 manybubbles: logstash1002 seems to be properly restoring nodes to itself.  I'll monitor it for the next few minutes but I believe my work here is done.
* 21:55 manybubbles: bouncing logstash1002 because it seems stuck.  not sure why.  no useful logs.
* 21:07 bblack: turned on amssq35-38 text frontends in esams (in pybal)
* 20:57 awight: update crm from c38296add61421f87e12cb5b4f3dd68bdf2340db to e52a4eb1bfab622f612dc84f687678fff1fdbc04
* 20:23 bblack: turned on amssq31-34 text frontends in esams
* 18:41 mutante: DNS update - removing manutius' public IP
* 18:31 mutante: shutting down manutius, decom
* 18:22 logmsgbot: ori Synchronized php-1.24wmf9/extensions/Math: I498053de4: Fix the VisualEditor parts of Math-wmf9 with a working cherry pick of I7d5e1174 (duration: 00m 08s)
* 16:55 logmsgbot: hoo Synchronized php-1.24wmf8/extensions/Wikidata/: Update Wikidata to fix JavaScript issues (duration: 00m 09s)
* 16:45 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix JavaScript issues (duration: 00m 10s)
* 16:31 Reedy: Finished creating mathoid tables on all wikis
* 16:26 Reedy: Creating mathoid tables on all wikis
* 16:11 mutante: manutius - decom, delete salt key, puppet cert, stopped services...
* 15:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 13 15:16:09 UTC 2014 (duration 53m 11s)
* 14:59 logmsgbot: reedy Synchronized wmf-config/: Disable MW_MATH_SOURCE for now (duration: 00m 15s)
* 14:46 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-13 14:45:40+00:00
* 14:36 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-13 14:35:41+00:00
* 13:00 bblack: moved ge-3/0/0 - 3/0/15 from public to private vlan on cs2-esams (amssq31-46)
* 10:02 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1049 (duration: 00m 12s)
* 09:56 paravoid: deactivating eqiad<->HE, excessive packet loss/latency
* 09:33 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1071 (duration: 00m 07s)
* 08:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1070, depool db1071 (duration: 00m 12s)
* 07:48 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1066, depool db1070 (duration: 00m 07s)
* 07:19 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1065, depool db1066 (duration: 00m 13s)
* 06:51 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1062, depool db1065 (duration: 00m 09s)
* 06:09 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062 (duration: 00m 12s)
* 05:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1051 (duration: 00m 14s)
* 03:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 13 03:53:17 UTC 2014 (duration 53m 16s)
* 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-13 03:11:28+00:00
* 02:35 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-13 02:34:41+00:00
* 00:45 logmsgbot: ori Synchronized php-1.24wmf8/extensions/Math: Reverting Extension:Math to 1bb3bfa3b5656 (duration: 00m 05s)
* 00:44 logmsgbot: ori Synchronized php-1.24wmf9/extensions/Math: Reverting Extension:Math to 1bb3bfa3b5656 (duration: 00m 06s)
* 00:41 ori: removed Physikerwelt and Frédéric Wang from extension-Math group in Gerrit pending further inquiry into recent changes
* 00:38 logmsgbot: ori Finished scap: fix any lingering inconsistencies in the state of the app servers (see https://gerrit.wikimedia.org/r/139089) (duration: 26m 59s)
* 00:11 logmsgbot: ori Started scap: fix any lingering inconsistencies in the state of the app servers (see https://gerrit.wikimedia.org/r/139089)
 
== June 12 ==
* 23:35 logmsgbot: ori Synchronized php-1.24wmf8/extensions/MobileFrontend: Re-syncing after submodule update (duration: 00m 06s)
* 23:34 ori: ran sync-common on mw1151
* 23:17 logmsgbot: catrope Synchronized php-1.24wmf9/extensions/MobileFrontend: (no message) (duration: 00m 04s)
* 23:17 logmsgbot: catrope Synchronized php-1.24wmf8/extensions/MobileFrontend: (no message) (duration: 00m 05s)
* 23:17 logmsgbot: catrope Synchronized php-1.24wmf8/extensions/VisualEditor: (no message) (duration: 00m 04s)
* 23:07 Krinkle: integration-slave1003 is failing npm-test builds due to a cache corruption (filed as https://github.com/npm/npm/issues/5472). Manually cleared /mnt/home/jenkins-deploy/.npm/async on integration-slave1003.eqiad.wmflabs for now.
* 23:05 MaxSem: Purging PageImages data from Wikibooks and Wikisource
* 22:59 logmsgbot: catrope Synchronized wmf-config/: (no message) (duration: 00m 04s)
* 22:46 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: disable MW_MATH_MATHML until mathoid table is created (BUG 66492) (duration: 00m 04s)
* 22:31 logmsgbot: ori Synchronized php-1.24wmf8/extensions/WikimediaEvents: Update WikimediaEvents for Ibd36da416 (duration: 00m 03s)
* 22:30 logmsgbot: ori Synchronized php-1.24wmf9/extensions/WikimediaEvents: Update WikimediaEvents for Ibd36da416 (duration: 00m 03s)
* 21:11 logmsgbot: yurik Synchronized php-1.24wmf9/extensions/JsonConfig/: JsonConfig ext update, fixing bug 66555 (duration: 01m 03s)
* 21:10 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/JsonConfig/: JsonConfig ext update, fixing bug 66555 (duration: 01m 04s)
* 19:25 ottomata: stopping puppet on an18
* 19:19 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf9
* 19:19 ottomata: starting hadoop decom of analytics1018.  This node will become a Kafka broker
* 19:04 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf8
* 19:04 MaxSem: Dropping old GeoData tables from everywhere
* 18:52 logmsgbot: reedy Finished scap: 1.24wmf9 staging take 2... (duration: 15m 20s)
* 18:37 logmsgbot: reedy Started scap: 1.24wmf9 staging take 2...
* 18:06 logmsgbot: reedy Started scap: testwiki to 1.24wmf9 and build l10n cache
* 17:49 ottomata: disabling puppet on analytics1012 and analytics1022
* 17:48 ottomata: starting some kafka failure tests, I have scheduled downtime for some service checks in icinga, hopefully this will not be noisy
* 17:41 ottomata: restarting elasticsearch on logstash servers
* 17:34 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Enabling new zero ext on all wikis (duration: 01m 03s)
* 17:22 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Attempting to enable new zero ext on zerowiki & ruwiki - take3 (duration: 01m 04s)
* 17:06 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/: (no message) (duration: 01m 12s)
* 17:05 greg-g: yurik's blank sync message could have been: Deploying new JsonConfig,ZeroBanner,ZeroPortal extensions (refactoring ZeroRatedMobileAccess ext)
* 17:04 logmsgbot: yurik Synchronized php-1.24wmf7/extensions/: (no message) (duration: 01m 15s)
* 15:31 logmsgbot: manybubbles Synchronized wmf-config/throttle.php: SWAT: Raise account creation limit for eswiki outreach event (duration: 00m 05s)
* 13:39 bblack: enabling cp301[34] esams mobile frontends in pybal
* 11:18 hashar: Gerrit: created mediawiki/services/cxserver/deploy repository for Nikerabbit and kart_
* 05:52 paravoid: cr1-esams/cr2-knams: dismantling amslvs BGP peerings
* 05:46 paravoid: amslvs[1234]: stopping pybal
* 03:40 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 12 03:39:07 UTC 2014 (duration 39m 6s)
* 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-12 03:02:09+00:00
* 02:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1051 (duration: 01m 08s)
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-12 02:32:07+00:00
* 01:47 ori: graceful'd appservers for I0e66ee0a1: 2.4 compat: load mod_filter for AddOutputFilterByType
* 00:44 bblack: ran "puppetca -s palladium.eqiad.wmnet" on palladium to get agent running again, someone borked/regenerated the key there 6 hours ago?
* 00:20 mwalker: clearMessageBlobs.php killed because we fixed the problem in a more different way
* 00:17 logmsgbot: mwalker Synchronized php-1.24wmf8/extensions/MultimediaViewer/resources/mmv/ui/mmv.ui.canvasButtons.js: poking cache for multimediaviewer messages (duration: 00m 04s)
* 00:05 logmsgbot: aaron Synchronized php-1.24wmf8/includes/EditPage.php: e11d41dd366b039bff79e247368b6bff1245ea5e (duration: 00m 07s)
 
== June 11 ==
* 23:50 mwalker: clearing resourceloader blobs on commonswiki to try and force a multimediaviewer message "mwscript extensions/WikimediaMaintenance/clearMessageBlobs.php --wiki=commonswiki"
* 23:49 awight: updated SmashPig from 98b1f348aa55f6a3aac441db08a59ca309fade7a to 22e2923a3a030b17815181574f9ca99b38c5f2dc
* 23:41 logmsgbot: mwalker Finished scap: SWAT deploy for MultimediaViewer, CentralNotice, and testwiki config (duration: 24m 16s)
* 23:16 logmsgbot: mwalker Started scap: SWAT deploy for MultimediaViewer, CentralNotice, and testwiki config
* 23:10 Krinkle: Running deleteEqualMessages.php on trwiki (bug 43917)
* 22:58 logmsgbot: yurik Synchronized wmf-config/: Restoring to ZRMA for now (duration: 01m 04s)
* 22:22 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Attempting to enable new zero ext on zerowiki & ruwiki - take2 (duration: 01m 06s)
* 22:19 ^d: restarted elasticsearch on logstash1003, complaining about heap.
* 22:06 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Attempting to enable new zero ext on zerowiki & ruwiki (duration: 01m 12s)
* 21:58 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/JsonConfig/: (no message) (duration: 01m 11s)
* 21:56 logmsgbot: yurik Synchronized php-1.24wmf7/extensions/JsonConfig/: (no message) (duration: 01m 09s)
* 21:50 logmsgbot: yurik Finished scap: (no message) (duration: 25m 51s)
* 21:46 ori: Disabling Puppet on mw1149. It's a former bits app server that isn't in PyBal so it isn't getting traffic. Going to stage some proposed changes for apache-config and operations/puppet there.
* 21:24 logmsgbot: yurik Started scap: (no message)
* 21:05 logmsgbot: yurik Finished scap: Deploying 3 new ext (JsonConfig, ZeroBanner, ZeroPortal), but they are not enabled anywhere yet (duration: 05m 03s)
* 21:00 logmsgbot: yurik Started scap: Deploying 3 new ext (JsonConfig, ZeroBanner, ZeroPortal), but they are not enabled anywhere yet
* 20:07 gwicke: deployed Parsoid 3de0dba15
* 19:18 bblack: rebooting lvs3003 for 3.13 kernel
* 19:17 logmsgbot: marktraceur Finished scap: MultimediaViewer fixes for cards 630, 429, and 697 (duration: 18m 45s)
* 19:17 greg-g: mw1151 *still* giving permission denied errors (publickey), what's the status, yo?
* 19:03 bblack: rebooting lvs3002 for 3.13 kernel + XPS
* 18:59 logmsgbot: marktraceur Started scap: MultimediaViewer fixes for cards 630, 429, and 697
* 18:44 ottomata: disabling puppet on analytics1012 to allow for more replica threads to catch up with current broker replicas...maybe :)
* 18:41 awight: updated crm from b6815d29de97b80a0ab65db576213a604f0c7cb9 to c38296add61421f87e12cb5b4f3dd68bdf2340db
* 18:03 Krinkle: Reloading Zuul to deploy I5d154a4002d08
* 16:43 bblack: shutting off lvs3002.esams pybal to test XPS balancing of live traffic on lvs3004.esams + 3.13
* 16:30 bblack: rebooting lvs3004 (inactive uploads LVS) for 3.13 again
* 14:52 hashar: Jenkins restarting (plugin upgrades)
* 14:48 bblack: rebooting lvs3004.esams (inactive uploads LVS) for 3.13 kernel
* 14:41 _joe_: manually ran 'planet' on en.planet to restore technews
* 14:40 hashar: Jenkins updating plugins
* 13:56 paravoid: upgrading mw1153-mw1160, tmh1001-tmh1002 for USN-2244-1
* 12:21 _joe_: set up a secondary remote named 'readonly' in /a/common on tin, to use with the icinga check for unmerged commits
* 11:40 akosiaris: manually cleaning librenms tables. db1001 is going to have increased load for some time. The approach is automatable, see http://jira.observium.org/browse/OBSERVIUM-757
* 11:32 godog: restarted uwsgi on tungsten, a lot of accesses to reqstats.edits.*.submits
* 10:45 godog: restarted uwsgi on tungsten, hung on fetching many metrics
* 09:54 _joe_: restarted apache on palladium - passenger crashed
* 05:26 paravoid: restarting all swift daemons across the cluster to fix runaway threads due to rsyslog restart
* 05:04 springle: beginning schema changes bug 49193 page_content_model
* 03:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 11 03:28:14 UTC 2014 (duration 28m 13s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-11 02:28:18+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-11 02:14:43+00:00
 
== June 10 ==
* 23:34 andrewbogott: updated labs Trusty image w/puppet3, made default
* 23:19 mutante: rebooting unresponsive ms-be1003
* 21:09 RobH: monthly sms credit check:  1,447.36 SMS credits.  will check again in 30 days
* 19:47 hashar: Jenkins restarted apparently properly. Any breakage would probably be related to the version switch :-D
* 19:45 ottomata: power cycling analytics1012, attempting to reinstall as kafka broker with new kafka partman recipe
* 19:42 hashar: Jenkins upgraded from 1.532.2 to 1.554.2 (i.e. bumped to a new LTS version).
* 19:37 hashar: Broke Jenkins by silently upgrading it  :-(
* 19:09 Krinkle: git-deploy: Deploying integration/slave-scripts I9521890b911714edf2
* 18:59 logmsgbot: reedy Synchronized php-1.24wmf8/skins/vector/components/tabs.less: (no message) (duration: 00m 14s)
* 18:58 mutante: shutting down ekrem
* 18:18 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Enable data transclusion for wikiquote (duration: 00m 14s)
* 18:15 logmsgbot: reedy Synchronized docroot and w: Update non Wikipedias to 1.24wmf8 (duration: 00m 16s)
* 18:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Update non Wikipedias to 1.24wmf8
* 18:14 logmsgbot: reedy Synchronized php-1.24wmf8/extensions/Wikidata/: (no message) (duration: 00m 16s)
* 17:28 _joe|away: restarted profiler-to-carbon, stuck waiting for data from mwprof
* 15:21 mutante: ekrem - rm from stored configs/icinga
* 15:12 mutante: ekrem - revoke salt,puppet keys, stop agents/minion
* 07:42 springle: enabled pt-slave-delay for dbstore1001, 24h all shards
* 06:12 springle: xtrabackup clone db1043 to db1048
* 04:57 springle: db1048 down for upgrade
* 03:40 springle: switched mchenry to use m2-master/m2-slave for OTRS address lookups
* 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 10 03:24:19 UTC 2014 (duration 24m 18s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-10 02:28:14+00:00
* 02:27 springle: switched traffic db1048 to db1020. broke gerrit briefly; see ops email
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-10 02:14:41+00:00
* 01:33 chasemp: restarted gerrit on ytterbium
* 01:01 manybubbles: upgraded all elasticsearch servers in production to 1.2.1.  They are just restoring the last few shards on the last node now and they'll spend a few hours tonight rebalancing after the upgrade but otherwise I'm done.
* 00:41 mwalker: updating donationinterface on payments from b4c5cf1bceb70d65eae28cdd0873036dc33c8992 to 6d74002f2634f41f7038daa7357ff6de55ee4880 for worldpay form error
 
== June 9 ==
* 23:58 manybubbles: lied - upgrading elastic1014
* 23:57 manybubbles: upgrading elastic1015
* 23:30 Krinkle: Reloading Zuul to deploy 6727b8b
* 23:12 logmsgbot: maxsem Synchronized php-1.24wmf8/extensions/MobileApp: (no message) (duration: 00m 03s)
* 23:11 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/MobileApp: (no message) (duration: 00m 03s)
* 23:03 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://bugzilla.wikimedia.org/66377 (duration: 00m 04s)
* 20:42 manybubbles: upgraded elastic1007-elastic1010 without issue - starting elastic1010
* 20:08 subbu: deployed Parsoid 9b673587 (deploy sha 7d0097a1)
* 19:23 ottomata: disabling puppet on analytics1012
* 18:59 ottomata: decommissioning analytics1012 in hadoop cluster, this will become a Kafka broker
* 17:58 manybubbles: elastic1004-1006 upgraded without trouble - cluster is working on filling elastic1006 before moving on to 1007, and the rest
* 17:04 andrewbogott: switching labs to puppet3
* 17:03 awight: update crm from b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7 to b6815d29de97b80a0ab65db576213a604f0c7cb9
* 16:30 manybubbles: upgrading elastic1003 - upgrade is going well so far so I'm going to stop watching it as closely and let it be more automated
* 15:28 manybubbles: elastic1001 went well, doing 1002 by hand again
* 15:17 logmsgbot: anomie Synchronized php-1.24wmf8/extensions/Wikidata: SWAT: Wikidata entity suggester bug fixes [[gerrit:138339]] (duration: 00m 16s)
* 15:12 greg-g: mw1151 still "permission denied" during deploys
* 15:12 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable TemplateData GUI on Portuguese Wikipedia [[gerrit:137986]] (duration: 00m 14s)
* 15:09 logmsgbot: anomie Synchronized php-1.24wmf7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: SWAT: VE fix for focus regression [[gerrit:137978]] (duration: 00m 15s)
* 15:06 andrewbogott: beta updating all instances to puppet 3 via a cherry-pick of https://gerrit.wikimedia.org/r/#/c/137898/ on deployment-salt
* 15:05 logmsgbot: anomie Synchronized php-1.24wmf8/extensions/VisualEditor/modules/ve-mw/: SWAT: VE fix for focus regression and alignment issues [[gerrit:137971]] [[gerrit:138122]] (duration: 00m 14s)
* 15:01 manybubbles: successfully synced plugins, upgrading elastic1001 to make sure everything is working ok with it - then we'll run through the others more quickly
* 14:57 manybubbles: syncing elasticsearch plugins for 1.2.1 - any elasticsearch restart from here on out needs to come with 1.2.1 or the node will break.
* 14:54 manybubbles: starting Elasticsearch upgrade with elastic1001
* 07:14 springle: disabled puppet on analytics1021 to avoid kafka broker restarting with missing mount
* 05:15 springle: xtrabackup clone db1046 to db1020
* 04:44 springle: umount /dev/sdf on analytics1021, fs in r/o mode, kafka broker not running. no checks yet
* 03:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun  9 03:23:05 UTC 2014 (duration 23m 4s)
* 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-09 02:28:08+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-09 02:14:46+00:00
 
== June 8 ==
* 23:27 p858snake|l: icinga has been shitting in the channel for 9+ hours (before I went to bed) about Varnishkafka, nothing noted in SAL. Here be a note about it.
* 03:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun  8 03:21:28 UTC 2014 (duration 21m 27s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-08 02:27:21+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-08 02:14:10+00:00
 
== June 7 ==
* 23:48 hoo: Fixed four CentralAuth log entries on meta which were logged for WikiSets/0
* 21:36 manybubbles: that means I turned off puppet and shut down Elasticsearch on elastic1017 - you can expect the cluster to go yellow for half an hour or so while the other nodes rebuild the redundancy that elastic1017 had
* 21:35 manybubbles: after consulting logs - elastic1017 has had high io wait since it was deployed - I'm taking it out of rotation
* 21:31 manybubbles: elastic1017 is sick - thrashing to death on io - restarting Elasticsearch to see if it recovers unthrashed
* 17:56 godog: restarted ES on elastic1017.eqiad.wmnet (at 17:22 UTC)
* 03:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun  7 03:23:32 UTC 2014 (duration 23m 31s)
* 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-07 02:29:57+00:00
* 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-07 02:16:30+00:00
 
== June 6 ==
* 23:51 Krinkle: Restarted Jenkins, force stopped Zuul, started Zuul, configure Jenkins via web interface (disable Gearman, save, enable Gearman); Seems to be back up now, finally.
* 22:52 mutante: same for rhenium, titanium, bast1001, calcium, carbon, ytterbium, stat1003
* 22:42 RoanKattouw: Restarting Jenkins didn't help, jobs still aren't making it across from Zuul into Jenkins
* 22:36 RoanKattouw: Restarting stuck Jenkins
* 22:35 mutante: same for holmium, hafnium, silver, netmon1001, magnesium, neon, antimony
* 22:17 mutante: upgraded ssl packages on zirconium
* 21:57 Krinkle: Took Jenkins slave on gallium temporarily offline and back online to resolve possible stagnation
* 20:56 awight_: updated crm from ded541894a70922e098fb3ea48306c8ec0f0f6aa to b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7
* 18:24 mwalker: updating payments from e823354822c7a35e6c2069d3e72180a45dbc89dc to b4c5cf1bceb70d65eae28cdd0873036dc33c8992 for globalcollect oid hack
* 14:04 hashar: Gerrit back. chase rebooted it :)
* 13:55 hashar: Gerrit having some troubles: error: RPC failed; result=22, HTTP code = 503  (while cloning CirrusSearch )
* 12:58 cmjohnson1: replacing raid controller db1020
* 06:12 Tim: on osmium installed nodejs for testing
* 04:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun  6 04:23:08 UTC 2014 (duration 23m 7s)
* 03:13 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-06 03:12:19+00:00
* 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-06 02:42:28+00:00
* 00:38 bblack: nginx restarted on ssl*
* 00:16 mutante: fixed permissions on bugzilla's index.cgi, sry
 
== June 5 ==
* 23:18 logmsgbot: maxsem Synchronized php-1.24wmf7/includes/ChangeTags.php: https://gerrit.wikimedia.org/r/#/c/137563/ (duration: 00m 03s)
* 23:16 logmsgbot: maxsem Synchronized php-1.24wmf8/includes/ChangeTags.php: https://gerrit.wikimedia.org/r/#/c/137563/ (duration: 00m 03s)
* 23:06 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/TemplateData: https://gerrit.wikimedia.org/r/#/c/137751/ (duration: 00m 04s)
* 22:15 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ife5081549: Put $wgRCFeeds[rcs100x] config behind $wmfRealm check (duration: 00m 04s)
* 22:12 logmsgbot: ori updated /a/common to {{Gerrit|Ife5081549}}: Put $wgRCFeeds['rcs100x'] config behind $wmfRealm check
* 21:48 ori: updated eventlogging to a8602c1d879f
* 21:34 MaxSem: Renaming geo_killlist and geo_updates to *_old
* 18:36 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
* 18:35 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 13s)
* 18:17 Reedy: Created FlaggedRevs tables on ckbwiki
* 18:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Update group0 to 1.24wmf8
* 18:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf7
* 17:00 logmsgbot: reedy Synchronized wmf-config/: Wrap some long lines, add some docs (duration: 00m 26s)
* 16:43 bblack: rebooting lvs3002
* 16:36 paravoid: downpref all of amslvs* in favor of lvs30*
* 16:17 paravoid: downprefing amslvs1, upprefing lvs3001
* 16:02 mark: Connected cp3018:eth1 to cr1-esams:xe-0/0/3 (unconfigured)
* 15:59 _joe_: disabling puppet on virt1000 while we test the puppet3 upgrade on virt0
* 15:48 logmsgbot: reedy Finished scap: 2nd scap for 1.24wmf8, should be effectively a nooop (duration: 12m 33s)
* 15:35 logmsgbot: reedy Started scap: 2nd scap for 1.24wmf8, should be effectively a nooop
* 15:21 logmsgbot: anomie Synchronized php-1.24wmf6/extensions/VisualEditor/modules/ve-mw/ui/dialogs/: SWAT: Use <visualeditor-toolbar-cite-label> correctly in the Media and Reference toolbars [[gerrit:136783]] (duration: 00m 15s)
* 15:18 logmsgbot: anomie Synchronized php-1.24wmf7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/: SWAT: Use <visualeditor-toolbar-cite-label> correctly in the Media and Reference toolbars [[gerrit:136782]] (duration: 00m 12s)
* 15:04 logmsgbot: anomie Synchronized php-1.24wmf7/extensions/Popups/resources/: SWAT: Hovercard animation fixes [[gerrit:137530]] [[gerrit:137531]] [[gerrit:137532]] (duration: 00m 14s)
* 14:57 logmsgbot: reedy Finished scap: testwiki to 1.24wmf8 and build l10n cache (duration: 26m 23s)
* 14:54 hashar: restarting Zuul
* 14:31 logmsgbot: reedy Started scap: testwiki to 1.24wmf8 and build l10n cache
* 14:15 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.hiiCprts7Z" ' returned non-zero exit status 1 (duration: 00m 17s)
* 14:14 logmsgbot: reedy Started scap: testwiki to 1.24wmf8 and build l10n cache
* 14:07 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.WtQBrR6JUp" ' returned non-zero exit status 1 (duration: 01m 08s)
* 14:06 logmsgbot: reedy Started scap: testwiki to 1.24wmf8 and build l10n cache
* 14:05 logmsgbot: reedy Purged l10n cache for 1.24wmf5
* 13:58 hashar: Adding unit tests Jenkins job for most mediawiki extensions {{gerrit|137578}}
* 12:05 godog: powercycling ms-be1005, no ssh, no console
* 10:28 godog: restarted uwsgi on tungsten
* 09:24 godog: moving bits traffic to the general appserver pool in eqiad
* 04:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun  5 04:09:50 UTC 2014 (duration 9m 49s)
* 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-05 03:02:00+00:00
* 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-05 02:32:06+00:00
* 02:23 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1007 (duration: 01m 26s)
* 00:46 bblack: lvs3002 (live uploads lb for esams) is running ntpd
 
== June 4 ==
* 23:43 Tim: on searchidx1001: restarting lsearchd and indexer
* 23:40 logmsgbot: mwalker Finished scap: Scapping for SWAT; MultiMedia viewer and config changes (duration: 22m 16s)
* 23:20 Tim: on searchidx1001: as a temporary hack to work around scap disk full errors, set up a bind mount at /usr/local/apache/common-local linking to a directory in /a, by local modification of /etc/fstab
* 23:18 logmsgbot: mwalker Started scap: Scapping for SWAT; MultiMedia viewer and config changes
* 21:56 logmsgbot: yurik Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/136503/ (duration: 01m 07s)
* 21:54 logmsgbot: yurik Synchronized mobilelanding.php: (no message) (duration: 01m 07s)
* 20:47 MaxSem: Truncating geo_killlist everywhere
* 20:33 subbu: deployed Parsoid 165a2042 (deploy sha fc1b1ed4)
* 19:04 bd808|deploy: Restarted elasticsearch on logstash1001; JVM OOM
* 19:00 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/GeoData/: (no message) (duration: 00m 04s)
* 18:58 logmsgbot: maxsem Synchronized php-1.24wmf6/extensions/GeoData/: (no message) (duration: 00m 03s)
* 18:43 bd808|deploy: mw1151 gave an ssh denied error for MaxSem during sync-dir
* 18:40 logmsgbot: maxsem Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/136487/ (duration: 00m 04s)
* 17:54 mutante: shutting down solr1001-1003
* 17:47 logmsgbot: yurik Synchronized php-1.24wmf7/extensions/ZeroRatedMobileAccess/: (no message) (duration: 01m 07s)
* 17:44 logmsgbot: yurik Synchronized php-1.24wmf6/extensions/ZeroRatedMobileAccess/: (no message) (duration: 01m 06s)
* 17:27 mutante: stopping puppet/salt on solr100[13], removed from icinga
* 16:36 robh: blog.wikimedia.org updated to latest wp version
* 16:13 mutante: installing package upgrades on bast1001
* 16:11 mutante: installing package upgrades on iron
* 15:59 mutante: killing puppet certs,salt keys for solr100[13].eqiad - decom
* 15:28 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: close wikimania2013wiki for real (duration: 00m 10s)
* 15:28 logmsgbot: manybubbles Synchronized closed.dblist: close wikimania2013wiki (duration: 00m 09s)
* 15:23 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: close wikimania2013wiki (duration: 00m 10s)
* 15:21 logmsgbot: manybubbles Synchronized php-1.24wmf6/extensions/MobileApp/: (no message) (duration: 00m 10s)
* 15:15 logmsgbot: manybubbles Synchronized php-1.24wmf7/extensions/MobileApp/: (no message) (duration: 00m 08s)
* 15:07 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT deploy for media viewer (duration: 00m 13s)
* 14:57 mutante: cleaning up duplicate cronjobs on terbium - all log to /var/log/mediawiki now
* 12:53 hashar: Zuul upgraded (git tag wmf-deploy-20140604).  Merges are now done by an independent process, zuul-merger
* 12:43 hashar: upgrading Zuul to split the merger part to an independent process. Short unscheduled downtime starting in a few minutes
* 07:51 _joe_: rebooted ms-be1001, host unresponsive to ping, blank console
* 06:14 springle: starting online schema change, bug 66089 gerrit 137149
* 04:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun  4 04:26:32 UTC 2014 (duration 26m 31s)
* 03:35 Krinkle: Deploy I882e3fa57b2e5e3de in Zuul and reload config
* 03:16 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-04 03:15:34+00:00
* 02:47 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-04 02:46:06+00:00
 
== June 3 ==
* 23:14 logmsgbot: ori Synchronized php-1.24wmf7/extensions/MobileApp: SWAT cherry-picks for MobileApp (with patch) (duration: 00m 04s)
* 23:11 logmsgbot: ori Synchronized php-1.24wmf6/extensions/MobileApp: SWAT cherry-picks for MobileApp (duration: 00m 04s)
* 23:10 logmsgbot: ori Synchronized php-1.24wmf7/extensions/MobileApp: SWAT cherry-picks for MobileApp (duration: 00m 03s)
* 23:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I9dac0dc6a80: Set $wgIncludejQueryMigrate = true; for all wikis (duration: 00m 03s)
* 22:41 logmsgbot: marktraceur Finished scap: Update Media Viewer preference string for wmf7 - already backported to wmf6 (duration: 13m 19s)
* 22:38 Krinkle: git-deploy: Deploying integration/slave-scripts If2e2e675802f
* 22:27 logmsgbot: marktraceur Started scap: Update Media Viewer preference string for wmf7 - already backported to wmf6
* 21:49 logmsgbot: marktraceur updated /a/common to {{Gerrit|I409703a11}}: Enable MMV by default on dewiki beta.
* 21:25 logmsgbot: marktraceur Synchronized mediaviewer.dblist: Enable media viewer by default on enwiki (duration: 00m 06s)
* 21:18 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: Throttle the MMV event logging a bit more for the launch today (duration: 00m 06s)
* 21:17 logmsgbot: marktraceur updated /a/common to {{Gerrit|I549906510}}: Launch Media Viewer for all users on English wikipedia
* 21:09 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: Touch InitialiseSettings.php because that's what we do (duration: 00m 06s)
* 21:08 logmsgbot: marktraceur Synchronized mediaviewer.dblist: Add dewiki to the on-by-default list for Media Viewer (duration: 00m 06s)
* 21:08 logmsgbot: marktraceur updated /a/common to {{Gerrit|Ie237b0ae1}}: Launch Media Viewer for all users on German wikipedia
* 20:51 MaxSem: Disabled GeoData updates on terbium
* 20:41 hashar: repack command:  find /srv/ssd/gerrit/ -type d -name '*.git' -print -exec git --git-dir="{}" repack -afd \; -exec git --git-dir="{}" pack-refs --all \;
* 20:41 hashar: Jenkins repacking gerritslave replicas on gallium and lanthanum. Running in screen as hashar -> gerritslave
* 18:14 logmsgbot: reedy Synchronized wmf-config/: Stop sending IRC RC to PMTPA (duration: 00m 17s)
* 18:07 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
* 18:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: All non wikipedias to 1.24wmf7
* 15:45 akosiaris: merged https://gerrit.wikimedia.org/r/#/c/133515/  which enabled ferm on hydrogen/chromium
* 15:41 logmsgbot: anomie Finished scap: SWAT: Update i18n for MultimediaViewer [[gerrit:136718]] (duration: 17m 56s)
* 15:23 logmsgbot: anomie Started scap: SWAT: Update i18n for MultimediaViewer [[gerrit:136718]]
* 15:03 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Lower MediaViewer sampling for enwiki and dewiki [[gerrit:136717]] (duration: 00m 14s)
* 13:05 paravoid: salt * start procps
* 11:13 _joe_: restarted jobrunners as they were blocked by restarting via cron
* 10:58 godog: try restarting mw-job-runner on mw1012
* 03:42 springle: revert to lvm snapshot on db1046, xfs being crotchety
* 03:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun  3 03:16:22 UTC 2014 (duration 16m 21s)
* 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-03 02:25:48+00:00
* 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-03 02:14:12+00:00
* 01:32 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: wgCentralAuthRC to EQIAD rc ircd (duration: 00m 14s)
* 00:28 awight: update crm from 5f6217d8f4d750087dcd37faca6b41de82d2362e to ded541894a70922e098fb3ea48306c8ec0f0f6aa
 
== June 2 ==
* 23:34 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/136428/ (duration: 00m 03s)
* 23:22 logmsgbot: maxsem Synchronized php-1.24wmf6/extensions/Flow/: (no message) (duration: 00m 04s)
* 23:21 logmsgbot: maxsem Synchronized php-1.24wmf6/extensions/VisualEditor/: (no message) (duration: 00m 03s)
* 23:20 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/VisualEditor/: (no message) (duration: 00m 04s)
* 23:18 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/136936/ (duration: 00m 05s)
* 22:49 Krinkle: Repooled integration-slave1003 in Jenkins.
* 22:37 Krinkle: Hack-patching integration-slave1003.eqiad.wmflabs per https://bugzilla.wikimedia.org/show_bug.cgi?id=61508#c2
* 21:30 mutante: searchidx1001 - low disk space, gzip MegaSAS.log, delete old kernel headers
* 21:18 awight: updated crm from b6e004f7349507523423c59170274150a44b0aaf to 5f6217d8f4d750087dcd37faca6b41de82d2362e
* 20:09 gwicke: deployed Parsoid 04a4bf2b
* 20:08 hashar: Jenkins: unpooled integration-slave1003, npm is outdated there and does not trust npmregistry.org ( {{bug|61508}} )
* 19:29 awight: updated crm from ce64066316e77f6fc3545c6265e2d81e3ef773c4 to b6e004f7349507523423c59170274150a44b0aaf
* 19:18 awight: update crm from 5b231163e9e880de5b9787d40b679a6723748aca to ce64066316e77f6fc3545c6265e2d81e3ef773c4
* 18:58 logmsgbot: csteipp Synchronized php-1.24wmf6/includes/upload/UploadBase.php: (no message) (duration: 00m 04s)
* 18:51 logmsgbot: csteipp Synchronized php-1.24wmf7/includes/upload/UploadBase.php: (no message) (duration: 00m 06s)
* 18:41 awight: updated tools from d257e8445e028b758b1d1fa90c857667d4faac62 to cbcd14a84f7bc8682822d3b1910b48bfd932b00d
* 17:15 chasemp: disabling ircd on ekrem
* 17:05 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Cache search suggestions for 3 hours instead of 6 (duration: 00m 04s)
* 17:03 chasemp: moving irc.wikimedia.org to argon
* 16:27 ottomata: ran preferred-replica-election to fix vk delivery errors
* 16:24 logmsgbot: demon Synchronized wmf-config/throttle.php: Library of Israel editathon (duration: 00m 04s)
* 16:07 manybubbles: rebuilding all english non-wikipedias with unicode normalization
* 15:36 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy - more import sources and upload domains (duration: 00m 04s)
* 15:34 manybubbles: reindexing all hebrew wikis to switch them from the hebrew analyzer to proper unicode normalization
* 15:33 ottomata: attempting to powercycle analytics1015, it is not responding to pings, no output on console
* 15:33 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy changing some search settings (duration: 00m 05s)
* 15:26 hashar: restarted Zuul. All jobs lost :-(
* 15:25 hashar: Zuul stuck in a loop reporting a change :-(
* 15:20 hashar: Jenkins/Zuul stuck. Depooling/Repooling some slaves to reregister jobs with Zuul
* 14:51 ottomata: chown -R datasets /data/xmldatadumps/public/other/pagecounts-ez on dataset1001 to accompany 70a7f61, fixing bug 66005
* 12:44 akosiaris: manually ran puppet on mw11991
* 07:21 hashar: restarted Zuul unintentionally
* 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun  2 03:11:05 UTC 2014 (duration 11m 4s)
* 03:04 ori: ..on vanadium.
* 03:03 ori: moving /var/log/eventlogging/archive/* to /srv/eventlogging-logs to free up space on the root partition. unpuppetized for now, sadly.
* 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-02 02:23:57+00:00
* 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-02 02:12:53+00:00
* 02:06 logmsgbot: tstarling Synchronized php-1.24wmf6: Revert "Use square bounding boxes for default-sized thumbnails" (duration: 01m 18s)
* 02:02 logmsgbot: tstarling Synchronized php-1.24wmf7: (no message) (duration: 01m 31s)
 
== June 1 ==