
RCStream: Difference between revisions

imported>Krinkle
imported>BryanDavis
(BryanDavis moved page RCStream to Obsolete:RCStream: Moving decomissed service docs to Obsolete namespace)
 
[[File:RCStream example.png|thumb|374px|link=http://codepen.io/Krinkle/pen/laucI/?editors=001|Example client at [http://codepen.io/Krinkle/pen/laucI/?editors=0010 codepen.io/Krinkle/pen/laucI]]]
#REDIRECT [[Obsolete:RCStream]]
 
{{Ambox
|type=content
|text=This service has been '''deprecated''' in favor of [[EventStreams]], available at [https://stream.wikimedia.org/?doc https://stream.wikimedia.org].  As of 2017-07-10, RCStream is offline, and will no longer serve events.
}}
 
'''RCStream''' is a simple server daemon that broadcasts "recent changes" events from MediaWiki wikis using the [http://socket.io Socket.IO] 0.9 protocol. The main instance used to run at <code>stream.wikimedia.org/rc</code>, broadcasting changes from all public wikis in the Wikimedia production cluster.
 
'''stream.wikimedia.org/rc''' provided a live data stream of edits on Wikimedia wikis that anyone could tap and use to power editor tools and web apps, create beautiful visualisations, inform research, and extend MediaWiki.
 
RCStream subscribes to the [[mw:Manual:RCFeed|RCFeed]] from MediaWiki wikis. As a web developer, you can open the stream using JavaScript. As an app developer, you can use a suitable client library for your platform.
== API ==
RCStream provides a simple API for subscribing to [[mw:Manual:$wgRCFeeds|RCFeeds]] of MediaWiki wikis. After connecting, you emit a <code>'subscribe'</code> event specifying the wikis you wish to subscribe to. The subscription can use any of the following formats:
* a single hostname, such as <code>nl.wikipedia.org</code>.
* an array of hostnames.
* hostnames matching a wildcard pattern such as <code>*.wikivoyage.org</code> or <code>nl.*</code>.
* all wikis, by subscribing to the special topic name <code>*</code>.
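
The subscription formats above can be illustrated with a small, self-contained sketch. It assumes the wildcard patterns behave like shell-style globs and uses Python's standard <code>fnmatch</code> module as a stand-in; the helper name <code>matches_subscription</code> and the sample hostnames are hypothetical, and the matching performed by the RCStream server itself may differ.

```python
from fnmatch import fnmatchcase


def matches_subscription(hostname, subscription):
    """Return True if `hostname` is covered by `subscription`.

    `subscription` is either a single pattern string or a list of patterns.
    Assumption: wildcards are treated as shell-style globs (not confirmed
    against the RCStream server code).
    """
    if isinstance(subscription, str):
        patterns = [subscription]
    else:
        patterns = list(subscription)
    return any(fnmatchcase(hostname, pattern) for pattern in patterns)


# Hypothetical sample of wiki hostnames, for illustration only.
wikis = [
    'nl.wikipedia.org',
    'nl.wikivoyage.org',
    'en.wikivoyage.org',
    'commons.wikimedia.org',
]

subscriptions = [
    'nl.wikipedia.org',                            # a single hostname
    ['nl.wikipedia.org', 'commons.wikimedia.org'],  # an array of hostnames
    '*.wikivoyage.org',                            # wildcard pattern
    'nl.*',                                        # wildcard pattern
    '*',                                           # all wikis
]

for subscription in subscriptions:
    matched = [w for w in wikis if matches_subscription(w, subscription)]
    print(subscription, '->', matched)
```

A client would pass the same string or list as the second argument of the <code>'subscribe'</code> event.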
 
You then receive <code>'change'</code> events whose data is an [[mw:RCFeed#Properties|RCFeed structure]] containing the type of change, the title of the page, the new revision number, etc.
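
As a rough illustration of working with a <code>'change'</code> payload, the sketch below formats one event as a single line. The helper <code>describe_change</code> and the sample values are hypothetical. The keys <code>type</code>, <code>title</code> and <code>user</code> follow the fields mentioned on this page, while the nested <code>revision</code> object with a <code>new</code> key is an assumption based on the RCFeed documentation and should be checked there.

```python
def describe_change(change):
    """Build a one-line summary of an RCFeed-style change dictionary.

    Missing keys are tolerated, since not every kind of change is assumed
    to carry every field (log events, for instance, may have no revision).
    """
    kind = change.get('type', 'unknown')
    title = change.get('title', '(untitled)')
    user = change.get('user', '(unknown user)')
    # Assumption: revision numbers arrive as {'old': ..., 'new': ...}.
    new_revision = (change.get('revision') or {}).get('new')

    summary = '%s: %s by %s' % (kind, title, user)
    if new_revision is not None:
        summary += ' (revision %s)' % new_revision
    return summary


# Made-up sample event, for illustration only.
sample = {
    'type': 'edit',
    'title': 'Example page',
    'user': 'ExampleUser',
    'revision': {'old': 100, 'new': 101},
}

print(describe_change(sample))
```

A function like this could be called from the <code>'change'</code> handler of a client instead of printing the raw data.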
 
The Socket.IO server uses the <code>/rc</code> namespace. It also implements an <code>/rcstream_status</code> endpoint that exposes internal state about connected clients and queue size that may help when debugging.
 
== Consumers ==
*[https://github.com/notconfusing/cocytus Cocytus]. Tracks citations on Wikipedia.
*[http://wikimedia.meteor.com/ Meteor DDP]. Proxies change events combined with page content and diff from the API.
*[http://datasift.com/source/44/wikipedia Datasift]
*[http://codepen.io/Krinkle/pen/laucI/?editors=0010 Demo on CodePen.io]. Example listener using JavaScript in the browser.
*Various researchers (per wiki-research-l, December 2014)
*[[mw:Manual:Pywikibot|Pywikibot]]
== Client example ==
{{Alert|content=As of writing (January 2015), RCStream implements version 0.9 of the Socket.IO protocol, not 1.0 ([[phab:T68232]]). See also [https://github.com/Automattic/socket.io/tree/0.9.17 socket.io 0.9] and [https://github.com/Automattic/socket.io-client/tree/0.9.17 socket.io-client 0.9] on GitHub for more information.}}
 
=== JavaScript ===
<syntaxhighlight lang="javascript">
// Requires socket.io-client 0.9.x:
// browser code can load a minified Socket.IO JavaScript library;
// standalone code can install via 'npm install socket.io-client@0.9.1'.
 
var io = require( 'socket.io-client' );
var socket = io.connect( 'https://stream.wikimedia.org/rc' );
 
socket.on( 'connect', function () {
    socket.emit( 'subscribe', 'commons.wikimedia.org' );
} );
 
socket.on( 'change', function ( data ) {
    console.log( data.title );
} );
</syntaxhighlight>
 
=== Python ===
Install dependencies:
<syntaxhighlight lang="shell">
pip install socketIO_client==0.5.6
</syntaxhighlight>
Get stream of events:
<syntaxhighlight lang="python">
#!/usr/bin/python
# -*- coding: utf-8 -*-
import socketIO_client
 
class WikiNamespace(socketIO_client.BaseNamespace):
    def on_change(self, change):
        print('%(user)s edited %(title)s' % change)
 
    def on_connect(self):
        self.emit('subscribe', 'commons.wikimedia.org')
 
 
socketIO = socketIO_client.SocketIO('https://stream.wikimedia.org')
socketIO.define(WikiNamespace, '/rc')
 
socketIO.wait()
</syntaxhighlight>
 
== Wikimedia deployment ==
The [[mw:Manual:RCFeed|RCFeed]] of Wikimedia wikis is configured using [[mw:Manual:$wgRCFeeds|$wgRCFeeds]]. The JSON formatter is used with the Redis engine.
 
The RCStream servers are <code>rcs1001</code> and <code>rcs1002</code> (see also [https://github.com/wikimedia/operations-puppet/blob/1dffa7ab/manifests/site.pp#L2087-L2095 puppet node] and [https://github.com/wikimedia/operations-puppet/blob/1dffa7ab/manifests/role/rcstream.pp puppet role]). Each backend server runs multiple instances of RCStream, as well as a Redis instance that receives RCFeed messages from the [[Application servers|MediaWiki app servers]]. The servers expose the RCStream backends through a local Nginx reverse proxy.
 
An [[LVS]] load balancer (<code>stream-lb</code>) is situated in front of the backend servers, exposed as [[stream.wikimedia.org]].
 
The RCStream server also responds at <code><nowiki>https://stream.wikimedia.org/rcstream_status</nowiki></code> with a simple text message; check this if you do not receive any events.
 
=== Beta Cluster ===
The [[Beta cluster]] has a simplified setup on a single VM instance running the rcstream Puppet role, exposed as [http://stream.wmflabs.org/ http://stream.wmflabs.org].
== See also ==
{{SourceLinks|url=https://github.com/wikimedia/mediawiki-services-rcstream|text=rcstream}}
* [https://phabricator.wikimedia.org/tag/wikimedia-stream/ Issue tracker (Phabricator workboard)]
* [[mw:Requests_for_comment/Publishing_the_RecentChanges_feed|MediaWiki RFC: RecentChanges feed]] - Original proposal that led to the creation of RCFeed logic in [[mw:|MediaWiki]] and the RCStream web service.
* [[irc.wikimedia.org]]: The older UDP-to-IRC service. Previously this was the only available logic in MediaWiki for exposing recent changes in real-time.
 
[[Category:Decommissioned services]]

Latest revision as of 17:45, 6 July 2018
