Tool:Dispatcher

The Tools project hosts a service called the bot dispatcher. This document explains how it works and how you can use it.

What it is

The purpose of the bot dispatcher is to let bot operators create subscriptions to the Wikimedia recent changes (RC) feed. A subscription can trigger events that result in actions being performed, which makes it easier for a bot to watch and operate on selected pages. (Bots do not need to watch the wikis themselves; they subscribe to a list of pages and are notified about actions performed on those pages.)

For wikitech

Proposed design

The bot dispatcher consists of the following components:

Daemon

A service running on Tool Labs that listens on a TCP port. Users can connect to it and control it directly, or they can use one of the interfaces that make this easier.

The daemon is ultimately responsible for handling user requests and for watching the wikis.

Terminal interface

The simplest interface for managing subscriptions.

Web interface

A high-tech interface for managing subscriptions.

Subscription

Every subscription must have a unique name. A subscription consists of the following data:

  • Name - name of subscription
  • Token - secret token that can be used to manipulate it
  • List of definitions of what to watch

Definition

Every subscription can contain an unlimited number of definitions. A definition consists of:

  • Wiki name - identifier of the wiki that should be watched
  • Page name - either the name of the page to watch or a regex matching the pages to watch
  • User name - either a user name or a regex matching the names of users who perform the actions
  • Flags - optionally restrict the watch to bot changes, minor changes, etc.

Definitions can be added to and removed from a subscription using any of the interfaces.
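The relationship between subscriptions and definitions can be sketched as a small data model. This is only an illustration: the class and field names are hypothetical, flags are left out, and the assumption that a regex must match the whole page or user name is not confirmed by this document.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Definition:
    """One watch rule. All field names here are hypothetical."""
    wiki: str                       # identifier of the wiki, e.g. "en_wikipedia"
    page: Optional[str] = None      # exact page name
    page_rx: Optional[str] = None   # regex for page names
    user: Optional[str] = None      # exact user name
    user_rx: Optional[str] = None   # regex for user names

    def matches(self, wiki: str, title: str, username: str) -> bool:
        # Assumption: every criterion that is set must match, and a regex
        # has to cover the whole name (re.fullmatch).
        if wiki != self.wiki:
            return False
        if self.page is not None and title != self.page:
            return False
        if self.page_rx is not None and not re.fullmatch(self.page_rx, title):
            return False
        if self.user is not None and username != self.user:
            return False
        if self.user_rx is not None and not re.fullmatch(self.user_rx, username):
            return False
        return True


@dataclass
class Subscription:
    name: str                       # unique name of the subscription
    token: str                      # secret token used to manipulate it
    definitions: List[Definition] = field(default_factory=list)

    def matches(self, wiki: str, title: str, username: str) -> bool:
        # A change is of interest if any definition matches it.
        return any(d.matches(wiki, title, username) for d in self.definitions)
```

A subscription with a single definition using `page_rx=r"Wikipedia:Articles_for_creation/.*"` on `en_wikipedia` would, under these assumptions, match every change to pages under that prefix on that wiki and nothing else.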

Notification mechanisms

More mechanisms may be added in the future.

Redis

Every active subscription inserts data into a Redis queue with the same name as the subscription (for this reason, the subscription should have a hard-to-guess name).

Supported outputs

The Redis feed supports multiple output formats, just as MediaWiki does for its APIs, so that it is as easy as possible for developers to consume the feed in their bots. There are currently three:

  • Pipe separated values: wiki|title|username|action|diffid|changeid|changesize|summary
  • XML
<rc wiki="Name of wiki" title="Name of page" username="Name of user who made a change" action="type of rc change" diffid="id of diff" changeid="id of change" changesize="size of change">Summary</rc>
  • JSON
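A bot consuming the feed needs to turn each entry back into fields. The following is a minimal sketch for the pipe-separated and XML formats described above; the function names are hypothetical, and it assumes that only the summary field can itself contain a "|" character. The JSON layout is not specified in this document, so it is not covered.

```python
import xml.etree.ElementTree as ET

# Field order as listed for the pipe-separated format.
FIELDS = ["wiki", "title", "username", "action",
          "diffid", "changeid", "changesize", "summary"]


def parse_pipe(line: str) -> dict:
    """Parse one pipe-separated entry into a dict of strings."""
    # Split at most 7 times so that "|" inside the summary is preserved.
    parts = line.rstrip("\n").split("|", len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def parse_xml(text: str) -> dict:
    """Parse one <rc ...>Summary</rc> element into a dict of strings."""
    element = ET.fromstring(text)
    record = dict(element.attrib)
    record["summary"] = element.text or ""
    return record
```

All values are kept as strings; converting `diffid`, `changeid`, or `changesize` to integers is left to the caller, since this document does not state how missing values are represented.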

Technical details

Communication between daemon and client

This section describes how the daemon and a client communicate. You don't need to know how this works unless you are:

  • Going to help with the development of this service
  • Going to implement your own interface in your bot that changes the subscription dynamically without using any of the prebuilt interfaces

Communication between the daemon and a client (an interface or your own program) uses a TCP connection. You need to obtain the hostname and port to connect to:

  • Host: /data/project/dispatcher/hostname
  • Port: /data/project/dispatcher/port

Once the socket is open, you can send the following commands using this syntax:

command parameters separated by space\n
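A client written from scratch might look roughly like the sketch below. It assumes that the two paths above are plain-text files containing the hostname and the port number, and that the daemon answers each command with a single line; neither is confirmed by this document. The function names are hypothetical.

```python
import socket

HOST_FILE = "/data/project/dispatcher/hostname"
PORT_FILE = "/data/project/dispatcher/port"


def read_endpoint(host_file=HOST_FILE, port_file=PORT_FILE):
    """Read host and port, assuming each file holds just the value."""
    with open(host_file) as f:
        host = f.read().strip()
    with open(port_file) as f:
        port = int(f.read().strip())
    return host, port


def build_command(command, *params):
    """Frame a command: name and parameters separated by spaces, then a newline."""
    return (" ".join((command,) + params) + "\n").encode("utf-8")


def open_session():
    """Connect to the daemon; returns the socket and a line reader for replies."""
    sock = socket.create_connection(read_endpoint(), timeout=10)
    return sock, sock.makefile("r", encoding="utf-8", newline="\n")


def send_command(sock, reader, command, *params):
    """Send one command and return one reply line (assumed reply format)."""
    sock.sendall(build_command(command, *params))
    return reader.readline().rstrip("\n")
```

For example, `build_command("subscribe", "mybot")` produces the bytes `subscribe mybot\n`. The reader is created once per connection so that buffered reply data is not lost between commands.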

The list of commands is below.

quit

Close the connection

subscribe <name of subscription>

Requests a new subscription. If the request succeeds, a token for the subscription is created and sent back to the client.

unsubscribe <name of subscription> <token>

Removes a subscription. You must provide its token.

auth <subscription> <token>

Authenticates you so that you can insert and remove definitions.

insert [xml|json]

Inserts a list of definitions, in either XML or JSON, into the subscription. You must authenticate before using this command.

Example

insert xml\n
<items>\n
  <item page="Wikipedia:Blah">en_wikipedia</item>\n
</items>\n
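The payload in the example above can also be generated programmatically. The sketch below builds it with Python's standard xml.etree.ElementTree module so that page names are escaped correctly. The function name is hypothetical, and only the `page` attribute shown in the example is used, since no other attributes are documented here. It requires Python 3.9 or later for `ET.indent`.

```python
import xml.etree.ElementTree as ET


def build_insert_payload(definitions):
    """Build the text of an "insert xml" command.

    definitions is an iterable of (wiki, page) pairs.
    """
    root = ET.Element("items")
    for wiki, page in definitions:
        item = ET.SubElement(root, "item", {"page": page})
        item.text = wiki
    # Put each element on its own line, mirroring the example layout.
    ET.indent(root, space="  ")
    return "insert xml\n" + ET.tostring(root, encoding="unicode") + "\n"
```

Calling `build_insert_payload([("en_wikipedia", "Wikipedia:Blah")])` yields the command line followed by an `<items>` element with one `<item>` child. Whether the daemon requires this exact line layout is not stated in this document.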

remove [xml|json]

Removes a list of definitions, in either XML or JSON, from the subscription.

list [xml|json]

Sends you the list of existing definitions.

format [pipe|xml|json]

Switches the format used for data produced by the dispatcher.

Example usage

The script is located at /data/project/dispatcher/bin/dispatcher-cli

petrb@tools-login:~$ dispatcher-cli --help
usage: dispatcher-cli [-h] [--token TOKEN] [--unsubscribe UN_SUBSCRIPTION]
                      [--subscribe SUBSCRIPTION] [--insert INSERT]
                      [--remove REMOVE] [--wiki WIKI] [--page PAGE]
                      [--pagerx PX] [--user USER] [--userrx UX]

Command line interface for dispatcher daemon. Dispatcher is an utility that
let you subscribe to RC feed in order to get notified about changes to
selected pages. (In this moment only redis queues are supported)

optional arguments:
  -h, --help            show this help message and exit
  --token TOKEN         Secret token that needs to be provided in order to
                        modify existing subscription
  --unsubscribe UN_SUBSCRIPTION
                        Remove a subscription with a given name.
  --subscribe SUBSCRIPTION
                        Create a subscription with a given name. This name
                        must be system unique
  --insert INSERT       Insert a data to subscription with a given name. You
                        need to provide token, wiki and page or user (or both)
  --remove REMOVE       Remove a data from subscription with a given name. You
                        need to provide token, wiki and page or user (or both)
  --wiki WIKI           Specify a name of wiki to watch, this argument is
                        needed for insert and remove
  --page PAGE           Name of a page to watch (case sensitive)
  --pagerx PX           Regular expression the page name should match
  --user USER           Username to watch
  --userrx UX           Regular expression that username should match

Suppose you have a bot that needs to check every edit made to pages starting with Wikipedia:Articles_for_creation/

Execute:

petrb@tools-login:~$ dispatcher-cli --subscribe mybot
Attempting to create new subscription mybot
TOKEN: FNVUIPDZHQHNLYJPEHDETZRRDJRIQXXLNWPWBPSR

petrb@tools-login:~$ dispatcher-cli --insert mybot --token FNVUIPDZHQHNLYJPEHDETZRRDJRIQXXLNWPWBPSR --wiki en_wikipedia --pagerx Wikipedia:Articles_for_creation/.*
Logging to a subscription
Attempting to insert data
Inserted: 1

The Redis queue called mybot is created automatically and is filled with changes to these pages.
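On the bot side, the queue then has to be read. The following is a minimal sketch that assumes the queue is a Redis list and that entries can be taken from its head with LPOP; this document does not say which Redis data type or push direction the daemon uses. The function name is hypothetical, and `client` is any object with an `lpop(name)` method, such as a `redis.Redis` instance from the third-party redis-py package.

```python
def drain_queue(client, queue_name, handle):
    """Pop entries from the queue until it is empty.

    Calls handle(entry) for each entry and returns the number handled.
    """
    count = 0
    while True:
        raw = client.lpop(queue_name)  # assumed: the queue is a Redis list
        if raw is None:
            return count
        # redis-py returns bytes unless decode_responses=True is set.
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        handle(raw)
        count += 1
```

A bot could call `drain_queue(client, "mybot", process_change)` periodically, where `process_change` is the bot's own function for handling one entry in the selected output format. The Redis host to connect to depends on the site and is not given in this document.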