Tool:Deputy
| | |
---|---|
Website | https://deputy.toolforge.org |
Description | Bulk data processor for Deputy users |
Keywords | copyright, data processing, api, javascript, nodejs, typescript |
Author(s) | Chlod Alejandro |
Maintainer(s) | Chlod |
Source code | https://github.com/ChlodAlejandro/deputy-dispatch |
License | Apache License 2.0 |
Issues | https://github.com/ChlodAlejandro/deputy-dispatch/issues |
Dispatch (or Deputy Dispatch) is a Node.js + Express webserver that exposes API endpoints which process large amounts of data from Wikimedia wikis for easier consumption by Deputy. It is meant to centralize and optimize the gathering and processing of bulk data so that the many users of Deputy do not individually make taxing requests on Wikimedia servers.
Dispatch makes its requests while logged in to a user account, but does not make any edits. It only reads data from the Wikimedia servers, and being logged in allows it to query more than an anonymous user would be able to.
Usage
Dispatch is primarily used through Deputy. Deputy has been built to work cross-wiki and integrates with Dispatch to support every Wikimedia wiki, with an out-of-the-box configuration that can handle simple copyright management tasks on the wiki.
The Dispatch API can also be used directly. Documentation for the API is automatically generated, and can be found here.
Asynchronous jobs
Some tasks done by Dispatch may require longer periods of time to run. Though these usually last under 3 minutes, a single connection may not survive that long because of timeouts or network issues. For this reason, tasks which take a while to execute must be run through asynchronous job requests. An initial request is sent to Dispatch (using POST), which returns a job ID. The progress of the job can then be polled using a GET to the /{id}/progress sub-path of that endpoint. Lastly, once the job completes, its result can be accessed with a GET to the /{id} sub-path of that endpoint.
Note that attempting to access the result before the job has completed results in a 409 Conflict HTTP error. The data is usually cached for an hour before being discarded. Refer to the documentation for the task information schema.
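The POST, poll, and fetch sequence described above can be sketched in TypeScript as follows. This is only an illustration, not Dispatch's or Deputy's actual client code: the function name runDispatchJob, the option names, the JSON request body, and the `id` field holding the job ID are all assumptions, so check the generated API documentation for the real schemas. The HTTP function is passed in as a parameter so that the flow can be tried without touching the live service.

```typescript
// Minimal slice of a Fetch-style response used by this sketch.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

interface HttpInit {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}

// Any fetch-compatible function; injected so the flow can be exercised offline.
type HttpFetch = (url: string, init?: HttpInit) => Promise<HttpResponse>;

interface PollOptions {
  pollIntervalMs?: number; // wait between polls
  maxAttempts?: number; // give up after this many polls
  onProgress?: (progress: unknown) => void; // receives the raw progress body
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: starts a job, polls it, and resolves with its result.
async function runDispatchJob(
  endpoint: string,
  payload: unknown,
  fetchFn: HttpFetch,
  options: PollOptions = {},
): Promise<unknown> {
  const { pollIntervalMs = 2000, maxAttempts = 150, onProgress } = options;

  // Step 1: start the job with a POST; Dispatch answers with a job ID.
  const started = await fetchFn(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!started.ok) {
    throw new Error(`Could not start job: HTTP ${started.status}`);
  }
  // ASSUMPTION: the job ID is returned in an `id` field of a JSON body.
  const { id } = (await started.json()) as { id: string };

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Step 2: poll the progress sub-path. The body is handed to the caller
    // uninterpreted, because its schema is defined by the API documentation.
    const progress = await fetchFn(`${endpoint}/${id}/progress`);
    if (progress.ok && onProgress) {
      onProgress(await progress.json());
    }

    // Step 3: request the result. 409 Conflict means "not finished yet".
    const result = await fetchFn(`${endpoint}/${id}`);
    if (result.ok) {
      return result.json();
    }
    if (result.status !== 409) {
      throw new Error(`Job ${id} failed: HTTP ${result.status}`);
    }
    await sleep(pollIntervalMs);
  }
  throw new Error(`Job ${id} did not finish after ${maxAttempts} polls`);
}
```

On Node.js 18 or later, the global fetch function should be usable as the fetchFn argument. The sketch treats a 409 response as the signal to keep waiting instead of interpreting the progress body, since that is the only completion behaviour described on this page. Because results are usually cached for only about an hour, a caller should fetch the result reasonably soon after the job completes.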
Deployment
The deputy tool uses a standard Node.js web service to operate. Logging into the tool with become deputy will drop you into the ~/www/js folder so you can get to work immediately.
Deployments are not automatic. As the Wikimedia GitLab instance develops, this may change in the future. For now, the following steps are used to deploy new versions of the tool.
1. [me@tools-sgebastion-XX] become deputy
   - That's pretty obvious already.
2. [tools.deputy@tools-sgebastion-XX] git pull
   - Change as needed to download specific tags, branches, etc.
3. [tools.deputy@tools-sgebastion-XX] webservice shell
   - Getting a shell on the actively running webservice ensures that the correct Node.js version is being used.
   - When in doubt, you can use a temporary Node.js pod as well.
4. [tools.deputy@shell-XXXXXXXXXX] cd $HOME/www/js
   - In case you're not there already.
5. [tools.deputy@shell-XXXXXXXXXX] npm ci
   - This re-downloads dependencies.
6. [tools.deputy@shell-XXXXXXXXXX] npm run build:tsoa
   - This rebuilds the API routes and documentation.
   - Routes are handled by the package tsoa, hence the name.
   - Contrary to what you'd expect, TypeScript isn't actually transpiled into JavaScript in this step. TypeScript is run directly (see the webservice restart step below).
7. [tools.deputy@shell-XXXXXXXXXX] exit
   - In other words, leave the webservice shell.
8. [tools.deputy@tools-sgebastion-XX] webservice restart
   - This restarts the webservice, which makes the tool use the new code.
   - All done, you can now log out and exit.
For deployment issues, you can email wikichlod.net or use Special:EmailUser/Chlod Alejandro. If you both break and fix Dispatch (and you're not User:Chlod Alejandro), you get a complimentary chocolate chip cookie.
Development
Instructions on how to get a development setup of Dispatch can be found at https://github.com/ChlodAlejandro/deputy-dispatch#contributing. Note that you will need a Toolforge account, because you'll need access to the Wiki Replicas. Running Dispatch without properly setting up the database connection information will cause a big warning to show up, and all endpoints which rely on the Replicas will return errors.