Analytics/Systems/Wikistats 2


About Wikistats

Wikistats 2.0 Main.png
See also mw:Analytics/Wikistats

Wikistats is the public statistics website of the Wikimedia Foundation (not to be confused with the Cloud VPS project also called Wikistats). Its main purpose is to add context and motivate our editor community by providing a set of metrics through which users can see the impact of their contributions in the projects they are a part of. In Wikistats 2 we are not only updating the website interface but also providing new access to all our edit data in an analytics-friendly form. Moving from static, precomputed datasets generated periodically to APIs that query our data lake drastically improves (and fundamentally changes) how edit metrics are calculated, and the time and resources it takes, both for the WMF and the community.

There are notable differences between the UIs of the "old" and "new" Wikistats. However, the main difference between the two systems is the backend. Wikistats 2's metrics are computed by extracting data from MediaWiki databases, processing it, and re-storing it in an analytics-friendly form so that metrics can be extracted easily. All data used and served by Wikistats 2 is public; the source of data for the system is the new database replicas on labs.

Design Process

During Q2 and Q3 of the 2016-2017 FY, the Analytics team contracted designer Aislinn Grigas to produce the design of the new Wikistats web application. There were two rounds of public requests for feedback: one for the preliminary wireframes, and one for a final candidate of the design. Prototyping and implementation of the website started in Q4 2017. The designs and wireframes produced are archived in MediaWiki here


Wikistats 2 Backend.png

Wikistats 2 is a client-side-only single-page application: it has no server component and can be served from anywhere (Varnish, Apache, or even Amazon S3). This way Wikistats abides by the right-to-fork rule, as recreating it on a different server or on a local machine only requires cloning the repository and following the installation steps.

The application user interface is divided into two main sections:

  • The dashboard: where the ~12 most important metrics are shown with a small graph (when applicable) and some simple aggregations (sum, increase/decrease over time, etc.). All metrics belong to one of three main areas: reading, contributing, and content.
  • The detail page: once a metric is clicked, the app transitions to a page showing the full graph of that metric, with the option to view the underlying data directly in a table. The page includes UI elements to gain more insight into the data, such as breakdown selectors (splitting the metric according to several possible criteria) and the time-range and granularity selectors.

The backend layer for Wikistats 2 is in alpha and consists of a set of API endpoints in the Analytics Query Service.

Local install for development

Cloning the project

The minimum requirements to install the Wikistats UI are Node.js (with the npm package manager) and Git. The project is hosted in a Gerrit repository (see Contributing and Deployment below).

git clone
cd wikistats2
npm install

Third-party UI elements

Wikistats uses many components from the Semantic UI library, which requires a special initialization with gulp when installing the project:

npm install -g gulp
git clone semantic
cd semantic
gulp build

Generating the bundle

Lastly, you need to generate the JavaScript bundle that contains the Wikistats project, its dependencies, and the stylesheets. Assuming you want a development environment, run:

npm run dev

This command will set up a watcher that rebuilds the bundle each time a project file changes. The development build does not minify the bundle, so the code stays readable within the browser's developer tools. The static site is generated in ./dist-dev within your wikistats repository directory. To see the built site you need a simple HTTP server, such as Python's built-in one:

python -m SimpleHTTPServer 5000   # Python 2
python3 -m http.server 5000       # Python 3

The application should now be available at localhost:5000



Wikistats uses Vue.js as its web framework. All the components that make up the application's structure are stored in the src/components directory, the most important ones being App.vue, Dashboard.vue, and Detail.vue. We recommend installing the Vue Developer Tools for your web browser to get a clear picture of what each component is doing and what data it is handling.

Vue 3 brings performance wins and better reactivity, so let's evaluate a possible migration. Version 3 is more modular and allows better tree-shaking when building, as well as a better static-analysis compiler. We could get some of these benefits with a simple upgrade, but a true migration would mean rewriting our components to use the Composition API and rethinking our bundling. The main benefit would be performance, and that's not a major concern right now. I would say we have more to gain from a general clean-up of state and routing, making sure all updates to state are consistently done in the same way through Vuex. This would be a maintenance win: the majority of our time on Wikistats in the past year has been spent puzzling out problems with state. Other priorities that rank above a migration to the Composition API include: increased security by moving to a vetted npm package repository, the data-exploration UI that Fran proposed, and general improvements to look and feel. Wikimedia at large is starting out with Vue 2 on its proof-of-concept projects, so I propose we align with them. Sharing conventions and best practices seems like time better spent than upgrading. For others reading this, see the docs on the Composition API and migration.

State management and data flow

We try to avoid passing properties down the Vue component hierarchy more than two levels. If a property is important enough that it should be passed across the whole application, we prefer it to be accessed via a state manager. We use Vuex as our state manager, which is declared in src/store.
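As a sketch of that pattern, a Vuex store module looks roughly like the object below. The state shape and mutation name are hypothetical (the real store lives in src/store); it is shown as a plain object and the mutation is invoked by hand, since the vuex library itself isn't loaded here.

```javascript
// Sketch of a Vuex-style store module (hypothetical state shape; the real
// store is defined in src/store). In the app, an object like this is passed
// to new Vuex.Store(...) and components commit mutations instead of mutating
// state directly, so every state change goes through one consistent path.
const store = {
    state: {
        project: 'en.wikipedia.org',
        metric: null
    },
    mutations: {
        // A component would call this.$store.commit('setMetric', 'edits');
        setMetric(state, metricName) {
            state.metric = metricName;
        }
    }
};

// Invoke the mutation directly, purely for illustration.
store.mutations.setMetric(store.state, 'edits');
console.log(store.state.metric); // → 'edits'
```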

Metric models

The metric data coming from the APIs is converted into a DimensionalData (src/model/DimensionalData.js) object, which uses Crossfilter.js as its local storage. The DimensionalData API lets the application easily filter, break down, and aggregate the data being explored in the application.
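To illustrate the filter/break-down/aggregate idea that DimensionalData wraps Crossfilter for, here is a plain-JS sketch. The row shape and function name are illustrative only, not the actual DimensionalData or Crossfilter API.

```javascript
// Illustrative rows of the kind a metric API might return
// (hypothetical shape, not the real API response).
const rows = [
    { month: '2017-01', editorType: 'user', edits: 120 },
    { month: '2017-01', editorType: 'bot',  edits: 300 },
    { month: '2017-02', editorType: 'user', edits: 90 }
];

// Break the data down by a dimension and sum a measure, like a breakdown
// selector on the detail page would. Crossfilter does this efficiently with
// dimensions and groups; this reduce() just shows the concept.
function breakdownSum(data, dimension, measure) {
    return data.reduce((acc, row) => {
        acc[row[dimension]] = (acc[row[dimension]] || 0) + row[measure];
        return acc;
    }, {});
}

console.log(breakdownSum(rows, 'editorType', 'edits'));
// → { user: 210, bot: 300 }
```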


We use d3 version 4+ for our visualizations. With this release of d3, it's possible to include only the modules we use instead of the whole library, which allows us to bundle and optimize Wikistats for mobile devices.
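For example, with d3 4+ the app can import just d3-scale rather than the whole library. The inline function below reimplements the core of a linear scale in plain JS, purely to illustrate what such a module does; it is not the real d3 code.

```javascript
// With d3 v4+ you import only the modules you use, e.g.:
//   import { scaleLinear } from 'd3-scale';
// so webpack bundles just that module instead of all of d3.
// The function below is a plain-JS illustration of what a linear scale
// computes (not the actual d3 implementation):
function scaleLinear(domain, range) {
    const [d0, d1] = domain;
    const [r0, r1] = range;
    return x => r0 + ((x - d0) / (d1 - d0)) * (r1 - r0);
}

// Map metric values 0..1000 onto 300 pixels of chart height
// (inverted, since SVG y grows downward).
const y = scaleLinear([0, 1000], [300, 0]);
console.log(y(500)); // → 150
```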


Wikistats 2 is localized via Translatewiki and date/time/number formatting libraries. As translation to new languages is completed, we can include a language variant in the build by adding it to the src/languages.json file. When building, we print out the percentage translated for languages that are not yet included. If you're deploying Wikistats and see newly completed translations, check the Translatewiki link and add the language if everything looks good.


Tests are located in the test directory. We use Jasmine as our testing library and Karma as the test runner. Running the following:

npm test

will initialize a Karma watcher that runs the webpack bundler each time a test changes and evaluates the whole test suite, printing any failures to the console. Beware that by default, npm test will use Google Chrome as the testing browser. If you're using a different browser or environment, you should change it in karma.conf.js
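A spec in the test directory looks roughly like the sketch below. The component under test (formatChange) is hypothetical; the describe/it/expect globals normally come from Karma + Jasmine, so tiny shims are included here only to make the snippet self-contained.

```javascript
// Minimal standalone sketch of a Jasmine-style spec. In the real test/
// directory, describe/it/expect are globals provided by Karma + Jasmine;
// the shims below exist only so this file runs on its own.
function describe(name, fn) { console.log(name); fn(); }
function it(name, fn) { fn(); console.log('  ok: ' + name); }
function expect(actual) {
    return {
        toEqual(expected) {
            if (actual !== expected) {
                throw new Error('Expected ' + expected + ', got ' + actual);
            }
        }
    };
}

// Hypothetical unit under test: a percent-change formatter of the kind
// the dashboard aggregations might use (not actual Wikistats code).
function formatChange(previous, current) {
    const delta = ((current - previous) / previous) * 100;
    return (delta >= 0 ? '+' : '') + delta.toFixed(1) + '%';
}

describe('formatChange', () => {
    it('formats an increase with a plus sign', () => {
        expect(formatChange(100, 125)).toEqual('+25.0%');
    });
    it('formats a decrease with a minus sign', () => {
        expect(formatChange(200, 150)).toEqual('-25.0%');
    });
});
```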

Additionally, there are smoke tests to be performed with each significant change to the codebase, which are described in Analytics/Wikistats 2/Smoke testing.

Contributing and Deployment

Git repo

Wikistats 2 used Phabricator's repository and code review tools (Differential) as its code hosting solution until the end of Q1 2017. As of October 2017, like most Wikimedia projects, Wikistats 2's code is hosted in its Gerrit repository and mirrored on GitHub as a read-only repo. Read the Wikitech page for information on how to contribute to projects using Gerrit.

When you've created a gerrit-compliant git commit (with a change-id appended), you can open a new code review by running:

git push origin HEAD:refs/for/master


Smoke tests

Main article: Analytics/Wikistats 2/Smoke Testing

Adding languages

When a language in Wikistats translatewiki page has a translation coverage of 75% or more, we consider it to be ready for production. However, newly translated languages need a manual step to be included in production. Here's a step-by-step on how to add those languages:

  • Go to Wikistats translatewiki and sort the languages by coverage.
  • Open the src/languages.json file in the Wikistats repo.
  • Find the languages that have coverage higher than 75% but are not yet in the languages.json file.
  • For each one of them add a code snippet in languages.json (in alphabetical order if possible) like:
"ko": {
  "numbroCode": "ko-KR",
  "englishName": "Korean",
  "nativeName": "한국어"
},
  • The key should match the short language code specified in translatewiki's "Language" column (before the colon).
  • The numbroCode should match the most appropriate language code supported by numbro. If numbro does not support the language, use "en".
  • You can usually find the native name of a language in Wikipedia :]
  • Finally, run a build to make npm collect all the specified languages from translatewiki.


Testing in beta

Even though it's not required for all changes to the UI, it is recommended to put the site bundle generated by your Gerrit change on our canary website. It can be helpful for the team or other people to test your feature in different browsers or devices, especially if they don't have the Wikistats dev environment set up on their machines.

Because we want to be able to easily debug the site in the canary, we push the development bundle of the UI. In your Wikistats directory:

rm -rf dist-dev
npm run dev

When the bundle is generated, quit the process and copy the contents of the dist-dev directory to the canary machine (this assumes you have access to Labs):

scp -r ./dist-dev/* @dashiki-staging-02.eqiad.wmflabs:/srv/static/<<some-folder-name>>/

Once the copy is complete, the site should be available at<<some-folder-name>>

Releasing a new version to production

(A new approach using a Dockerfile, WIP, skip past this section until it's done)

# from the wikistats folder on your machine:
docker run --rm --volume $(pwd):/ws -it debian:buster bash -l
echo "deb buster-backports main" > /etc/apt/sources.list.d/buster-backports.list
apt-get update
apt-get install -t buster-backports npm git
# this seems to work, one small annoyance: it uses the docker root user to build and output to dist/

(skip to here)

Deploying is done by pushing the latest stable app bundle to the master branch.

The first step is to bump up the version of Wikistats in the first lines of package.json:

    "name": "wikistats",
    "version": "x.y.z",

Then you need to generate the new bundle in a clean dist folder:

rm -rf dist/
npm install
npm run build

This will:

  1. delete the existing production bundle (rm)
  2. refresh package dependencies and package-lock.json file (npm install)
  3. generate a new bundle with all the JS and CSS minified and transpiled for compatibility with most browser versions (npm run build).

Before deploying, make sure to do some smoke tests in as many browsers as possible with the bundle you just created: go to the dashboard, click through to a metric, and change the breakdowns a bit.

  • TODO: add some guidance about how to do it.

If you haven't used the code for a long time, you might end up with a broken version of semantic. This causes the npm run build step to fail with an error (be careful, the error lines are GREEN!) mentioning a path within semantic. A working solution has been to delete the semantic repository and clone it again (The Hammer).

Then finally submit a new patch for the master branch:

git add package.json package-lock.json dist
git commit -m "Release x.y.z"
git tag x.y.z
git push origin HEAD:refs/for/master

Once you go to Gerrit and merge the change, it will get synced to the machine (from which the UI is served) by the next puppet run.

We used to release from the release branch but we changed that process on 2018-08-29. See:

Give it half an hour for the job to run and you should be able to see your changes live in

Supported browsers

Wikistats 2 uses ES6 syntax and thus needs transpiling libraries to ensure that browsers that do not implement ES6 can still display the app. It also uses babel-polyfill to avoid having to write browser-dependent code and to ensure all targeted browsers can display the site.
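To make the distinction concrete, the snippet below shows the two kinds of ES6+ features involved (the snippets are illustrative, not actual app code): new syntax, which Babel can rewrite into ES5, and new runtime built-ins, which need a polyfill because transpiling alone cannot add missing methods to older browsers like IE 10/11.

```javascript
// Illustrative ES6+ features of the kind babel handles for older browsers.
const metrics = ['edits', 'pages', 'editors'];

// Arrow functions and template literals are *syntax*: Babel rewrites
// them into equivalent ES5 at build time.
const labels = metrics.map(m => `total ${m}`);

// Array.prototype.includes is a *runtime built-in*: babel-polyfill adds
// it at page load, since transpiling cannot create missing methods.
console.log(metrics.includes('edits')); // → true
console.log(labels[0]);                 // → 'total edits'
```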

The following browsers are supported by Wikistats 2 and have been tested with positive results:

  • Last 2 versions of Google Chrome
  • Last 2 versions of Google Chrome Mobile
  • Last 2 versions of Safari
  • Last 2 versions of Mobile Safari
  • Last 2 versions of Mozilla Firefox
  • IE 10, IE 11 and Microsoft Edge

Project Development

For Analytics' team members

Phab tasks

Meeting notes



We talked about the overall objective of the project this quarter: "establish technical viability of our workflow of data". It has two main steps: bootstrapping of data and updates. Bootstrapping will be done from dumps or the database. Aaron mentioned that the db will be best as we might get better-quality data. Updates are to come from the event stream. Event Stream Schemas:

We talked about us having the ability to change publishing on the event stream from MW to whatever is needed; we got input from Aaron on the schemas that EventBus uses. Regarding scaling: loading of bootstrapping data is really a one-off. Can we move data from the Altiscale cluster to our cluster? Can we get a db slave just for this? (sounds like this last one is easy to do)

To reduce our iteration cost: can we calculate metrics for just one project to start? We are focusing on data workflow rather than data precision.

Action Items: Joseph to look at event stream schemas and to verify if we have enough data to do a metric vertical (Pages Created?)