Portal:Data Services/Admin/Wiki Replicas

Revision as of 19:00, 21 November 2017 by Arturo Borrero Gonzalez (→From production to wiki-replicas: interchange steps 4 and 5)

This page holds all the knowledge we have regarding Wiki Replicas for admins.

Service architecture by layers

Description of how the service is currently deployed.

Here is a general diagram of the service architecture:


Physical layer

This service is built on top of several physical servers.

All servers have the same hardware:

  • HP ProLiant DL380 Gen9
  • 16 CPUs: Intel(R) Xeon(R) E5-2637 v3 @ 3.50GHz
  • 500 GB RAM
  • 1 hard disk at /dev/sda with 11.7 TiB (probably RAID, using the hpsa kernel driver)
  • 4 x 1 Gb Ethernet interfaces (tg3)

All monitoring, including RAID status, is done by icinga.

Storage layer

The service has a concrete storage configuration.

As seen by fdisk:

As seen by the LVM stack:

As seen by df:

All 3 servers should have more or less the same storage configuration.

The definition for this storage layout is done at install time:

DB layer

The databases in this service have a concrete layout/configuration.

TODO: fill info

Higher layers

Such as applications running on top, proxies, caches, et al.


Main article: Help:Toolforge/Database#Naming conventions

The DNS names of the wiki-replicas databases as seen from the SQL client point of view are things like:

  • ${PROJECT}.{analytics,web}.db.svc.eqiad.wmflabs. Example: enwiki.web.db.svc.eqiad.wmflabs
  • s${SHARD_NUMBER}.{analytics,web}.db.svc.eqiad.wmflabs. Example: s1.web.db.svc.eqiad.wmflabs
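The naming pattern above can be captured in a small helper. This is purely illustrative shell (the function does not exist on the servers); only the hostname pattern comes from the list above.

```shell
# Illustrative helper (not part of the deployment): build a wiki-replica
# hostname from a database or shard name plus the service flavor.
replica_host() {
    local db="$1"       # e.g. "enwiki", or a shard such as "s1"
    local flavor="$2"   # "analytics" or "web"
    echo "${db}.${flavor}.db.svc.eqiad.wmflabs"
}

replica_host enwiki web       # -> enwiki.web.db.svc.eqiad.wmflabs
replica_host s1 analytics     # -> s1.analytics.db.svc.eqiad.wmflabs
```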

Puppet deployment

The cluster is currently deployed using operations/puppet.git.

All the servers are given the labs::db::replica role, which currently includes:

  • standard configs
  • mariadb packages + configs
  • ferm firewall + rules to allow the servers in the cluster to communicate with each other
  • deployment of scripts, like maintain-views and friends

From production to wiki-replicas

Wiki databases from production are copied and sanitized to serve wiki-replicas.

Step 0: databases in production

The starting situation is that there are databases in production for wiki projects (like wikipedia, wikidata, wiktionary, and friends). We would like to provide these same databases to WMCS users. For privacy reasons, some data needs to be redacted or deleted; that's why users can't directly access these databases.

So, we choose which databases to copy to wiki-replicas; currently, that is all of them.

Every time a new database is created in production (for example, a new language for a wiki) we are in this step 0.

New database candidates for migration to wiki-replicas are identified on request. Right now there is no mechanism to notify about pending migrations or the like.

Step 1: sanitization

Main article: MariaDB/Sanitarium and Labsdbs

The production database is copied to sanitarium boxes by means of MariaDB replication mechanisms (TODO: is this true? give more info if possible).

Each sanitarium host has a MariaDB instance to replicate each db shard. The replication into the sanitarium host uses triggers and filters to remove sensitive columns, tables and databases in the simple case where there are no conditions (e.g. ensuring that user_password does not reach Cloud Services).
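The general shape of such a redaction trigger can be sketched as below. The trigger name and the specific table/column are assumptions for illustration (the real definitions are maintained by the DBAs); the snippet only prints the SQL statement rather than applying it.

```shell
# Hypothetical example of the kind of redaction trigger used on sanitarium
# hosts: blank a sensitive column on every replicated insert.
# This only prints the statement; a DBA would apply it via the mysql client.
redaction_trigger() {
    local table="$1" column="$2"
    cat <<SQL
CREATE TRIGGER ${table}_insert_redact
BEFORE INSERT ON ${table} FOR EACH ROW
SET NEW.${column} = '';
SQL
}

redaction_trigger user user_password
```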

Having this redaction done on a separate host outside of Cloud Services helps isolate the data and ensures that a privilege escalation via Cloud Services access does not compromise the most sensitive data in the db.

Some triggers are added by means of:

This step is handled by the main operations team (DBAs).

Step 2: evaluation

Main article: Labsdb redaction

Once data is in sanitarium boxes, some cron jobs and manual scripts check whether data is actually redacted. For example, check that a given column is NULL.
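Such a check boils down to a query of the shape below. The table and column names are made up for illustration, and the function only emits the query text; in practice a cron job would run it via the mysql client and alert when the count is non-zero.

```shell
# Illustrative redaction check: count rows where a column that should have
# been blanked still carries data. Prints the query instead of running it.
redaction_check_query() {
    local table="$1" column="$2"
    echo "SELECT COUNT(*) FROM ${table} WHERE ${column} IS NOT NULL AND ${column} <> '';"
}

redaction_check_query user user_password
```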

Involved code:

This evaluation also happens on the wiki-replica servers, and raises alerts in case private data is detected.

The main operations team (DBAs) is in charge of this step.

TODO: where are the cron jobs?

Step 3: filling up wiki-replicas

Data is finally copied to wiki-replica servers.

TODO: How is this data copied?

TODO: Is there any method to perform real-time replication? If not, data in wiki-replicas would be outdated soon.

TODO: Who does this step?

Step 4: setting up GRANTs

Create database GRANTs (done by the main operations team, DBAs).

TODO: elaborate info

Step 5: setting views

Create the _p views, which are intermediate views that leave out private data.

This is done by means of the maintain-views script.

In each wiki-replica server, it's executed like this:

% sudo maintain-views --databases $wiki --debug
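When several wikis are created at once, the invocation above can simply be looped. This sketch only echoes the commands it would run, since the real ones require root on the wiki-replica host (the wiki names here are placeholders):

```shell
# Sketch only: run maintain-views for a batch of new wikis.
# Echoes each command instead of executing it.
for wiki in foowiki barwiki; do
    echo sudo maintain-views --databases "$wiki" --debug
done
```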

This step is handled by the WMCS team.

TODO: elaborate info on what is this doing

Step 6: setting up metadata

Insert a new row for the new wiki by running the maintain-meta_p script.

The execution is like this:

% sudo /usr/local/sbin/maintain-meta_p --databases $wiki
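In essence, maintain-meta_p registers the wiki in the meta_p metadata database. The row below is a hypothetical illustration (the column names are assumptions, not the real schema), and the function only prints the statement:

```shell
# Hypothetical sketch of the kind of row maintain-meta_p manages in the
# meta_p.wiki table (column names assumed for illustration).
meta_p_insert() {
    local dbname="$1" url="$2"
    echo "INSERT INTO meta_p.wiki (dbname, url) VALUES ('${dbname}', '${url}');"
}

meta_p_insert foowiki https://foo.wikipedia.org
```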

This step is handled by the WMCS team.

TODO: elaborate info on what is this doing

Step 7: setting up DNS

Create a patch to add the database to the correct list in hieradata/common/profile/openstack/base/pdns/labsdb.yaml.

Also, on the appropriate host, run:

% source <(sudo cat ~root/
% sudo /usr/local/sbin/wikireplica_dns --aliases --shard <sN>

This step is done by the WMCS team.

TODO: elaborate info on what is this doing

Step 8: all is done

All is done: the wiki-replicas contain a sanitized mirror of the production databases. Finally, WMCS users/tools/projects are able to query the databases/tables.

This is usually done by using the sql wrapper script.

TODO: the benefits of using the sql script.

Admin guide

Docs to perform common tasks related to this service. As detailed as possible.

Who admins what

  • main production cluster: main operations team, DBAs
  • sanitarium cluster: main operations team, DBAs
  • wiki-replicas cluster: WMCS, with some support from DBAs
  • wiki-replicas DNS: WMCS


Before this new cluster, we had an old cluster composed of physical servers: labsdb1001, labsdb1002 and labsdb1003.

At the time of this writing, the last standing server is labsdb1003, which is going down in December 2017.

See also