Portal:Data Services/Admin/Wiki Replicas
This page is currently a draft.
More information and discussion about changes to this draft can be found on the talk page.
This page holds all the knowledge we have regarding Wiki Replicas for admins.
Service architecture by layers
Description of how the service is currently deployed.
(General diagram of the service architecture.)
This service is built on top of several physical servers.
All servers share the same hardware:
- HP ProLiant DL380 Gen9
- 16 CPU Intel(R) Xeon(R) CPU E5-2637 v3 @ 3.50GHz
- 500GB RAM
- 1 hard disk at /dev/sda with 11.7 TiB (probably RAID, using the hpsa kernel driver)
- 4 x 1Gb Ethernet interfaces (tg3)
All monitoring, including RAID status, is done by Icinga.
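For a quick manual check of the RAID controller and disk health, something like the following should work. This is a sketch only: it assumes the HP Smart Array CLI (hpssacli) is installed alongside the hpsa driver, and the slot number may differ per host.

% sudo hpssacli controller all show status                 # overall controller health
% sudo hpssacli controller slot=0 physicaldrive all show   # per-disk status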
The service has a specific storage configuration.
As seen by fdisk:
% sudo fdisk -l
Disk /dev/sda: 11.7 TiB, 12802299617280 bytes, 25004491440 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 262144 bytes / 2097152 bytes
Disklabel type: gpt
Disk identifier: 0933E022-348E-4F59-8F84-D4C3B32090BD

Device        Start         End     Sectors  Size Type
/dev/sda1      4096    78123007    78118912 37.3G Linux filesystem
/dev/sda2  78123008    93749247    15626240  7.5G Linux swap
/dev/sda3  93749248 25004490751 24910741504 11.6T Linux LVM

Disk /dev/mapper/tank-data: 11.6 TiB, 12754295455744 bytes, 24910733312 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 262144 bytes / 2097152 bytes
As seen by the LVM stack:
% sudo pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/sda3  tank lvm2 a--  11.60t    0
% sudo vgs
  VG   #PV #LV #SN Attr   VSize  VFree
  tank   1   1   0 wz--n- 11.60t    0
% sudo lvs
  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data tank -wi-ao---- 11.60t
As seen by df:
% df -h
Filesystem             Size  Used Avail Use% Mounted on
udev                   252G     0  252G   0% /dev
tmpfs                   51G  4.1G   47G   8% /run
/dev/sda1               37G   11G   25G  31% /
tmpfs                  252G     0  252G   0% /dev/shm
tmpfs                  5.0M     0  5.0M   0% /run/lock
tmpfs                  252G     0  252G   0% /sys/fs/cgroup
/dev/mapper/tank-data   12T  6.5T  5.2T  56% /srv
All 3 servers should have more or less the same storage configuration.
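For reference, a minimal sketch of how this LVM layout could be recreated on a fresh host. Device and volume names match the output above; the filesystem type is an assumption (df does not show it), so verify everything before running it anywhere:

% sudo pvcreate /dev/sda3                 # physical volume on the large partition
% sudo vgcreate tank /dev/sda3            # single volume group "tank"
% sudo lvcreate -l 100%FREE -n data tank  # one logical volume using all the space
% sudo mkfs.ext4 /dev/mapper/tank-data    # assumed filesystem type
% sudo mount /dev/mapper/tank-data /srv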
The databases in this service have a specific layout/configuration, as do the components around them: applications running on top, proxies, caches, et al.
The cluster is currently deployed using operations/puppet.git.
All the servers are given the labs::db::replica role, which currently includes:
- standard configs
- mariadb packages + configs
- ferm firewall + rules to allow the cluster servers to communicate with each other
- deployment of scripts, like maintain-views and friends
For the basic install, standard Puppet configuration applies as well.
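To apply or verify this configuration on a host, a normal Puppet run suffices. A hedged example (puppet agent --test is the standard invocation; WMF hosts also ship wrapper scripts for this):

% sudo puppet agent --test --noop   # dry run: show what the role would change
% sudo puppet agent --test          # apply the labs::db::replica role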
From production to wiki-replicas
Wiki databases from production are copied and sanitized before being served as wiki-replicas.
- There are dedicated servers for data sanitization
- triggers redact data: IP addresses, names, deleted revisions
- this happens before the data leaves production
- Some cron jobs check whether data is redacted, for example by checking that a given column is NULL (see the sketch after this list)
- Then, the data is copied to the wiki-replicas
- Finally, maintain-views and friends are run
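The following is an illustrative sketch of such a redaction check, not the actual cron job (which lives in operations/puppet.git); the table and column names are taken from the stock MediaWiki schema and are assumptions here:

% mysql -BN -e "SELECT COUNT(*) FROM user WHERE user_password <> '' OR user_token <> '';" enwiki
0

A non-zero count would mean unredacted rows made it through. Once the sanitized data is in place, the public views are rebuilt; the script path below is the usual deployment target, but verify it on the host:

% sudo /usr/local/sbin/maintain-views --databases enwiki --debug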
TODO: actually explain things
Docs to perform common tasks related to this service, as detailed as possible.
Before this new cluster, we had an old cluster composed of physical servers: labsdb1001, labsdb1002 and labsdb1003.
At the time of this writing, the last remaining server is labsdb1003, which is going down in December 2017.