Media storage/Backups

The backup servers are listed in puppet (hieradata). Worker servers are the ones used to download and upload files during backup and recovery, as well as to pre-process them (e.g. hashing them and checking their integrity). Storage servers run minio (an S3 API-compatible service) and hold the data long term. For now, discovery is completely static, as that gives us high flexibility to depool a server.

Each datacenter has its own separate set of credentials.

At the moment, the clustering functionality of minio is not used, meaning one has to access each storage server (minio server) individually. Sharding is based on the sha256 hash of the file, with the hash space split evenly across the configured servers, in the order they are configured. For example, with:

endpoints:
  - https://backup1004.eqiad.wmnet:9000
  - https://backup1005.eqiad.wmnet:9000
  - https://backup1006.eqiad.wmnet:9000
  - https://backup1007.eqiad.wmnet:9000

Files whose sha256 hash starts (in hexadecimal) with 0-3 go to backup1004, 4-7 to backup1005, 8-B to backup1006 and C-F to backup1007. This is not guaranteed to stay like that, as servers will likely be unavailable for maintenance at times, and the number of servers will be expanded, which means eventually one will be forced to use the metadata database to locate the server where a file is stored (this can be done with the recovery script).
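
As an illustration only, and assuming the even split of the first hexadecimal digit described above (the endpoint list and file path below are placeholders), the mapping can be sketched like this:

 import hashlib
 
 # Endpoints in the order they are configured (see the list above); illustrative values.
 ENDPOINTS = [
     "https://backup1004.eqiad.wmnet:9000",
     "https://backup1005.eqiad.wmnet:9000",
     "https://backup1006.eqiad.wmnet:9000",
     "https://backup1007.eqiad.wmnet:9000",
 ]
 
 def storage_endpoint(path):
     """Return the endpoint expected to hold the file, assuming the hash
     space (first hex digit, 0-F) is split evenly among the endpoints."""
     with open(path, "rb") as f:
         digest = hashlib.sha256(f.read()).hexdigest()
     first_digit = int(digest[0], 16)             # 0-15
     index = first_digit * len(ENDPOINTS) // 16   # 0-3 -> backup1004, 4-7 -> backup1005, ...
     return ENDPOINTS[index]

Keep in mind the caveat above: as soon as servers are depooled or added, this static mapping no longer holds and the metadata database is the authoritative source.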

Small recovery script

[File:Mediabackups recovery prototype.png | Proposed prototype for small-scale media recovery]

In order to recover one or a few files (e.g. all versions under the same file name) from backups, it is possible to use the proof-of-concept interactive script restore_media_file.py. This script will locate the desired file by querying the backup database and save the found files to the local filesystem for custom recovery. A separate workflow should be designed for large-scale recoveries (e.g. all files from a wiki lost).

 WIP: how to use it
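
Until that is documented, the sketch below shows, conceptually, the kind of lookup the script performs against the backup metadata database. All connection parameters, table and column names here are hypothetical placeholders and do not reflect the real mediabackups schema:

 import pymysql
 
 # Hypothetical connection parameters and schema, for illustration only.
 conn = pymysql.connect(host="db-host.example", user="mediabackup_ro",
                        password="********", database="mediabackups")
 with conn.cursor() as cur:
     # Find where every stored version of a given title ended up (hypothetical columns).
     cur.execute(
         "SELECT sha256, storage_server, storage_path "
         "FROM files WHERE wiki = %s AND title = %s",
         ("commonswiki", "Example.jpg"),
     )
     for sha256, server, path in cur.fetchall():
         print(sha256, server, path)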

How to access the web UI of minio

The integrated web client of minio, while simple, makes it easy to manage, list and upload/download files to/from the backend with a more user-friendly interface.

minio access is firewalled: only its service port is open, and only to the backup workers and to prometheus (for metrics gathering) from the same datacenter. To gain access, one needs to tunnel HTTPS on port 9000 to a local port with ssh, through a server with access (e.g. a worker server from the same datacenter).

For example:

ssh -L 1234:backup1004.eqiad.wmnet:9000 ms-backup1001.eqiad.wmnet

will tunnel the minio service to the local port 1234 through ms-backup1001.

For the actual active worker and storage servers, consult hieradata.
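
To check that the tunnel works before opening the browser, one can query minio's standard liveness endpoint through it. This is a quick sketch; the local port matches the ssh example above, and certificate verification is skipped because the discovery CA is not installed locally:

 import requests
 
 # A 200 response means both the tunnel and the minio service are up.
 resp = requests.get("https://localhost:1234/minio/health/live", verify=False)
 print(resp.status_code)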

[File:MinIO login.png | Minio login screen]

Then open https://localhost:1234 in your browser. The https is important, as non-TLS traffic is not allowed.

Your browser will complain about the lack of a trusted TLS certificate; this is because the service uses the discovery CA, which is only deployed to the WMF production cluster. Either install that CA on your client PC or click "Accept risk and continue".

A login screen should appear. Credentials are deployed on the worker servers at /etc/mediabackup/mediabackups_recovery.conf. They are also available on private-puppet:hieradata/common.yaml and private-puppet:hieradata/{eqiad, codfw}.yaml.

[File:MinIO browser.png | Minio file browser]

It is highly recommended to use the credentials used for backup recovery (not backup generation), as those are read-only. If writing is needed, it is more reliable to use the command line client (mc), unless you know what you are doing.

After logging in, a file browser screen should appear, allowing you to navigate the file structure, download files, etc. The minio browser will not disable the options for deleting or uploading when using read-only credentials; those operations will simply fail.
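
As an alternative to the web UI, the same S3 API can be used programmatically through the tunnel. The following is a minimal sketch with boto3, assuming the local port from the ssh example above and the read-only recovery credentials; the bucket and object names are hypothetical placeholders:

 import boto3
 
 # Read-only access through the ssh tunnel; verify=False because the
 # discovery CA is not installed on the local machine.
 s3 = boto3.client(
     "s3",
     endpoint_url="https://localhost:1234",
     aws_access_key_id="RECOVERY_ACCESS_KEY",      # from mediabackups_recovery.conf
     aws_secret_access_key="RECOVERY_SECRET_KEY",  # from mediabackups_recovery.conf
     verify=False,
 )
 # List the available buckets and download one object (hypothetical names).
 print([b["Name"] for b in s3.list_buckets()["Buckets"]])
 s3.download_file("mediabackups", "some/object/key", "recovered_file")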