MariaDB/Provisioning a host
File paths on this page are relative to the root of the operations/puppet repository.
Host preparation
Downtime the host in Icinga
Semi-automatically:
cookbook sre.hosts.downtime DBNAME --days 1 --task-id TXXXXXX --reason "provisioning - TXXXXXX"
Manually:
- Open https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=DBNAME
- Select "Schedule downtime for checked hosts" from the top right menu.
- TBD: next steps
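The semi-automatic command above can also be assembled from a script, for example when several hosts are provisioned under one task. This is a minimal sketch, not an official tool: the helper name build_downtime_command is hypothetical, T123456 is a placeholder task ID, and the function only formats the command with the flags shown above, without running it.

```python
import shlex


def build_downtime_command(host: str, task_id: str, days: int = 1) -> str:
    """Return the sre.hosts.downtime invocation shown above as one shell-safe string.

    Hypothetical helper: it only formats the command, it does not execute it.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    argv = [
        "cookbook", "sre.hosts.downtime", host,
        "--days", str(days),
        "--task-id", task_id,
        "--reason", f"provisioning - {task_id}",
    ]
    # shlex.join quotes the reason, because it contains spaces.
    return shlex.join(argv)


if __name__ == "__main__":
    # db1208 is the example host used on this page; T123456 is a placeholder.
    print(build_downtime_command("db1208", "T123456"))
```

Printing the command instead of executing it keeps the operator in control of when the downtime is actually set.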
Adding a new host
- TBD
Replacing a failed host
- Identify an available host in the same DC
  - TBD: how?
  - TBD: What to do if there aren't any?
- Match the OS version the replaced host was running
  - Reimage the host to Stretch if needed, following the instructions in Server_Lifecycle/Reimage
Section and replication configuration
Add the host to a section:
- In manifests/site.pp:
  - Add the DBNAME to the appropriate regexps
  - Remove the insetup role from DBNAME
- In hieradata/hosts/DBNAME.yaml add:
  - For single-instance hosts: mariadb::shard: 'SECTIONNAME'
  - For multi-instance hosts:
Configure the correct replication mode:
- In hieradata/hosts/DBNAME.yaml add mariadb::binlog_format: 'BINLOG_FORMAT', where BINLOG_FORMAT is:
  - STATEMENT (TBD: when?)
  - ROW (TBD: when?)
- (TBD: when?) Example for es* section: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1172192
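To make the expected hiera content concrete, the two keys named above can be rendered for a single-instance host. This is a sketch under stated assumptions: the function hiera_fragment is hypothetical, the s3/ROW pairing in the usage line is illustrative only (the page leaves the choice of format as TBD), and multi-instance hosts are not covered.

```python
VALID_BINLOG_FORMATS = ("STATEMENT", "ROW")  # the two values listed above


def hiera_fragment(section: str, binlog_format: str) -> str:
    """Render the two hiera keys named above for a single-instance host.

    Hypothetical helper: it only produces text to paste into
    hieradata/hosts/DBNAME.yaml; it does not touch the repository.
    """
    if binlog_format not in VALID_BINLOG_FORMATS:
        raise ValueError(f"unexpected binlog format: {binlog_format!r}")
    return (
        f"mariadb::shard: '{section}'\n"
        f"mariadb::binlog_format: '{binlog_format}'\n"
    )


if __name__ == "__main__":
    # Illustrative values only; which format a section needs is still TBD.
    print(hiera_fragment("s3", "ROW"), end="")
```

Rejecting anything other than the two listed values catches typos before they reach a Puppet change.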
Add the host to dbctl config (example)
- In conftool-data/dbconfig-instance/instances.yaml add DBNAME in the appropriate location
Add the host to MediaWiki's database load balancer configuration (example)
- Log on to one of the cluster management hosts (cumin1003.eqiad.wmnet, cumin2002.codfw.wmnet)
- Run sudo dbctl --scope DCNAME instance DBNAME edit, where DCNAME is eqiad or codfw
- Fill the template:
  - Set pooled: false
  - Replacing a failed host?
    - Yes: mirror the replaced host configuration
    - No: For now, just ask (TBD: guidelines?)
  - Set
- Run dbctl config commit -m "COMMIT_MESSAGE"
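The two dbctl invocations above can be generated per host to avoid typing mistakes in the scope or host name. This is a sketch only: dbctl_commands is a hypothetical helper, it reproduces the commands exactly as written on this page (including sudo on the first one only), and it prints them rather than running them, since the edit step is interactive.

```python
import shlex

DATACENTERS = ("eqiad", "codfw")  # the two DCNAME values named above


def dbctl_commands(dc: str, host: str, commit_message: str) -> list[str]:
    """Return the edit and commit commands from the steps above, in order."""
    if dc not in DATACENTERS:
        raise ValueError(f"DCNAME must be one of {DATACENTERS}, got {dc!r}")
    edit = ["sudo", "dbctl", "--scope", dc, "instance", host, "edit"]
    commit = ["dbctl", "config", "commit", "-m", commit_message]
    # shlex.join quotes the commit message if it contains spaces.
    return [shlex.join(edit), shlex.join(commit)]


if __name__ == "__main__":
    # db1208/eqiad is the example host on this page; the message is a placeholder.
    for command in dbctl_commands("eqiad", "db1208", "Add db1208, depooled"):
        print(command)
```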
Data population
Cloning
- Clone via the sre.mysql.clone cookbook
- Start MariaDB
- Wait until replication has caught up (all green on Icinga)
Restoring from backup
Replication from master
Enabling notifications (example):
- In hieradata/hosts/DBNAME.yaml delete profile::base::notifications: disabled
Add the host to the Zarcillo DB
Example: host db1208, joining core section s3, located in eqiad, rack A5.
The Zarcillo DB lives on db1215.
set session binlog_format=ROW;
INSERT INTO instances (name, server, port, `group`) VALUES ('db1208','db1208.eqiad.wmnet',3306, 'core');
INSERT INTO section_instances (instance, section) VALUES ('db1208','s3');
INSERT INTO servers (fqdn, hostname, dc, rack) VALUES ('db1208.eqiad.wmnet', 'db1208', 'eqiad', 'a5');
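The three INSERT statements above all derive from a handful of host facts, so they can be generated from one record instead of being edited by hand. This is a minimal sketch with several assumptions: NewInstance and zarcillo_statements are hypothetical names, the FQDN pattern HOSTNAME.DC.wmnet is inferred from the db1208 example, the rack is lower-cased to match the example value 'a5', and the %s placeholders follow the parameter style used by common Python MySQL drivers. The session-level binlog_format statement still has to be run first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewInstance:
    hostname: str
    dc: str
    rack: str
    section: str
    group: str = "core"
    port: int = 3306

    @property
    def fqdn(self) -> str:
        # Assumption: FQDNs follow HOSTNAME.DC.wmnet, as in db1208.eqiad.wmnet.
        return f"{self.hostname}.{self.dc}.wmnet"


def zarcillo_statements(inst: NewInstance) -> list[tuple[str, tuple]]:
    """Return (sql, params) pairs mirroring the three INSERTs above."""
    return [
        ("INSERT INTO instances (name, server, port, `group`) "
         "VALUES (%s, %s, %s, %s)",
         (inst.hostname, inst.fqdn, inst.port, inst.group)),
        ("INSERT INTO section_instances (instance, section) VALUES (%s, %s)",
         (inst.hostname, inst.section)),
        ("INSERT INTO servers (fqdn, hostname, dc, rack) "
         "VALUES (%s, %s, %s, %s)",
         (inst.fqdn, inst.hostname, inst.dc, inst.rack.lower())),
    ]


if __name__ == "__main__":
    example = NewInstance(hostname="db1208", dc="eqiad", rack="A5", section="s3")
    for sql, params in zarcillo_statements(example):
        print(sql, params)
```

Keeping the values as parameters rather than formatting them into the SQL string leaves quoting to the database driver.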
Next steps
- Ask other team members about schema changes that were executed since the timestamp of the backup you restored the host from and apply them (if needed)
- Pool the host in (TBD: link needed)
This page is a part of the SRE Data Persistence technical documentation (go here for a list of all our pages)