Difference between revisions of "MariaDB/Provisioning a host"

imported>LSobanski
m (LSobanski moved page Provisioning a DB Host to MariaDB/Provisioning a host: Changing to the common location)
 
imported>LSobanski
(→‎Replacing a failed host: Updating the reimage instructions)
 
(One intermediate revision by the same user not shown)

Latest revision as of 10:59, 11 October 2021

Host preparation

Downtime the host in Icinga

Semi-automatically:

cookbook sre.hosts.downtime DBNAME -D1 -t TXXXXXX -r "provisioning - TXXXXXX"
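The placeholders in the command above can be filled in by a small wrapper. This is a minimal sketch only: the function name downtime_cmd is hypothetical, the flags are copied unchanged from the command above, and the function prints the command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tooling): print the downtime
# command for a given host and task so it can be reviewed before running.
downtime_cmd() {
    dbname="$1"   # host name, replaces the DBNAME placeholder
    task="$2"     # task ID, replaces the TXXXXXX placeholder
    printf 'cookbook sre.hosts.downtime %s -D1 -t %s -r "provisioning - %s"\n' \
        "$dbname" "$task" "$task"
}

# With the placeholders left as-is, this reproduces the command shown above.
downtime_cmd DBNAME TXXXXXX
```

Printing instead of executing keeps the sketch safe to run anywhere; the printed line can then be pasted on the host where the cookbook is available.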

Manually:

  1. Open https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=DBNAME.
  2. Select "Schedule downtime for checked hosts" from the top right menu.
  3. TBD: next steps

Adding a new host

  1. TBD

Replacing a failed host

  1. Identify an available host in the same DC
    1. TBD: how?
    2. TBD: What to do if there aren't any?
  2. Match the OS version the replaced host was running
    1. Reimage the host to Stretch if needed, following the instructions in Server_Lifecycle/Reimage

Section and replication configuration

Add the host to a section:

  1. In manifests/site.pp:
    1. Add the DBNAME to the appropriate regexps
    2. Remove the insetup role from DBNAME
  2. In hieradata/hosts/DBNAME.yaml add
    1. For single-instance hosts: mariadb::shard: 'SECTIONNAME'
    2. For multi-instance hosts:

Configure the correct replication mode:

  1. In hieradata/hosts/DBNAME.yaml add: mariadb::binlog_format: 'BINLOG_FORMAT', where BINLOG_FORMAT is:
    1. STATEMENT (TBD: when?)
    2. ROW (TBD: when?)
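The two hiera keys named above for a single-instance host can be generated together. The following is a sketch under assumptions: the function name hiera_host_yaml is hypothetical, it only covers the single-instance case, and it only accepts the two binlog formats listed on this page. It prints the lines so they can be reviewed before being added to hieradata/hosts/DBNAME.yaml.

```shell
#!/bin/sh
# Hypothetical helper: print the hiera keys for a single-instance host.
# Only the two keys named on this page are emitted.
hiera_host_yaml() {
    section="$1"         # replaces SECTIONNAME
    binlog_format="$2"   # replaces BINLOG_FORMAT: STATEMENT or ROW
    case "$binlog_format" in
        STATEMENT|ROW) ;;
        *) echo "binlog format must be STATEMENT or ROW" >&2; return 1 ;;
    esac
    printf "mariadb::shard: '%s'\n" "$section"
    printf "mariadb::binlog_format: '%s'\n" "$binlog_format"
}

# Example with the page's placeholders left in place.
hiera_host_yaml SECTIONNAME ROW
```

Which format applies to which host is still marked TBD above, so the sketch validates the value but does not choose it.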

Add the host to dbctl config (example)

  1. In conftool-data/dbconfig-instance/instances.yaml add - DBNAME in the appropriate location

Add the host to MediaWiki's database load balancer configuration (example)

  1. Log on to a cumin host
  2. Run sudo dbctl --scope DCNAME instance DBNAME edit, where DCNAME is eqiad or codfw
  3. Fill the template:
    1. Set pooled: false
    2. Replacing a failed host?
      1. Yes: mirror the replaced host configuration
      2. No: For now, just ask (TBD: guidelines?)
  4. Run dbctl config commit -m "COMMIT_MESSAGE"
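The two dbctl commands in the steps above can be sketched as a single helper that fills in the placeholders. The function name dbctl_steps is hypothetical; the commands are copied from the steps above, and the helper only prints them, because the edit step is interactive and the template still has to be filled in by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the dbctl commands from the steps above with the
# placeholders filled in. Nothing is executed.
dbctl_steps() {
    dc="$1"        # replaces DCNAME: eqiad or codfw
    dbname="$2"    # replaces DBNAME
    message="$3"   # replaces COMMIT_MESSAGE
    case "$dc" in
        eqiad|codfw) ;;
        *) echo "DCNAME must be eqiad or codfw" >&2; return 1 ;;
    esac
    printf 'sudo dbctl --scope %s instance %s edit\n' "$dc" "$dbname"
    printf 'dbctl config commit -m "%s"\n' "$message"
}

# Example with the page's placeholders left in place.
dbctl_steps eqiad DBNAME "COMMIT_MESSAGE"
```

Rejecting any scope other than eqiad or codfw mirrors the restriction stated in step 2.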

Data population

Cloning

  1. Clone (TBD: How?)
  2. Start MariaDB
  3. Wait until replication has caught up (all green on Icinga)
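Besides the Icinga checks mentioned in step 3, replication lag can also be read from MariaDB itself. This is a sketch, not the team's documented procedure: the function name seconds_behind is hypothetical, and the usage line assumes a local client that is allowed to run SHOW SLAVE STATUS. The function extracts the Seconds_Behind_Master field from the vertical (\G) output.

```shell
#!/bin/sh
# Hypothetical helper: read the output of "SHOW SLAVE STATUS\G" on standard
# input and print the value of the Seconds_Behind_Master field.
seconds_behind() {
    # Fields look like "   Seconds_Behind_Master: 0"; split on ": ".
    awk -F': ' '$1 ~ /Seconds_Behind_Master$/ { print $2 }'
}

# Usage (assumption: local client access on the host):
#   sudo mysql -e 'SHOW SLAVE STATUS\G' | seconds_behind
```

A value of 0 suggests the replica has caught up; NULL means replication is not running, which needs investigating before the host is pooled.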

Restoring from backup

  1. Follow the instructions in MariaDB/Backups#Provision_a_precompressed_and_prepared_snapshot_(preferred)

Replication from master

Enabling notifications (example):

  1. In hieradata/hosts/DBNAME.yaml delete profile::base::notifications: disabled
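Removing the line named above can be sketched with sed. The function name enable_notifications is hypothetical. It writes the filtered file to standard output instead of editing in place, so the result can be compared with the original before committing the change.

```shell
#!/bin/sh
# Hypothetical helper: print a hiera host file without the line that disables
# notifications. The input file is left untouched.
enable_notifications() {
    # Delete only lines that consist exactly of the key and value named above.
    sed '/^profile::base::notifications: disabled$/d' "$1"
}

# Usage (path as given on this page):
#   enable_notifications hieradata/hosts/DBNAME.yaml
```

Anchoring the pattern at both ends avoids removing other keys that merely contain the same text.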

Next steps

  1. Ask other team members whether any schema changes have been executed since the timestamp of the backup the host was restored from, and apply them if needed
  2. Pool the host in (TBD: link needed)



This page is a part of the SRE Data Persistence technical documentation