MariaDB/Upgrading a section

Revision as of 09:41, 9 June 2021

Order of upgrades

  • Upgrade clouddb* hosts.
  • Upgrade Sanitarium hosts in both DCs
  • Upgrade Sanitarium primaries in both DCs and ensure the Sanitarium host replicates from the 10.4 primary in the active DC
  • Upgrade the candidate master in the standby DC
  • Upgrade the backup source in the standby DC (coordinate with Jaime)
  • Upgrade the master in the standby DC
  • Upgrade the candidate master in the primary DC
  • Upgrade the backup source in the primary DC (coordinate with Jaime)
  • Switch over the primary host in the primary DC to a Buster+10.4 host
  • Upgrade the old primary and make it a candidate primary

Upgrade procedure

  • Patch the dhcp file: [example]
  • Run puppet on install1003 and install2003
  • Depool the host (if needed) using software/dbtools/depool-and-wait
  • Silence the host in Icinga
  • Stop MySQL on the host
  • Run umount /srv; swapoff -a
  • Run reimage: sudo -E wmf-auto-reimage xxxx.wmnet -p TXXXXXX
  • Wait until the host is up
  • Run systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
  • Run systemctl start mariadb ; mysql_upgrade
  • Run systemctl restart prometheus-mysqld-exporter.service
  • Drop the host from Tendril and re-add it; otherwise its metrics will not be updated in Tendril
  • Check all the tables before starting replication (this can take up to 24h depending on the section)
    • In a screen run: mysqlcheck --all-databases
    • If any corruption is discovered, fix it with the following: journalctl -xe -u mariadb | grep table | grep Flagged | awk -F "table" '{print $2}' | awk -F " " '{print $1}' | tr -d "\`" | uniq >> /root/to_fix ; for i in `cat /root/to_fix`; do echo $i; mysql -e "set session sql_log_bin=0; alter table $i engine=InnoDB, force"; done
  • Start the replica
  • Wait until the host has caught up with replication
  • Repool the host.
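
The corruption-fix one-liner in the table check step is hard to read and to review before running. The following is a minimal sketch of the same pipeline split into two shell functions, not a replacement for the documented command. The function names and the DRY_RUN variable are hypothetical, introduced only for illustration. It also assumes that swapping uniq for sort -u is acceptable, so that duplicates which are not adjacent in the journal are collapsed as well.

```shell
#!/bin/sh
# Sketch only: same filters as the one-liner above, split into functions.
# Function names and DRY_RUN are illustrative assumptions, not existing tooling.

# Read journal lines on stdin and print each flagged table name once.
# sort -u is used instead of uniq so non-adjacent duplicates are collapsed too.
extract_flagged_tables() {
    grep table | grep Flagged \
        | awk -F "table" '{print $2}' \
        | awk -F " " '{print $1}' \
        | tr -d '`' \
        | sort -u
}

# Read table names on stdin and rebuild each one with binary logging disabled
# for the session, so the rebuild is not replicated.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
rebuild_tables() {
    local table sql
    while IFS= read -r table; do
        [ -n "$table" ] || continue
        sql="set session sql_log_bin=0; alter table ${table} engine=InnoDB, force"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "mysql -e \"${sql}\""
        else
            mysql -e "${sql}"
        fi
    done
}

# Possible usage on the upgraded host:
#   journalctl -xe -u mariadb | extract_flagged_tables > /root/to_fix
#   rebuild_tables < /root/to_fix            # review the printed statements
#   DRY_RUN=0 rebuild_tables < /root/to_fix  # actually rebuild the tables
```

Printing the statements first gives a chance to check the extracted table names before any ALTER TABLE runs, which can matter on large sections where a rebuild takes a long time.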

This page is a part of the SRE Data Persistence technical documentation
(go here for a list of all our pages)