Analytics/Systems/Cluster/Mysql Meta

The Analytics Mysql Meta instance is a MariaDB database currently running on the Analytics Hadoop coordinator host (an-coord1001). It hosts some low-volume db schemas related to Hadoop tools like Hive, Oozie, Druid, etc.

Databases

MariaDB [(none)]> show databases;
+---------------------+
| Database            |
+---------------------+
| druid               |
| druid_public_eqiad  |
| hive_metastore      |
| hue                 |
| oozie               |
| search_airflow      |
| superset_production |
| superset_staging    |
+---------------------+

Dependent services

The MariaDB instance collects metadata about the following systems: Hive (via the hive_metastore db), Oozie, Druid, Hue, Superset, and the Search team's Airflow.

Maintenance

Please remember that any maintenance to the database will need to take into consideration that we have replication to db1108.

This is a mistake that I made after replication was set up the first time:

  • created a new db called hue_next (I was testing a new version of Hue)
  • granted some privileges on hue_next to the hue user
  • replication broke on db1108 and Icinga alarms fired

The main problem was that at the time we replicated only the databases to db1108, not the users (since we only needed the replication user and not the rest), so when the grant command was replicated, it failed. The fix was easy, basically adding the missing users on db1108, but please keep this in mind when operating on an-coord1001.
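
As a rough, hedged sketch of what this failure mode and the fix look like on db1108 (the socket path is the analytics_meta one mentioned in the Failover section below; the user/host spec is illustrative, not the real grant):

# On db1108: replication stops and SHOW SLAVE STATUS reports the failed GRANT in Last_SQL_Error.
sudo mysql -S /run/mysqld/mysqld.analytics_meta.sock -e "SHOW SLAVE STATUS\G" | grep -iE 'running|error'

# Fix: create the missing user on the replica, then let replication retry the statement.
sudo mysql -S /run/mysqld/mysqld.analytics_meta.sock <<'EOF'
CREATE USER 'hue'@'an-coord1001.eqiad.wmnet';
START SLAVE;
EOF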

Backup

There are currently two backup workflows running (a quick way to check both is sketched after the list):

  • All the databases are periodically backed up in /srv/backup/mysql/analytics-meta on the Analytics Hadoop Master Standby (an-master1002). Daily snapshots are backed up into HDFS at /wmf/data/archive/backup/mysql/analytics-meta/.
  • All the above databases are replicated to db1108 (see Analytics/Systems/DB_Replica). Dumps of db1108 are taken periodically on the db provisioning/recovery hosts (dbprov1002) and stored long term in Bacula.
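
A quick, hedged way to check that both workflows are producing fresh artifacts (paths are the ones listed above; an-master1002's FQDN is assumed to follow the usual eqiad.wmnet pattern, and the dbprov host is dbprov1003 at the time of writing, per the restore walkthrough below):

# Local snapshots on the Hadoop master standby.
ssh an-master1002.eqiad.wmnet 'ls -lt /srv/backup/mysql/analytics-meta | head'

# Daily snapshots archived into HDFS (run from a host with an HDFS client).
hdfs dfs -ls /wmf/data/archive/backup/mysql/analytics-meta/ | tail

# Latest dumps taken from db1108 on the dbprov host.
ssh dbprov1003.eqiad.wmnet 'sudo ls /srv/backups/dumps/latest'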

Restore a Backup

There are multiple ways to restore data:

Let's say that we have a database on analytics-meta that got modified in an unexpected way, for example the superset_production db was upgraded to a new version of Superset that ended up not working. We cannot rely on the db copy stored on db1108 (the replica), since it gets updated via MariaDB replication a few seconds after its master.

Check on dbprov100x when the last analytics-meta dump was taken

In puppet you should find the dbprov host hosting the analytics-meta dumps. Currently it is dbprov1003:

elukey@dbprov1003:~$ sudo ls /srv/backups/dumps/latest
dump.analytics_meta.2020-10-06--02-08-27

elukey@dbprov1003:~$ sudo cat /srv/backups/dumps/latest/dump.analytics_meta.2020-10-06--02-08-27/metadata
Started dump at: 2020-10-06 02:08:27
SHOW MASTER STATUS:
	Log: db1108-bin.000150
	Pos: 685545104
	GTID:0-171971944-198484076,171974727-171974727-7482

SHOW SLAVE STATUS:
	Connection name:
	Host: an-coord1001.eqiad.wmnet
	Log: analytics-meta-bin.017481
	Pos: 10114597
	GTID:0-171971944-198484076,171974727-171974727-7482

We have some good info from the metadata file: the name of the binlog file and the position of the last transaction recorded in the dump. The next step is to find all the transactions that happened between when the dump was taken and the last known good state of the database.

Make a list of binlogs that were created AFTER the binlog registered in the metadata file

In this case, on db1108 we have to check all the binlogs in MariaDB's data dir after db1108-bin.000150:

elukey@db1108:/srv/sqldata.analytics_meta$ ls -lht db1108-bin.[0-9]*
-rw-rw---- 1 mysql mysql  969M Oct 12 14:51 db1108-bin.000160
-rw-rw---- 1 mysql mysql 1001M Oct 12 00:52 db1108-bin.000159
-rw-rw---- 1 mysql mysql 1001M Oct 11 08:59 db1108-bin.000158
-rw-rw---- 1 mysql mysql 1001M Oct 10 17:12 db1108-bin.000157
-rw-rw---- 1 mysql mysql 1001M Oct 10 01:00 db1108-bin.000156
-rw-rw---- 1 mysql mysql 1001M Oct  9 10:37 db1108-bin.000155
-rw-rw---- 1 mysql mysql 1001M Oct  8 21:15 db1108-bin.000154
-rw-rw---- 1 mysql mysql 1001M Oct  8 06:24 db1108-bin.000153
-rw-rw---- 1 mysql mysql 1001M Oct  7 14:53 db1108-bin.000152
-rw-rw---- 1 mysql mysql 1001M Oct  6 23:27 db1108-bin.000151
-rw-rw---- 1 mysql mysql 1001M Oct  6 07:35 db1108-bin.000150
-rw-rw---- 1 mysql mysql 1001M Oct  5 16:17 db1108-bin.000149
-rw-rw---- 1 mysql mysql 1001M Oct  5 01:12 db1108-bin.000148
-rw-rw---- 1 mysql mysql 1001M Oct  4 09:57 db1108-bin.000147
-rw-rw---- 1 mysql mysql 1001M Oct  3 18:55 db1108-bin.000146
-rw-rw---- 1 mysql mysql 1001M Oct  3 04:53 db1108-bin.000145
[..more..]

From the above ls, it looks like we have 11 binlogs to check, from 000150 to 000160.

Find the last good transaction from the binlog

We should now find the last transaction for the superset_production db before the problem occurred (basically the last known good state). A quick way to spot the right file is to start from the latest binlogs, dump their content, and look for the first one containing ALTER TABLE statements.

elukey@db1108:/srv/sqldata.analytics_meta$ sudo mysqlbinlog --base64-output=decode-rows --database superset_production db1108-bin.000160 | grep -i alter
ALTER TABLE dashboard_email_schedules ADD COLUMN slack_channel TEXT
ALTER TABLE slice_email_schedules ADD COLUMN slack_channel TEXT
ALTER TABLE row_level_security_filters DROP FOREIGN KEY row_level_security_filters_ibfk_3
ALTER TABLE row_level_security_filters DROP COLUMN table_id
ALTER TABLE query ADD COLUMN ctas_method VARCHAR(16)
ALTER TABLE dbs ADD COLUMN allow_cvas BOOL
ALTER TABLE dbs ADD CHECK (allow_cvas IN (0, 1))
ALTER TABLE dbs DROP COLUMN perm

You should be able to find a position (end_log_pos) in the binlog right before the alters; record it and proceed with the next step.
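
A hedged sketch of both steps, using the file names from this example (the grep patterns are just a convenience for eyeballing the output):

# Count ALTER statements against superset_production in each candidate binlog,
# from the one referenced by the dump metadata (000150) up to the latest (000160).
for f in db1108-bin.0001{50..60}; do
    printf '%s: ' "$f"
    sudo mysqlbinlog --base64-output=decode-rows --database superset_production "$f" | grep -ci 'alter table'
done

# In the first binlog that contains ALTERs, list event headers (they carry end_log_pos
# and a timestamp) alongside the ALTER statements, and note the end_log_pos of the
# last transaction before the first ALTER.
sudo mysqlbinlog --base64-output=decode-rows --database superset_production db1108-bin.000160 | grep -iE 'end_log_pos|alter table' | less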

Use mysqlbinlog to create a dedicated recovery file

The --start-position is the dump position from the metadata file (it applies to the first binlog passed on the command line), while --stop-position is the end_log_pos found in the previous step (it applies to the last one):

sudo mysqlbinlog --database superset_production --start-position 685545104 --stop-position 70620843 db1108-bin.000150 db1108-bin.000151 ... db1108-bin.000160 >> recovery_binlog_superset_production

Recover the database

At this point you should be able to use the guide outlined earlier to restore the last dump of the database in question, and then apply the recovery binlog simply with cat recovery_binlog_superset_production | sudo mysql -D superset_production

Failover

This procedure should be used only if an-coord1001 is not available for some reason, for example an unrecoverable hardware failure. There is no automatic failover, only a manual one (described by this procedure). The high level idea is the following:

  • Check what daemons are running on an-coord1001. The Hive metastore and Oozie probably need to be moved elsewhere; check which host/VM can hold them temporarily.
  • SSH to db1108 and stop replication on the Analytics Meta mysql instance (sudo mysql -S /run/mysqld/mysqld.analytics_meta.sock, then STOP SLAVE;), as sketched after this list.
  • Stop Bacula backups for db1108's mysql instance, since we found out in the past that the backup locking overhead causes timeouts and problems for daemons connecting to the Analytics Meta DB.
  • Set db1108's meta replica to read/write in puppet (check mariadb::config in profile::mariadb::misc::analytics::multiinstance).
  • Look for all occurrences of an-coord1001 in puppet and figure out the ones that need to be moved to db1108. Keep in mind that the port of the mysql instance on db1108 is not 3306, so replacing an-coord1001.eqiad.wmnet with db1108.eqiad.wmnet in puppet is not enough. A high level list is:
    • hive metastore mysql configuration (hive-site.xml)
    • oozie mysql configuration (oozie-site.xml)
    • druid clusters configuration (druid common config, in hiera)
    • superset configs (superset_config.py)
    • hue config for oozie and hive
    • airflow mysql config for Search (alert them when this happens to warn about airflow working in degraded mode).
  • Merge the above change and run puppet across the hosts; you'll likely need to roll restart the impacted daemons.
  • At this point, service should be restored.
  • Important: Bacula backups happen weekly and they may interfere with daemons using the database, due to locking etc. In order to temporarily disable backups, comment the lines related to analytics meta in https://gerrit.wikimedia.org/r/c/operations/puppet/+/617650/1/modules/profile/templates/mariadb/backup_config/dbprov1002.cnf.erb
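
A hedged sketch of the db1108 side of the procedure (stop replication, then verify the instance accepts writes once the puppet change has rolled out); the socket path is the one mentioned above:

# Stop replication on the analytics_meta instance and check the thread status.
sudo mysql -S /run/mysqld/mysqld.analytics_meta.sock -e "STOP SLAVE; SHOW SLAVE STATUS\G" | grep -i running

# After the puppet change making the replica read/write has been applied,
# confirm the instance is actually writable (0 means read_only is off).
sudo mysql -S /run/mysqld/mysqld.analytics_meta.sock -e "SELECT @@read_only;"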

Failback

This procedure should be used to restore service after a failover from an-coord1001 to db1108 has happened. Since the dbs on an-coord1001 need to be rebuilt, this will require some downtime for all services.

The high level idea is the following:

  • Set up some downtime for all the daemons using the analytics meta instance (see the Failover section).
  • Dump the databases on db1108, copy them to an-coord1001 and restore them (see Analytics/Systems/DB_Replica for info about how to do it); a rough sketch of the dump step follows this list. You will probably need to drop the existing databases/data before doing so.
  • Move all daemons to an-coord1001 (read the Failover procedure for more info).
  • Set db1108's instance to read-only and set up replication again.
  • Re-enable Bacula backups.
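
As a very rough, hedged sketch of the dump step above (the DB_Replica page is the authoritative procedure; the mysqldump invocation below is an assumption, not the documented workflow):

# On db1108: dump the analytics-meta databases (list taken from the Databases section)
# into one file that can be copied to an-coord1001 and restored there.
sudo mysqldump -S /run/mysqld/mysqld.analytics_meta.sock --single-transaction \
    --databases druid druid_public_eqiad hive_metastore hue oozie search_airflow \
    superset_production superset_staging > analytics_meta_failback.sql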