From Wikitech-static
Revision as of 17:38, 5 May 2021 by imported>Razzi

Procedure to fold two partitions (the mysql and srv logical volumes on an-coord1001) into one:

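# Assumes MariaDB is stopped before its data directory is moved.
mkdir /srv/sqldata
mv /var/lib/mysql/* /srv/sqldata
umount /var/lib/mysql
umount /srv
# Remove the now-empty mysql logical volume and give its space to srv.
lvremove /dev/an-coord1001-vg/mysql
lvextend -l +100%FREE /dev/an-coord1001-vg/srv
# Grow the filesystem to fill the extended volume.
resize2fs /dev/an-coord1001-vg/srv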
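
Before moving the data directory as in the procedure above, it may be worth checking that the target filesystem has room for it. The following is only a sketch of such a pre-flight check and was not part of the original procedure; the function names (has_room, dir_used_kb, fs_avail_kb, can_move) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the original procedure.

# has_room USED AVAIL: succeed if USED is strictly less than AVAIL.
has_room() {
    [ "$1" -lt "$2" ]
}

# dir_used_kb DIR: disk usage of DIR in KiB.
dir_used_kb() {
    du -sk "$1" | awk '{print $1}'
}

# fs_avail_kb DIR: space available on the filesystem holding DIR, in KiB.
# df -P guarantees one output line per filesystem after the header.
fs_avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# can_move SRC DST: succeed if the contents of SRC fit on DST's filesystem.
can_move() {
    has_room "$(dir_used_kb "$1")" "$(fs_avail_kb "$2")"
}
```

Possible usage, with the paths from the procedure: can_move /var/lib/mysql /srv && echo "enough room on /srv"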

Also had to change the mysql data directory:

profile::analytics::database::meta::datadir: '/srv/sqldata'

Got a ping about "Mysql partition on an-coord1001 sudden change in growth rate since Apr 14th".

The issue was resolved, which is visible on:

Priority for today:

I had originally thought that we could do the upgrade with everything online, rather than scheduling a maintenance window with read-only safe mode and so on. I can see the benefit of safe mode for protecting against data loss, and there is always the chance that a reimage goes horribly wrong, but since all of this happens on a standby, we shouldn't have to take writes offline.
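
For reference, a minimal sketch of what the safe mode steps could look like, assuming "safe mode" here means HDFS NameNode safe mode. The hdfs dfsadmin -safemode and -saveNamespace subcommands are standard Hadoop; the wrapper function names and the HDFS variable are made up for illustration, and this is not the procedure that was actually used.

```shell
#!/bin/sh
# Hypothetical helpers around HDFS safe mode; assumes the hdfs CLI is on PATH
# and that the caller has HDFS superuser rights.
HDFS="${HDFS:-hdfs}"

# Succeed if any NameNode reports that safe mode is on.
safemode_is_on() {
    "$HDFS" dfsadmin -safemode get | grep -q 'Safe mode is ON'
}

# Enter safe mode (the namespace becomes read-only) and checkpoint the
# namespace to disk; saveNamespace requires safe mode to be on.
enter_safemode_and_checkpoint() {
    "$HDFS" dfsadmin -safemode enter &&
    "$HDFS" dfsadmin -saveNamespace
}

# Leave safe mode so that writes are accepted again.
leave_safemode() {
    "$HDFS" dfsadmin -safemode leave
}
```

While safe mode is on, clients can still read, but writes to HDFS are rejected, which is why it implies a maintenance window.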

What would happen if we took a snapshot, data kept getting written, and then we had to restore to the snapshot? There would be some unreferenceable data on the workers, but what data would actually be lost?

In safe mode, what would

Data builds up on Kafka

Need to understand all the data that flows into HDFS

How to drain the cluster?


Created a Kerberos principal for a user; it was as easy as running create and adding krb: present to data.yaml: