User:Razzi/an-master reimaging


https://phabricator.wikimedia.org/T278423

https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration#High_Availability

razzi@an-master1002:~$ sudo -u hdfs /usr/bin/hdfs haadmin -getServiceState an-master1001-eqiad-wmnet
active

looks good
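
For symmetry, the standby can be checked the same way. Assuming the other HA service id follows the same naming (an-master1002-eqiad-wmnet; not verified here), it should report standby:

sudo -u hdfs /usr/bin/hdfs haadmin -getServiceState an-master1002-eqiad-wmnet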

In terms of what to do when reimaging, I will refer to cookbooks/sre/hadoop/roll-restart-masters.py

       logger.info("Restarting Yarn Resourcemanager on Master.")
       hadoop_master.run_sync('systemctl restart hadoop-yarn-resourcemanager')

ok, so I can `systemctl stop hadoop-yarn-resourcemanager` on the standby

       logger.info("Restart HDFS Namenode on the master.")
       hadoop_master.run_async(
           'systemctl restart hadoop-hdfs-zkfc',
           'systemctl restart hadoop-hdfs-namenode')

also:

systemctl stop hadoop-hdfs-zkfc
systemctl stop hadoop-hdfs-namenode

It's similar to the comment here: https://phabricator.wikimedia.org/T265126#7008232

One more service:

       logger.info("Restart MapReduce historyserver on the master.")
       hadoop_master.run_sync('systemctl restart hadoop-mapreduce-historyserver')

so it's a good idea to `systemctl stop hadoop-mapreduce-historyserver` on the active.
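
Putting those together, a rough stop sequence for the node about to be reimaged (a sketch assembled from the cookbook excerpts above; the historyserver line only applies on the host that actually runs it):

systemctl stop hadoop-yarn-resourcemanager
systemctl stop hadoop-hdfs-zkfc
systemctl stop hadoop-hdfs-namenode
systemctl stop hadoop-mapreduce-historyserver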

Risk: the active fails while the standby is down

Mitigation: keep the downtime minimal, and check the active's Grafana stats before doing anything
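
Concretely, a couple of pre-checks before stopping anything on the standby (plain Hadoop admin commands, mirroring the haadmin check above; dfsadmin -report is just a datanode / capacity summary):

sudo -u hdfs /usr/bin/hdfs haadmin -getServiceState an-master1001-eqiad-wmnet
sudo -u hdfs /usr/bin/hdfs dfsadmin -report | head -n 20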


Current an-master disk configuration:

razzi@an-master1001:~$ lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                            8:0    0 223.6G  0 disk
├─sda1                         8:1    0  46.6G  0 part
│ └─md0                        9:0    0  46.5G  0 raid1 /
├─sda2                         8:2    0   954M  0 part
│ └─md1                        9:1    0 953.4M  0 raid1 [SWAP]
└─sda3                         8:3    0 176.1G  0 part
  └─md2                        9:2    0   176G  0 raid1
   └─an--master1001--vg-srv 253:0    0   176G  0 lvm   /srv
sdb                            8:16   0 223.6G  0 disk
├─sdb1                         8:17   0  46.6G  0 part
│ └─md0                        9:0    0  46.5G  0 raid1 /
├─sdb2                         8:18   0   954M  0 part
│ └─md1                        9:1    0 953.4M  0 raid1 [SWAP]
└─sdb3                         8:19   0 176.1G  0 part
  └─md2                        9:2    0   176G  0 raid1
   └─an--master1001--vg-srv 253:0    0   176G  0 lvm   /srv
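
To cross-check the software RAID arrays and the LVM volume behind that layout before touching partman, the standard mdadm / LVM views (nothing WMF-specific):

cat /proc/mdstat
sudo pvs
sudo vgs
sudo lvs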

versus modules/install_server/files/autoinstall/partman/reuse-raid1-2dev.cfg:

# this workarounds LP #1012629 / Debian #666974
# it makes grub-installer to jump to step 2, where it uses bootdev
d-i    grub-installer/only_debian      boolean         false
d-i    grub-installer/bootdev  string  /dev/sda /dev/sdb

d-i    partman/reuse_partitions_recipe         string \
                /dev/sda|1 biosboot ignore none|2 raid ignore none, \
                /dev/sdb|1 biosboot ignore none|2 raid ignore none, \
                /dev/mapper/*-root|1 ext4 format /, \
                /dev/mapper/*-srv|1 ext4 keep /srv

So we'll want to set an-master1002 to reuse-parts-test.cfg first to confirm the reuse recipe does what we expect

High level plan:

- check that everything is healthy: nodes on Grafana, ensure active / standby is what we expect

- merge a patch to set an-master1002 to reuse-parts-test.cfg with a custom partman/custom/reuse-analytics-hadoop-master.cfg. linux-host-entries.ttyS1-115200 already does not have a pxeboot entry, so it will use buster upon reimaging

^- Do we want to add logical volumes for swap and root?

- stop the hadoop daemons on an-master1002, downtime the node, disable puppet

- run the reimage, wait for the node to come back online, ensure things are healthy

- failover to the newly-reimaged node (see the failover sketch after this list), ensure things are still working

- repeat the steps to stop, update, and reimage an-master1001
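
For the failover step above, a sketch using the stock HDFS HA admin command (the service ids are assumed to match the -getServiceState check at the top; double-check against the Administration page before running):

sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1001-eqiad-wmnet an-master1002-eqiad-wmnet
sudo -u hdfs /usr/bin/hdfs haadmin -getServiceState an-master1002-eqiad-wmnet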

Ok, solved the mystery of sda1 / sdb1 on an-test-master1001: they are BIOS boot partitions.

razzi@an-test-master1001:~$ lsblk
NAME           MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda              8:0    0 447.1G  0 disk
├─sda1           8:1    0   285M  0 part
└─sda2           8:2    0 446.9G  0 part
  └─md0          9:0    0 446.7G  0 raid1
    ├─vg0-root 253:0    0  74.5G  0 lvm   /
    ├─vg0-swap 253:1    0   976M  0 lvm   [SWAP]
    └─vg0-srv  253:2    0 371.3G  0 lvm   /srv
sdb              8:16   0 447.1G  0 disk
├─sdb1           8:17   0   285M  0 part
└─sdb2           8:18   0 446.9G  0 part
  └─md0          9:0    0 446.7G  0 raid1
    ├─vg0-root 253:0    0  74.5G  0 lvm   /
    ├─vg0-swap 253:1    0   976M  0 lvm   [SWAP]
    └─vg0-srv  253:2    0 371.3G  0 lvm   /srv
razzi@an-test-master1001:~$ sudo fdisk -l
Disk /dev/sdb: 447.1 GiB, 480103981056 bytes, 937703088 sectors
Disk model: MZ7LH480HAHQ0D3
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: C5C638C5-0FFB-4A5C-A4C5-53A22860E315

Device      Start       End   Sectors   Size Type
/dev/sdb1    2048    585727    583680   285M BIOS boot
/dev/sdb2  585728 937701375 937115648 446.9G Linux RAID
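
If there were any doubt, the GPT type can also be read directly with lsblk (the BIOS boot partition type GUID is 21686148-6449-6e6f-744e-656564454649):

lsblk -o NAME,PARTTYPE /dev/sda /dev/sdb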