Ganeti is cluster virtual-server management software built on top of existing virtualization technologies such as Xen or KVM, plus other open-source components. It supports both KVM and Xen as hypervisors; at WMF only KVM is enabled. The primary Ganeti web page is http://www.ganeti.org/.
At WMF, Ganeti is used as the cluster management tool for a private VPS installation. After an evaluation of OpenStack vs. Ganeti, Ganeti was chosen as the better fit for the job at hand.
Ganeti is architected as a shared-nothing cluster with job management. A single master node receives all jobs to be executed (create a VM, delete a VM, stop/start VMs, etc.); in case of hardware failure, the master role can be moved to one of a preconfigured number of master candidates, so cluster operations have no single point of failure. For VM operations, provided the DRBD backend is used (as it is at WMF), VMs can be restarted with minimal disruption on their secondary (backup) node even after a catastrophic failure of a hardware node.
A high-level overview of the architecture is at http://docs.ganeti.org/ganeti/2.12/html/_images/graphviz-246e5775f608681df9f62dbbe0a5d4120dc75f1c.png and more discussion of it is at http://docs.ganeti.org/ganeti/2.12/html/design-2.0.html
A cluster is identified by:
- The nodes
- An FQDN (e.g. ganeti01.svc.eqiad.wmnet), which corresponds to an IPv4 address. That IPv4 address is "floating": it is always owned by the current master.
Administration always happens via the master. It is the only node where all commands can be run, and it hosts the API. Failover of the master is easy but manual; see below for how to do it.
Connect to a cluster
Just ssh to its FQDN
Init the cluster
An example of initializing a new cluster:
sudo gnt-cluster init \
    --no-ssh-init \
    --enabled-hypervisors=kvm \
    --vg-name=ganeti \
    --master-netdev=br0 \
    --hypervisor-parameters kvm:kvm_path=/usr/bin/qemu-system-x86_64,kvm_flag=enabled,serial_speed=115200,migration_bandwidth=64,kernel_path= \
    --nic-parameters=link=br0 \
    ganeti01.svc.codfw.wmnet
The above is how our clusters are currently configured.
Modify the cluster
Modifying the cluster to change defaults, hypervisor parameters, limits, the security model, etc. is possible. An example of modifying the cluster is given below.
sudo gnt-cluster modify -H kvm:kvm_path=/usr/bin/qemu-system-x86_64,kvm_flag=enabled,kernel_path=
To get an idea of what is actually modifiable, run:
sudo gnt-cluster info
and then look up the various options in the Ganeti documentation.
Destroy the cluster
Destroying the cluster is a one way street. Do not do it lightly. An example of destroying a cluster:
sudo gnt-cluster destroy --yes-do-it
Do note that various things will be left behind. For example, /var/lib/ganeti/queue/ will not be deleted; whether to delete it is up to you, depending on the case.
Add a node
Adding a new hardware node to the cluster to increase capacity
sudo gnt-node add ganeti1002.eqiad.wmnet
Listing cluster nodes
Listing all hardware nodes in a cluster:
sudo gnt-node list
That should return something like:
Node                   DTotal DFree  MTotal MNode MFree Pinst Sinst
ganeti1001.eqiad.wmnet 427.9G 427.9G 63.0G  391M  62.4G 0     0
ganeti1002.eqiad.wmnet 427.9G 427.9G 63.0G  289M  62.5G 0     0
ganeti1003.eqiad.wmnet 427.9G 427.9G 63.0G  288M  62.5G 0     0
ganeti1004.eqiad.wmnet 427.9G 427.9G 63.0G  288M  62.5G 0     0
The columns are, respectively: Disk Total, Disk Free, Memory Total, Memory used by the node itself, Memory Free, instances for which this node is primary, and instances for which this node is secondary.
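As a quick sketch, that listing can be post-processed with standard tools, e.g. to flag nodes that are low on free memory. The sample data below is invented for illustration (ganeti1003 is given artificially low MFree); on a real cluster you would pipe sudo gnt-node list into the awk instead:

```shell
# Print nodes whose free memory (column 6, MFree) is below 10G.
# Real usage (assumption):  sudo gnt-node list | awk '...'
low_nodes=$(awk 'NR > 1 {
    v = $6
    if (v ~ /M$/) { sub(/M$/, "", v); v /= 1024 }   # megabytes -> gigabytes
    else          { sub(/G$/, "", v) }
    if (v + 0 < 10) print $1
}' <<'EOF'
Node                   DTotal DFree  MTotal MNode MFree Pinst Sinst
ganeti1001.eqiad.wmnet 427.9G 427.9G 63.0G  391M  62.4G 0     0
ganeti1003.eqiad.wmnet 427.9G 427.9G 63.0G  288M  900M  14    8
EOF
)
echo "$low_nodes"
```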
Detecting the master node
The master node can be queried by running
sudo gnt-node list -o name,master_candidate,master
View the job queue
Ganeti has a built-in job queue. Most of the time it works fine, but if something is taking too long it can be helpful to check what is going on in the queue:
sudo gnt-job list
Then, after getting a job ID from the result:
sudo gnt-job info <job_id>
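A sketch of filtering that queue for jobs that are still pending. The job IDs and summaries below are made up; on a real cluster you would pipe sudo gnt-job list into the awk:

```shell
# Print the IDs of jobs that are still queued, waiting, or running.
pending=$(awk 'NR > 1 && ($2 == "running" || $2 == "waiting" || $2 == "queued") { print $1 }' <<'EOF'
ID  Status  Summary
183 success INSTANCE_SHUTDOWN(foo.eqiad.wmnet)
184 running CLUSTER_VERIFY
185 waiting INSTANCE_STARTUP(bar.eqiad.wmnet)
EOF
)
echo "$pending"
```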
Hardware/software upgrades on a Ganeti cluster can happen with zero downtime to VM operations. The procedure is outlined below; in case a shutdown/reboot is needed, the procedure to empty the node is also described.
Do the software upgrade (if needed) throughout the cluster. It should have zero repercussions for any VM; barring a Ganeti bug in the upgraded version, the cluster itself should also have zero problems.
Doing a rolling reboot of the cluster is easy: empty each node, reboot it, check that it is back online, and proceed to the next. The one thing to take care of is to never reboot the master without failing it over first.
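That loop can be sketched as below. The node names are examples, and every action is echoed so the sketch is a harmless dry run; drop the echo (and add a check that each node came back) to do it for real, after failing over the master if it is in the list:

```shell
# Dry-run sketch of a rolling reboot: drain each node, then reboot it.
plan=$(for node in ganeti1002.eqiad.wmnet ganeti1003.eqiad.wmnet; do
    echo "sudo gnt-node migrate -f $node"   # empty the node of primary instances
    echo "ssh $node sudo reboot"            # reboot the drained node
done)
echo "$plan"
```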
Failover the master
Choose a master candidate that suits you. You can list the master candidates with:
sudo gnt-node list -o name,master_candidate
Then, on the chosen candidate node, run:
sudo gnt-cluster master-failover
The cluster IP will now be served by the new node and the old one is no longer the master.
There may be times when the cluster looks, or actually is, unbalanced; that will be true after a rolling reboot of the nodes, for example. Rebalancing is easy and baked into Ganeti; all it takes is running one command:
sudo hbal -L -X
Please run it in a screen session, as it can take quite a while to finish. The jobs are submitted to the job queue either way, so losing the session is not fatal, but screen is still prudent.
The cluster will calculate a current score, run heuristic algorithms to try to minimize that score, and then execute the commands required to reach the resulting state.
Reboot/shutdown a node for maintenance
Select the node that needs a reboot or a brief shutdown for hardware maintenance and empty it of primary instances:
sudo gnt-node migrate -f ganeti1004.eqiad.wmnet
sudo gnt-node list
should now report 0 primary instances for the node. It is then safe to reboot it or shut it down for a brief amount of time for hardware maintenance.
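The "is it empty?" check can be scripted. The sample listing below is invented; on a real cluster the commented gnt-node invocation (pinst_cnt is a real list field) would replace the here-document:

```shell
# Assert a node has no primary instances before rebooting it.
# Real usage (assumption): sudo gnt-node list -o name,pinst_cnt --no-headers
node=ganeti1004.eqiad.wmnet
pinst=$(awk -v n="$node" '$1 == n { print $2 }' <<'EOF'
ganeti1001.eqiad.wmnet 3
ganeti1004.eqiad.wmnet 0
EOF
)
if [ "$pinst" = "0" ]; then
    echo "$node is empty, safe to reboot"
else
    echo "$node still has $pinst primary instances" >&2
fi
```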
Shutdown a node for a prolonged period of time
Should the node be going down for an undetermined amount of time, also move the secondary instances:
sudo gnt-node migrate -f <node_fqdn>
sudo gnt-node evacuate -s <node_fqdn>
The second command moves DRBD pairs around and syncs disk data. It is bound to take a long time, so find something else to do in the meanwhile.
sudo gnt-node list
should now report 0 for both primary and secondary instances. Before powering off the node, we need to remove it from the cluster as well:
sudo gnt-node remove <node_fqdn>
NOTE: Do not forget to re-add it after it is fixed (if it ever is):
sudo gnt-node add <node_fqdn>
Create a VM
Creating a VM is easy. Most of the steps are the same as for production so keep in mind the regular process as well.
Assign a hostname/IP
Same process as for hardware. Assign the IP/hostname and make sure DNS changes are live before going forward.
Create the VM (private IP)
gnt-instance add \
    -t drbd \
    -I hail \
    --net 0:link=br0 \
    --hypervisor-parameters=kvm:boot_order=network \
    -o debootstrap+default \
    --no-install \
    -B vcpus=<x>,memory=<y>g \
    --disk 0:size=<z>g \
    <fqdn>
Note that the VM will NOT be started; that is on purpose for now. x, y, z above are variables; the suffixes t, g, m denote terabytes, gigabytes, and megabytes respectively.
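As a concrete illustration of the template above: a hypothetical VM with 2 VCPUs, 4 GB of RAM, and a 40 GB disk. The hostname is made up, and the command is echoed so this is a dry run; drop the echo to actually submit it:

```shell
# Dry run: print the fully substituted creation command.
cmd=$(echo gnt-instance add \
    -t drbd \
    -I hail \
    --net 0:link=br0 \
    --hypervisor-parameters=kvm:boot_order=network \
    -o debootstrap+default \
    --no-install \
    -B vcpus=2,memory=4g \
    --disk 0:size=40g \
    testvm1001.eqiad.wmnet)
echo "$cmd"
```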
Create the VM (public IP)
gnt-instance add \
    -t drbd \
    -I hail \
    --net 0:link=vlan<VLAN_ID> \
    --hypervisor-parameters=kvm:boot_order=network \
    -o debootstrap+default \
    --no-install \
    -B vcpus=<x>,memory=<y>g \
    --disk 0:size=<z>g \
    <fqdn>
Note that the only difference between a public and a private IP is the <VLAN_ID> required; this is the VLAN ID used by the network. Here are multiple ways to find it out:
- The easiest (USE THIS FOR NOW): use the Ganeti nodes themselves. In /etc/network/interfaces, all supported VLANs apart from the very basic one (the one the Ganeti hosts themselves are installed into) appear in the form vlan<VLAN_ID>. For now it is just one, as we don't yet support cross-row.
- Use the switches: this requires knowing just a bit more about the network. Figure out the row the Ganeti hosts are racked in, log in to the corresponding switch, and get the VLAN ID with show vlans <id_name>.
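A sketch of the first method: the /etc/network/interfaces fragment below is invented, but the same grep works on the real file on a Ganeti node:

```shell
# Extract configured VLAN IDs from an interfaces file.
# Real usage (assumption): grep -o 'vlan[0-9][0-9]*' /etc/network/interfaces | sort -u
vlans=$(grep -o 'vlan[0-9][0-9]*' <<'EOF' | sort -u
auto br0
iface br0 inet manual
    bridge_ports eno1
auto vlan1017
iface vlan1017 inet manual
    vlan-raw-device eno1
EOF
)
echo "$vlans"
```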
Get the MAC address of the NIC
gnt-instance info <fqdn> | grep MAC
Add the MAC address to DHCP
Same as usual, but use linux-host-entries.ttyS0-115200 for Ganeti VMs; otherwise you will not get a console.
Update autoinstall files
Same as usual. Do however add virtual.cfg to the configuration for a specific VM. Example: https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/install-server/files/autoinstall/netboot.cfg;601720be51228f7eae3de17988b1afa8881a5bdb$71
Start the VM
gnt-instance start <fqdn>
and connect to the console
gnt-instance console <fqdn>
Ctrl+] to leave the console
Set boot order to disk
WARNING: Failing to do this will leave the VM stuck in an endless reboot, install, reboot loop.
Once the installation is going well, but before it finishes, you need to set the boot order back to disk. This is a limitation of the current version of Ganeti and is expected to be solved (upstream is aware):
gnt-instance modify \
    --hypervisor-parameters=boot_order=disk \
    <fqdn>
Note: when the VM has finished installing, it will shut down automatically. Ganeti includes HA checks and will promptly restart it; we rely on this behaviour to get the VM successfully installed. If you list the VMs during this phase you will see the VM in ERROR_down. Don't worry, this is expected.
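The wait for that install-shutdown-restart cycle can be scripted as a simple poll. vm_is_running below is a stand-in that pretends the VM comes up on the third check; on a real cluster you would instead test the status reported by gnt-instance list (an assumption about how you would wire it up):

```shell
# Poll until the VM reports running; the stand-in succeeds on the 3rd check.
attempts=0
vm_is_running() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]    # real check would inspect gnt-instance list output
}
until vm_is_running; do
    :   # on a real cluster: sleep 10
done
echo "VM came up after $attempts checks"
```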
Assign a role to the VM in Puppet
Delete a VM
Irrevocably deleting a VM is done via:
gnt-instance remove <fqdn>
Please remember to clean up DHCP/DNS entries afterwards
Shutdown/startup a VM
gnt-instance startup <fqdn>
gnt-instance shutdown <fqdn>
Note: the shutdown command uses ACPI to achieve a graceful shutdown of the VM. There is a 2-minute timeout, however, after which the VM will be forcefully shut down. If you prefer not to wait those 2 minutes, --timeout can be used like so:
gnt-instance shutdown --timeout 0 <fqdn>
Get a console for a VM
You can log into the "console" of a Ganeti instance via:
gnt-instance console <fqdn>
The console can be left with "ctrl + ]"
Resize a VM
First make sure the cluster has adequate space for whatever resource you want to increase (if you are increasing rather than decreasing). This is done manually, by combining Grafana statistics for CPU/memory utilization with the output of gnt-node list for disk space utilization. After that, issue the following command to increase/decrease the memory size and the number of virtual CPUs assigned to a VM:
gnt-instance modify -B mem=<X>[gm],vcpus=<N> <fqdn>
where X and N are numbers. X can be suffixed with g or m for gigabytes or megabytes (please don't do terabytes ;))
Adding space to an existing disk is possible, but do note that resizing partitions and filesystems is up to you, as Ganeti can't do it for you. The command would be:
gnt-instance modify --disk #:size=X[gmt] <fqdn>
where # is the index of the disk, starting from 0. You can get the disks allocated to a VM with gnt-instance info <fqdn>. Again, X is a number suffixed for gigabytes/megabytes/terabytes.
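For completeness, after growing the disk on the Ganeti side, the guest must grow its partition and filesystem itself. A hedged sketch, assuming the common case of an ext4 filesystem on /dev/vda1 (device names vary, and growpart comes from the cloud-guest-utils package); the commands are echoed as a dry run:

```shell
# Dry-run sketch of growing the guest-side partition and filesystem.
guest_plan=$(
    echo "growpart /dev/vda 1"     # grow partition 1 to fill the enlarged disk
    echo "resize2fs /dev/vda1"     # grow the ext4 filesystem to the partition
)
echo "$guest_plan"
```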
Adding a new disk is also easy, if you want to avoid the mess of resizing partitions/filesystems. The command would be:
gnt-instance modify --disk add:size=X[gmt] <fqdn>
Again X is a physical number suffixed for Gigabytes/Megabytes/Terabytes.
All commands that have a Y/N prompt can be forced with -f. For example, the following will spare you the prompt:
gnt-instance remove -f <fqdn>
All commands are actually jobs. If you would rather not wait on the prompt, --submit will do the trick:
gnt-instance shutdown --submit <fqdn>