Portal:Cloud VPS/Admin/Procedures and operations
imported>Arturo Borrero Gonzalez (→nova: mention puppet and services) |
imported>Arturo Borrero Gonzalez (→Canary VM instance in every hypervisor: refresh image uuid to a buster one) |
Revision as of 10:04, 7 January 2020
This page describes some standard admin procedures and operations for our Cloud VPS deployments, specifically for the jessie/mitaka/neutron combinations.
Manual routing failover
In the old nova-network days, a very long procedure was required to manually fail over from a dead or under-maintenance network node (typically cloudnetXXXX|labnetXXXX).
Nowadays it is much simpler. This procedure assumes you want to move the active service from one node to the other:
Examples of neutron operations:
root@cloudcontrol1003:~# neutron router-list
+--------------------------------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------+
| id | name | external_gateway_info | distributed | ha |
+--------------------------------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------+
| d93771ba-2711-4f88-804a-8df6fd03978a | cloudinstances2b-gw | {"network_id": "5c9ee953-3a19-4e84-be0f-069b5da75123", "enable_snat": true, "external_fixed_ips": [{"subnet_id": "e4fb2771-a361-4add-ac4e-280cc300c59f", "ip_address": "10.64.22.4"}]} | False | True |
+--------------------------------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------+
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | standby |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | active |
+--------------------------------------+--------------+----------------+-------+----------+
user@cloudnet1004:~ $ sudo systemctl stop neutron-metadata-agent.service neutron-dhcp-agent.service neutron-l3-agent.service neutron-linuxbridge-agent.service
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | active |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | xxx | standby |
+--------------------------------------+--------------+----------------+-------+----------+
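To check which node currently holds the active router, the table printed by neutron l3-agent-list-hosting-router can be parsed rather than read by eye. The following is a minimal sketch, assuming the output keeps the column layout shown in the transcript above; parse_table and active_hosts are hypothetical helper names, not part of any OpenStack tooling.

```python
def parse_table(text):
    """Parse an ASCII table (as printed by the neutron CLI) into a list of dicts.

    Assumption: cells never contain a literal '|' character.
    """
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +-----+ borders, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first '|' row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def active_hosts(text):
    """Return the hosts whose ha_state column reads 'active'."""
    return [row["host"] for row in parse_table(text) if row["ha_state"] == "active"]


SAMPLE = """\
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | standby |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | active |
+--------------------------------------+--------------+----------------+-------+----------+
"""

if __name__ == "__main__":
    print(active_hosts(SAMPLE))
```

Running the same check before and after stopping the services should show the active host moving from one cloudnet node to the other.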
Alternatively, you can use other neutron commands to manage the agents.
Examples of other neutron commands:
root@cloudcontrol1003:~# neutron agent-update 970df1d1-505d-47a4-8d35-1b13c0dfe098 --admin-state-down
Updated agent: 970df1d1-505d-47a4-8d35-1b13c0dfe098
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | active |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | False | :-) | standby |
+--------------------------------------+--------------+----------------+-------+----------+
root@cloudcontrol1003:~# neutron agent-update 970df1d1-505d-47a4-8d35-1b13c0dfe098 --admin-state-up
Updated agent: 970df1d1-505d-47a4-8d35-1b13c0dfe098
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | active |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | standby |
+--------------------------------------+--------------+----------------+-------+----------+
root@cloudcontrol1003:~# neutron l3-agent-router-remove 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b d93771ba-2711-4f88-804a-8df6fd03978a
Removed router d93771ba-2711-4f88-804a-8df6fd03978a from L3 agent
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | standby |
+--------------------------------------+--------------+----------------+-------+----------+
root@cloudcontrol1003:~# neutron l3-agent-router-add 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b d93771ba-2711-4f88-804a-8df6fd03978a
Added router d93771ba-2711-4f88-804a-8df6fd03978a to L3 agent
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | active |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | standby |
+--------------------------------------+--------------+----------------+-------+----------+
At the time of writing, it is not known which method causes less network downtime.
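One possible way to compare the methods would be to probe a VM at a fixed interval during each failover and compute the longest outage from the recorded samples. The sketch below only covers the computation; how the probes are collected (ping, HTTP check, etc.) is left open, and longest_outage is a hypothetical helper name.

```python
def longest_outage(samples):
    """Return the longest outage, in seconds, seen in a series of probes.

    samples: iterable of (timestamp_seconds, ok) tuples sorted by time.
    An outage runs from the first failed probe of a run of failures to the
    next successful probe, or to the last sample if it never recovers.
    """
    longest = 0.0
    outage_start = None
    last_timestamp = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp  # a new run of failures begins
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None  # connectivity recovered
        last_timestamp = timestamp
    if outage_start is not None:
        # the series ended while probes were still failing
        longest = max(longest, last_timestamp - outage_start)
    return longest


if __name__ == "__main__":
    probes = [(0.0, True), (1.0, True), (2.0, False), (3.0, False),
              (4.0, True), (5.0, False), (6.0, True)]
    print(longest_outage(probes))
```

The result is only as precise as the probe interval, so a short interval is needed to tell the methods apart.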
Remove hypervisor
Follow this procedure to remove a virtualization server (typically cloudvirtXXXX|labvirtXXXX).
- Remove or shut down the node.
- openstack hypervisor list will still show it.
- nova service-list will show it as down once it's taken away:
| 9 | nova-compute | labtestvirt2003 | nova | disabled | down | 2017-12-18T20:52:59.000000 | AUTO: Connection to libvirt lost: 0 |
- nova service-delete 9 will remove it, where the number is the id from nova service-list.
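The lookup of the service id can be sketched as a small parser over the nova service-list output. This is a minimal sketch that assumes the column order matches the row shown above (id, binary, host, zone, status, state, ...); down_compute_services and delete_commands are hypothetical helper names, and the generated commands should be reviewed before being run.

```python
def down_compute_services(text, host=None):
    """Return (service_id, host) pairs for nova-compute services in state 'down'.

    Assumption: columns are ordered as in the sample row above, i.e.
    id, binary, host, zone, status, state, updated_at, disabled reason.
    """
    found = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip table borders and prompts
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 6 or not cells[0].isdigit():
            continue  # header row or unexpected layout
        service_id, binary, service_host, state = cells[0], cells[1], cells[2], cells[5]
        if binary != "nova-compute" or state != "down":
            continue
        if host is None or service_host == host:
            found.append((int(service_id), service_host))
    return found


def delete_commands(text, host=None):
    """Build the 'nova service-delete <id>' command for each matching service."""
    return [f"nova service-delete {service_id}"
            for service_id, _ in down_compute_services(text, host)]


ROW = ("| 9 | nova-compute | labtestvirt2003 | nova | disabled | down "
       "| 2017-12-18T20:52:59.000000 | AUTO: Connection to libvirt lost: 0 |")

if __name__ == "__main__":
    for command in delete_commands(ROW, host="labtestvirt2003"):
        print(command)
```

Passing the host explicitly avoids deleting the service record of another hypervisor that happens to be down at the same time.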
VM/Hypervisor pinning
If you want to run a specific VM on a specific hypervisor, pass the hypervisor as the availability zone at instance creation time:
OS_TENANT_NAME=myproject openstack server create --flavor 3 --image b7274e93-30f4-4567-88aa-46223c59107e --availability-zone host:cloudvirtXXXX myinstance
Canary VM instance in every hypervisor
Each hypervisor should have a canary VM instance running.
The command to create it should be something like:
root@cloudcontrol1004:~# OS_TENANT_NAME=testlabs openstack server create --flavor 2 --image 031d2d76-8368-4066-a502-d28107d0195e --availability-zone host:cloudvirt1007 --nic net-id=7425e328-560c-4f00-8e99-706f3fb90bb4 canary1007-01
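When creating canaries for several hypervisors, the command line can be generated instead of edited by hand. The sketch below assumes the naming pattern canaryNNNN-01 inferred from the single example above (NNNN being the digits of the hypervisor name); canary_create_command is a hypothetical helper, and the image and network ids must be supplied by the caller.

```python
import re
import shlex


def canary_create_command(hypervisor, image, net_id,
                          flavor="2", project="testlabs", index=1):
    """Build the 'openstack server create' command for a canary VM.

    Assumption: the canary is named canary<digits>-<index>, where <digits>
    are the trailing digits of the hypervisor name (cloudvirt1007 -> 1007).
    """
    match = re.fullmatch(r"[a-z]+(\d+)", hypervisor)
    if match is None:
        raise ValueError(f"unexpected hypervisor name: {hypervisor!r}")
    name = f"canary{match.group(1)}-{index:02d}"
    argv = [
        "openstack", "server", "create",
        "--flavor", flavor,
        "--image", image,
        # 'host:<name>' pins the instance to one hypervisor
        "--availability-zone", f"host:{hypervisor}",
        "--nic", f"net-id={net_id}",
        name,
    ]
    return f"OS_TENANT_NAME={shlex.quote(project)} " + shlex.join(argv)


if __name__ == "__main__":
    print(canary_create_command(
        "cloudvirt1007",
        image="031d2d76-8368-4066-a502-d28107d0195e",
        net_id="7425e328-560c-4f00-8e99-706f3fb90bb4",
    ))
```

The function only returns the command as a string, so it can be inspected before being pasted into a cloudcontrol shell.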
Updating openstack database password
OpenStack uses many databases, and updating a password requires several steps.
nova
We usually have the same password for the different nova databases, nova_eqiad1 and nova_api_eqiad1.
- in the puppet private repo (in puppetmaster1001.eqiad.wmnet), update the profile::openstack::eqiad1::nova::db_pass hiera key in hieradata/eqiad/profile/openstack/eqiad1/nova.yaml.
- in the puppet private repo (in puppetmaster1001.eqiad.wmnet), update class passwords::openstack::nova in modules/passwords/manifests/init.pp.
- in the database (m5-master.eqiad.wmnet), update grants, something like:
GRANT ALL PRIVILEGES ON nova_api_eqiad1.* TO 'nova'@'208.80.153.x' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_api_eqiad1.* TO 'nova'@'%' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_eqiad1.* TO 'nova'@'208.80.153.x' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_eqiad1.* TO 'nova'@'%' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_cell0_eqiad1.* TO 'nova'@'208.80.153.x' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_cell0_eqiad1.* TO 'nova'@'%' IDENTIFIED BY '<%= @db_pass %>';
- repeat the grants for every cloudcontrol server IPv4 and IPv6 address.
- update cell mapping database connection string (yes, inside the database itself) in m5-master.eqiad.wmnet:
$ mysql nova_api_eqiad1;
[nova_api_eqiad1]> update cell_mappings set database_connection='mysql://nova:<password>@m5-master.eqiad.wmnet/nova_eqiad1' where id=4;
[nova_api_eqiad1]> update cell_mappings set database_connection='mysql://nova:<password>@m5-master.eqiad.wmnet/nova_cell0_eqiad1' where id=1;
- run puppet everywhere (in cloudcontrol servers etc) so the new password is added to the config files.
- if puppet is not restarting the affected services, restart them by hand (systemctl restart nova-api, etc)
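Since the grants have to be repeated for every database and every cloudcontrol address, the statements can be generated rather than typed. The sketch below is a rough illustration, not a tested procedure: grant_statements and cell_connection are hypothetical helpers, the quoting assumes MySQL's default backslash escaping, and the percent-encoding of the password assumes the connection string is parsed as a URL.

```python
from urllib.parse import quote

# Databases taken from the GRANT examples above.
DATABASES = ("nova_api_eqiad1", "nova_eqiad1", "nova_cell0_eqiad1")


def grant_statements(hosts, password, user="nova", databases=DATABASES):
    """Return one GRANT statement per (database, host) pair."""
    # Assumption: default SQL mode, where backslash escapes are honoured.
    escaped = password.replace("\\", "\\\\").replace("'", "\\'")
    return [
        f"GRANT ALL PRIVILEGES ON {database}.* TO '{user}'@'{host}' "
        f"IDENTIFIED BY '{escaped}';"
        for database in databases
        for host in hosts
    ]


def cell_connection(password, database, user="nova",
                    server="m5-master.eqiad.wmnet"):
    """Build the database_connection value for the cell_mappings table."""
    # Percent-encode the password so characters such as '@' or '/'
    # do not break the URL structure.
    return f"mysql://{user}:{quote(password, safe='')}@{server}/{database}"


if __name__ == "__main__":
    for statement in grant_statements(["208.80.153.x", "%"], "example-password"):
        print(statement)
    print(cell_connection("example-password", "nova_eqiad1"))
```

The host list passed in should contain every cloudcontrol address, both IPv4 and IPv6, so that no grant is left with the old password.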
neutron
TODO: add information.
glance
TODO: add information.
designate
TODO: add information.
keystone
TODO: add information.