Portal:Cloud VPS/Admin/Network: Difference between revisions
imported>Arturo Borrero Gonzalez (→Ingress & Egress: refresh section) |
imported>David Caro |
Revision as of 16:43, 19 May 2021
This page explains how the CloudVPS network works, including the neutron Openstack component.
For the sake of explanation, this document uses the eqiad1 deployment as an example, but other deployments use the same mechanisms.
Network topology
There are 2 different kinds of networks involved:
- control plane networks: those used in physical servers for SSH, puppet, monitoring, etc. This is a wiki-production network, usually in the 10.x.x.x range.
- data plane networks: those used by CloudVPS virtual clients, and all the traffic doing ingress/egress through the edge of the network.
There are 3 routers involved:
- The neutron virtual router (by means of neutron-l3-agent, neutron-linuxbridge-agent, neutron-server, etc). This router connects the internal software-defined networks to the cloud edge network.
- The physical cloudgw router (a pair of linux servers). This router is the main gateway for all CloudVPS ingress/egress traffic, and is the main network endpoint facing the public internet.
- The physical cloudsw router. This router connects cloudgw to the rest of the internet, including wiki-production networks.
Datacenter network
cloudvirts
- control plane: primary interface (for example eth0) connected to the physical switch in their rack. The switch port connecting to this interface doesn't need any specific configuration.
- data plane: secondary interface (for example eth1) connected to the physical switch in their rack. This switch port must be configured in VLAN tagged mode for vlan 1105.
There has been some research on whether we should collapse the 2 interfaces into one, aiming to reduce usage of 10G ports on the switches. The initial research showed promising results, but we have not introduced this change yet.
cloudnet
- control plane: primary interface (for example eth0) connected to the physical switch in their rack. The switch port connecting to this interface doesn't need any specific configuration.
- data plane: secondary interface (for example eth1) connected to the physical switch in their rack. This switch port must be configured as a VLAN trunk with vlan 1105 and vlan 1107.
cloudgw
- control plane: primary interface (for example eth0) connected to the physical switch in their rack. The switch port connecting to this interface doesn't need any specific configuration.
- data plane: secondary interface (for example eth1) connected to the physical switch in their rack. This switch port must be configured as a VLAN trunk with vlan 1120 and vlan 1107.
ceph osd
- ceph control (ssh, monitoring, mon communication, client communication) plane:
  - Primary interface on external card (for example ens2f0np0)
  - 10.64.20.0/24 network
  - Connected to the physical switch in their rack
  - The switch port connecting to this interface needs to configure untagged vlan 1118 (cloud-hosts1-eqiad).
- ceph data plane (osd to osd communication):
  - Secondary interface on external card (for example ens2f1np1)
  - 192.168.4.0/24 network
  - Connected to the physical switch in their rack
  - The switch port connecting to this interface needs to configure untagged vlan 1105 (cloud-storage1-eqiad).
ceph mons
- ceph control (ssh, monitoring, client communication, osd communication) plane:
  - Primary interface on external card (for example ens2f0np0)
  - 10.64.20.0/24 network
  - Connected to the physical switch in their rack
  - The switch port connecting to this interface needs to configure untagged vlan 1118 (cloud-hosts1-eqiad).
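The data plane switch port requirements listed in this section can be summarized as a small lookup table. The following is only an illustrative sketch: the dictionary layout, the role keys and the function name are assumptions made for this example and do not correspond to any existing tooling.

```python
# Data plane switch port expectations per host role, transcribed from the
# lists in this section. The structure and names are illustrative only.
DATA_PLANE_PORTS = {
    "cloudvirt": {"mode": "tagged", "vlans": [1105]},
    "cloudnet": {"mode": "trunk", "vlans": [1105, 1107]},
    "cloudgw": {"mode": "trunk", "vlans": [1120, 1107]},
    "ceph-osd": {"mode": "untagged", "vlans": [1105]},
}


def roles_using_vlan(vlan_id):
    """Return the sorted host roles whose data plane port carries vlan_id."""
    return sorted(
        role for role, port in DATA_PLANE_PORTS.items() if vlan_id in port["vlans"]
    )


print(roles_using_vlan(1107))
```

A lookup like this makes it easy to see which host roles are affected when a given VLAN changes on the switches.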
Edge network
- neutron manages floating IP NAT and all the software defined network in the virtual realm.
- cloudgw handles routing_source_ip and dmz_cidr and connects neutron to cloudsw.
- cloudsw connects to the internet and the rest of wiki-production networks.
Virtual network
TODO. Inside the virtual realm.
Topology data example
In the case of the eqiad1 deployment, the relevant elements are:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN for instances | lan-flat-cloudinstances2b | cloud-instances2-b-eqiad | cloud-instances2-b-eqiad (vlan 1105) | 172.16.0.0/21 | vlan 1105 cidr |
WAN for floating IPs | wan-transport-eqiad | cloud-eqiad1-floating | --- (no vlan) | 185.15.56.0/25 | cidr |
WAN for transport | wan-transport-eqiad | cloud-gw-transport-eqiad | cloud-gw-transport-eqiad (vlan 1107) | 185.15.56.236/30 | vlan 1107 cidr |
LAN provider (control plane) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-b-eqiad (vlan 1118) | 10.64.20.0/24 | vlan 1118 cidr |
WAN for transport | --- (ignored by neutron) | --- (ignored by neutron) | cloud-instances-transport1-b-eqiad (vlan 1120) | 185.15.56.240/29 | vlan 1120 cidr |
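As a quick aid for reading the eqiad1 table, the sketch below maps an IP address to the subnet that contains it, using Python's standard ipaddress module. The subnet names and ranges are copied from the table; the dictionary and the function name subnet_for are made up for this example.

```python
import ipaddress

# Physical or Neutron subnet name -> addressing, copied from the eqiad1 table.
EQIAD1_SUBNETS = {
    "cloud-instances2-b-eqiad": "172.16.0.0/21",
    "cloud-eqiad1-floating": "185.15.56.0/25",
    "cloud-gw-transport-eqiad": "185.15.56.236/30",
    "cloud-hosts1-b-eqiad": "10.64.20.0/24",
    "cloud-instances-transport1-b-eqiad": "185.15.56.240/29",
}


def subnet_for(address):
    """Return the name of the eqiad1 subnet containing address, or None."""
    ip = ipaddress.ip_address(address)
    for name, cidr in EQIAD1_SUBNETS.items():
        if ip in ipaddress.ip_network(cidr):
            return name
    return None


print(subnet_for("172.16.1.20"))
print(subnet_for("185.15.56.244"))
```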
In the case of the codfw1dev deployment, the relevant elements are:
What | Neutron network object | Neutron subnet object | Physical name | Addressing | Netbox |
---|---|---|---|---|---|
LAN for instances | lan-flat-cloudinstances2b | cloud-instances2-b-codfw | cloud-instances2-b-codfw (vlan 2105) | 172.16.128.0/24 | vlan 2105 cidr |
WAN for floating IPs | wan-transport-codfw | cloud-codfw1dev-floating | --- (no vlan) | 185.15.57.0/29 | cidr |
WAN for transport | wan-transport-codfw | cloud-gw-transport-codfw | cloud-gw-transport-codfw (vlan 2107) | 185.15.57.8/30 | vlan 2107 cidr |
LAN provider (HW servers) | --- (ignored by neutron) | --- (ignored by neutron) | cloud-hosts1-b-codfw (vlan 2118) | 10.192.20.0/24 | vlan 2118 cidr |
WAN for transport | --- (ignored by neutron) | --- (ignored by neutron) | cloud-instances-transport1-b-codfw (vlan 2120) | 185.15.56.240/29 | vlan 2120 cidr |
Ingress & Egress
Some notes on the ingress & egress particularities.
routing_source_ip
By default, all the traffic from VMs to the Internet (egress) is source NATed using a single IPv4 address. This address is called routing_source_ip.
There are 2 cases in which this egress NAT is not applied:
- the VM->destination pair is part of the dmz_cidr exclusions
- the VM has an explicit floating ip associated (the floating ip will be used as both SNAT and DNAT)
dmz_cidr
The dmz_cidr mechanism allows us to define certain IP ranges that VMs can talk to directly, without NAT being involved.
A typical configuration per deployment looks like (please refer to ops/puppet.git for actual hiera values):
profile::openstack::eqiad1::cloudgw::dmz_cidr:
  # VMs --> wiki (text-lb.eqiad)
  - "172.16.0.0/21 . 208.80.154.224"
  # VMs --> wiki (upload-lb.eqiad)
  - "172.16.0.0/21 . 208.80.154.240"
You can read this config as: do not apply NAT to connections src:dst, src:dst, src:dst.
Please note that the dmz_cidr mechanism takes precedence over the routing_source_ip configuration.
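To make the "src . dst" reading concrete, here is a minimal sketch that parses entries in that form and checks whether a given connection would skip NAT. It only illustrates the matching logic; the function names are invented for this example, and the real exclusions are enforced by the firewall rules generated from the hiera values, not by code like this.

```python
import ipaddress


def parse_dmz_cidr(entries):
    """Turn 'src . dst' strings into (source network, destination network) pairs."""
    pairs = []
    for entry in entries:
        src, dst = (part.strip() for part in entry.split(" . "))
        pairs.append((ipaddress.ip_network(src), ipaddress.ip_network(dst)))
    return pairs


def skips_nat(pairs, src_ip, dst_ip):
    """True if the src_ip -> dst_ip connection matches any exclusion pair."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    return any(src in src_net and dst in dst_net for src_net, dst_net in pairs)


# The two example entries shown in the configuration above.
DMZ = parse_dmz_cidr([
    "172.16.0.0/21 . 208.80.154.224",
    "172.16.0.0/21 . 208.80.154.240",
])

print(skips_nat(DMZ, "172.16.1.5", "208.80.154.224"))
print(skips_nat(DMZ, "172.16.1.5", "8.8.8.8"))
```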
A static route is required on the routers so return traffic knows what path to take to reach the Cloud Private IPs.
For example on cr1/2-eqiad: routing-options static route 172.16.0.0/21 next-hop 185.15.56.244/29
Floating IPs
This mechanism allows us to create an additional public IPv4 address in Neutron. This new IP address is then associated with a given instance, and all of its egress/ingress traffic will use it (both SNAT and DNAT).
Because public IPv4 addresses are a limited resource, a quota must be assigned to a project beforehand.
Please note that the dmz_cidr mechanism overrides floating IP NAT configurations, and you can see non-NATed packets arriving at VMs with a floating IP assigned.
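The precedence between the three egress behaviours (dmz_cidr exclusion first, then floating IP, then routing_source_ip) can be sketched as a single decision function. This is a simplified model for illustration only: the function name and arguments are invented, and the dmz_cidr pair used in the example is hypothetical. The floating IP mapping and routing_source_ip are the codfw1dev values that appear elsewhere on this page.

```python
import ipaddress


def egress_source(vm_ip, dst_ip, dmz_pairs, floating_ips, routing_source_ip):
    """Return the source address the destination sees for a VM connection."""
    src = ipaddress.ip_address(vm_ip)
    dst = ipaddress.ip_address(dst_ip)
    # 1. dmz_cidr exclusions win over everything: no NAT at all.
    for src_net, dst_net in dmz_pairs:
        if src in ipaddress.ip_network(src_net) and dst in ipaddress.ip_network(dst_net):
            return vm_ip
    # 2. A floating IP, if the VM has one, is used for SNAT.
    if vm_ip in floating_ips:
        return floating_ips[vm_ip]
    # 3. Otherwise the shared routing_source_ip is used.
    return routing_source_ip


DMZ_PAIRS = [("172.16.128.0/24", "208.80.154.224/32")]  # hypothetical exclusion
FLOATING = {"172.16.128.19": "185.15.57.2"}
ROUTING_SOURCE_IP = "185.15.57.1"

print(egress_source("172.16.128.19", "208.80.154.224", DMZ_PAIRS, FLOATING, ROUTING_SOURCE_IP))
print(egress_source("172.16.128.19", "8.8.8.8", DMZ_PAIRS, FLOATING, ROUTING_SOURCE_IP))
print(egress_source("172.16.128.50", "8.8.8.8", DMZ_PAIRS, FLOATING, ROUTING_SOURCE_IP))
```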
Here is an example of 3 software defined floating IPs created by Neutron in the codfw1dev deployment, not using eqiad1 for brevity, but it works exactly the same:
root@cloudnet2003-dev:~ # nft -s list chain ip nat neutron-l3-agent-float-snat
table ip nat {
    chain neutron-l3-agent-float-snat {
        ip saddr 172.16.128.19 counter snat to 185.15.57.2 fully-random
        ip saddr 172.16.128.20 counter snat to 185.15.57.4 fully-random
        ip saddr 172.16.128.26 counter snat to 185.15.57.6 fully-random
    }
}
root@cloudnet2003-dev:~ # nft -s list chain ip nat neutron-l3-agent-OUTPUT
table ip nat {
    chain neutron-l3-agent-OUTPUT {
        ip daddr 185.15.57.2 counter dnat to 172.16.128.19
        ip daddr 185.15.57.4 counter dnat to 172.16.128.20
        ip daddr 185.15.57.6 counter dnat to 172.16.128.26
    }
}
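When auditing floating IPs, it can be handy to turn a listing like the one above into a mapping from instance address to floating IP. The sketch below does this with a regular expression over the SNAT chain output. The function name is invented for this example, and the pattern only assumes the rule format shown in the listing.

```python
import re

# Matches rules such as:
#   ip saddr 172.16.128.19 counter snat to 185.15.57.2 fully-random
SNAT_RULE = re.compile(r"ip saddr (\S+) counter snat to (\S+)")


def floating_ip_map(nft_output):
    """Map each instance address to its floating IP, from the float-snat chain."""
    return {m.group(1): m.group(2) for m in SNAT_RULE.finditer(nft_output)}


SAMPLE = """\
table ip nat {
    chain neutron-l3-agent-float-snat {
        ip saddr 172.16.128.19 counter snat to 185.15.57.2 fully-random
        ip saddr 172.16.128.20 counter snat to 185.15.57.4 fully-random
        ip saddr 172.16.128.26 counter snat to 185.15.57.6 fully-random
    }
}
"""

for instance_ip, floating_ip in floating_ip_map(SAMPLE).items():
    print(instance_ip, "->", floating_ip)
```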
traffic from instance to own floating IP
A VM instance may try to send traffic to its own floating IP. As described in T217681#5035533 - Cloud VPS instance with floating (public) IP can not ping that IP directly, this is not possible with the default configuration.
The packet arriving at the VM instance would be a martian packet.
A workaround for this is to instruct the network stack to allow this kind of martian packet:
sysctl net.ipv4.conf.all.accept_local=1
accept_local - BOOLEAN
    Accept packets with local source addresses. In combination with suitable routing, this can be used to direct packets between two local interfaces over the wire and have them accepted properly.
    default FALSE
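The effect of accept_local can be pictured with a deliberately simplified model: an incoming packet whose source address is one of the host's own addresses is dropped as martian unless accept_local is enabled. The function below is only a sketch of that single check. It ignores routing, reverse path filtering and everything else the kernel does, and its name and arguments are invented for this example.

```python
def accepts_packet(src_ip, local_addresses, accept_local=False):
    """Simplified model of the local-source check on incoming packets.

    A packet whose source is one of the host's own addresses is treated as
    martian and dropped, unless accept_local is enabled.
    """
    if src_ip in local_addresses:
        return accept_local
    return True


local = {"172.16.128.19"}
print(accepts_packet("172.16.128.19", local))
print(accepts_packet("172.16.128.19", local, accept_local=True))
```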
ingress & egress data example
Some important IP addresses in the eqiad1 deployment:
Type | Name | Address | Explanation | Where is defined, where to change it | DNS FQDN |
---|---|---|---|---|---|
ingress | incoming gateway | 185.15.56.244/29 | neutron address in the WAN transport subnet for ingress | Core routers (static route) & neutron main router object | cloudinstances2b-gw.openstack.eqiad1.wikimediacloud.org |
egress | routing_source_ip | 185.15.56.1 | IP address for main source NAT for VMs (mind dmz_cidr exclusions) | /etc/neutron/l3_agent.ini in cloudnet nodes (puppet). No NIC has this IP assigned. | nat.openstack.eqiad1.wikimediacloud.org |
Some important IP addresses in the codfw1dev deployment:
Type | Name | Address | Explanation | Where is defined, where to change it | DNS FQDN |
---|---|---|---|---|---|
ingress | incoming gateway | 208.80.153.190/29 | neutron address in the WAN transport subnet for ingress | Core routers (static route) & neutron main router object | cloudinstances2b-gw.openstack.codfw1dev.wikimediacloud.org |
egress | routing_source_ip | 185.15.57.1 | IP address for main source NAT for VMs (mind dmz_cidr exclusions) | /etc/neutron/l3_agent.ini in cloudnet nodes (puppet). No NIC has this IP assigned. | nat.openstack.codfw1dev.wikimediacloud.org |
What Neutron is doing
This section sheds some light on how Neutron implements our network topology under the hood, and what it does with all this configuration.
Neutron uses 2 specific boxes: cloudnetXXXX.site.wmnet and cloudnetXXXX.site.wmnet (active-standby).
The neutron-server service (daemon, API, etc) runs on cloudcontrol boxes.
All the agents run on cloudnet boxes; neutron-linuxbridge-agent additionally runs on cloudvirt boxes.
Example of running agents:
root@cloudcontrol1003:~# neutron agent-list
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
| id | agent_type | host | availability_zone | alive | admin_state_up | binary |
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
| 468aef2a-8eb6-4382-abba-bc284efd9fa5 | DHCP agent | cloudnet1004 | nova | :-) | True | neutron-dhcp-agent |
| 601bef99-b53c-4e6a-b384-65d1feebedff | Metadata agent | cloudnet1003 | | :-) | True | neutron-metadata-agent |
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | L3 agent | cloudnet1003 | nova | :-) | True | neutron-l3-agent |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | L3 agent | cloudnet1004 | nova | :-) | True | neutron-l3-agent |
| 9f8833de-11a4-4395-8da5-f57fe8326659 | Linux bridge agent | cloudnet1003 | | :-) | True | neutron-linuxbridge-agent |
| ad3461d7-b79e-4279-921d-5a476e296767 | Linux bridge agent | cloudnet1004 | | :-) | True | neutron-linuxbridge-agent |
| b2f9da63-2f16-4aa5-9400-ae708a733f91 | Linux bridge agent | cloudvirt1021 | | :-) | True | neutron-linuxbridge-agent |
| d475e07d-52b3-476e-9a4f-e63b21e1075e | Metadata agent | cloudnet1004 | | :-) | True | neutron-metadata-agent |
| e382a233-e6a0-422e-9d2e-5651082783fc | Linux bridge agent | cloudvirt1022 | | :-) | True | neutron-linuxbridge-agent |
| ff2a8228-3748-4588-927b-4b6563da9ca0 | DHCP agent | cloudnet1003 | nova | :-) | True | neutron-dhcp-agent |
+--------------------------------------+--------------------+---------------+-------------------+-------+----------------+---------------------------+
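To check at a glance which agents run on which box, the table printed by the agent listing can be parsed and grouped by host. The following is a minimal sketch that works on the text output as shown above. The helper names are invented for this example, and the sample is a subset of the rows from the listing.

```python
from collections import defaultdict


def parse_table(text):
    """Parse an ASCII table with '|' separated cells into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip '+----+' border lines and anything else
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def agents_by_host(rows):
    """Group agent binaries by the host they run on."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["host"]].append(row["binary"])
    return {host: sorted(binaries) for host, binaries in grouped.items()}


SAMPLE = """\
+----+----------------+---------------+-------------------+-------+----------------+---------------------------+
| id | agent_type | host | availability_zone | alive | admin_state_up | binary |
+----+----------------+---------------+-------------------+-------+----------------+---------------------------+
| 468aef2a-8eb6-4382-abba-bc284efd9fa5 | DHCP agent | cloudnet1004 | nova | :-) | True | neutron-dhcp-agent |
| 601bef99-b53c-4e6a-b384-65d1feebedff | Metadata agent | cloudnet1003 | | :-) | True | neutron-metadata-agent |
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | L3 agent | cloudnet1003 | nova | :-) | True | neutron-l3-agent |
| b2f9da63-2f16-4aa5-9400-ae708a733f91 | Linux bridge agent | cloudvirt1021 | | :-) | True | neutron-linuxbridge-agent |
+----+----------------+---------------+-------------------+-------+----------------+---------------------------+
"""

for host, binaries in sorted(agents_by_host(parse_table(SAMPLE)).items()):
    print(host, binaries)
```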
When a virtual router is created, and assigned to an l3-agent, a linux network namespace (netns for short) will be created:
Example virtual router netns and l3 agents hosting routers:
root@cloudnet1004:~# ip netns list | grep router
qrouter-d93771ba-2711-4f88-804a-8df6fd03978a
root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
+--------------------------------------+--------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+--------------+----------------+-------+----------+
| 8af5d8a1-2e29-40e6-baf0-3cd79a7ac77b | cloudnet1003 | True | :-) | active |
| 970df1d1-505d-47a4-8d35-1b13c0dfe098 | cloudnet1004 | True | :-) | standby |
+--------------------------------------+--------------+----------------+-------+----------+
This netns will hold all the configuration: IP addresses (such as gateways, floating IPs), iptables rules (NAT, filtering, etc), and other information (static routes, etc).
Using virtual taps, this automatically-generated netns is connected to the main netns where the physical NICs live, along with bridges and vlan tagged subinterfaces.
All this is done in the eth1 interface, while eth0 is left for connection of the cloudnet box to the provider network.
When a virtual router is created, Neutron decides which l3-agent will host it, taking HA parameters into account.
In our active-standby setup, only one l3-agent is active at a time, which means that all this netns/interfaces/iptables configuration is deployed by Neutron to just one node.
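The active-standby expectation (exactly one active l3-agent per router) can be expressed as a small sanity check over rows like those returned when listing the l3 agents hosting a router. This is only a sketch: the function name is invented, and the rows are written by hand from the example output earlier in this section.

```python
def active_l3_host(agents):
    """Return the host of the single active l3-agent, or raise if not exactly one."""
    active = [agent["host"] for agent in agents if agent["ha_state"] == "active"]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active l3-agent, found {len(active)}")
    return active[0]


# Rows transcribed from the l3 agent listing shown earlier in this section.
AGENTS = [
    {"host": "cloudnet1003", "alive": ":-)", "ha_state": "active"},
    {"host": "cloudnet1004", "alive": ":-)", "ha_state": "standby"},
]

print(active_l3_host(AGENTS))
```

Raising an error when zero or several agents report as active makes a broken HA state visible instead of silently picking one node.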
The 'q-' prefix in the netns name comes from earlier development stages, when Neutron was called Quantum.
Security policy
TODO: talk about security groups, dmz_cidr exclusion, core route filtering, etc