Network design - Eqiad WMCS Network Infra
This page details the configuration of the network devices managed by SRE Infrastructure Foundations (netops) to support cloud services in the Eqiad (Equinix, Ashburn) datacenter. Further information on the overall WMCS networking setup, including elements managed by the WMCS team themselves, is on the Portal:Cloud VPS/Admin/Network page.
The dedicated physical network currently consists of 4 racks of equipment, with 6 Juniper QFX-series switches deployed across them. Additionally, rack C8 is connected to the virtual-chassis switches in row B, to provide connectivity for legacy servers installed there.
Racks C8 and D5 each have 2 switches: a main switch, which connects servers and also has an uplink to one of the core routers, and a second switch, which provides additional ports for servers. Most cloud hosts consume 2 switch ports, so a single 48-port switch is not sufficient to connect all hosts in these racks; hence the second switch in each.
Racks E4 and F4 currently have only a single top-of-rack switch each, and it is hoped that in time WMCS can adjust the server configs to use 802.1Q VLAN tagging, so that separate physical ports are not required to connect to two or more networks.
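As an illustrative sketch of what that change might look like on a Juniper QFX switch (the interface and VLAN names here are hypothetical, not taken from the live config), a single server-facing port can carry several networks as 802.1Q-tagged VLANs:

```
interfaces {
    xe-0/0/10 {
        description "example cloud host - 802.1Q trunk";
        unit 0 {
            family ethernet-switching {
                interface-mode trunk;
                vlan {
                    /* hypothetical vlan names; both are carried,
                       tagged, on the same physical port */
                    members [ cloud-hosts cloud-instances ];
                }
            }
        }
    }
}
```

With tagging in place, each host would need only one physical connection per NIC rather than one switch port per network.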
The network is configured in a basic spine/leaf structure: the switches in C8 and D5 act as spines, aggregating traffic from E4 and F4 and connecting to the outside world via the CR routers. Connections between racks E4/F4 and C8/D5 are 40G Ethernet (40GBASE-LR4) optical links over single-mode fiber. The topology is not a perfect spine/leaf, however, as there is also a direct connection between cloudsw1-c8 and cloudsw1-d5. This is required principally because there is only a single uplink from each cloudsw to the CR routers, so an alternate path is needed in case of a link or CR failure.
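As a sketch only (port numbers, MTU, and addressing below are assumptions, not the actual config), the direct 40G link between the two spine switches might appear on cloudsw1-c8 along these lines:

```
interfaces {
    et-0/0/52 {
        description "cloudsw1-d5 direct link (40GBASE-LR4)";  /* port number hypothetical */
        unit 0 {
            family inet {
                address 10.0.0.0/31;   /* hypothetical point-to-point addressing */
            }
        }
    }
}
```

A routed point-to-point link like this gives the IGP an alternate path to the other spine's CR uplink if either switch loses its own uplink.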
Several networks are configured on the switches described in the last section. At a high-level networks are divided into the "cloud" and "production" realms, which are logically isolated from each other. This isolation is used to support the agreed Cross-Realm traffic guidelines.
Isolation is achieved through the use of VLANs and VRFs (routing-instances in JunOS) on the cloudsw devices. The default routing-instance on the cloudsw devices is used for production-realm traffic, and a named routing-instance, 'cloud', is used for the cloud realm.
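A minimal sketch of the kind of configuration this implies (the instance type and interface name are assumptions; the actual cloudsw config may differ):

```
routing-instances {
    cloud {
        instance-type virtual-router;   /* separate routing table for cloud-realm traffic */
        interface irb.1107;             /* hypothetical layer-3 interface in a cloud vlan */
    }
}
```

Interfaces listed under the instance are removed from the default routing table, so cloud-realm and production-realm routes never mix on the switch.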
Some networks exist purely at layer 2: the switches only forward traffic between servers based on destination MAC address, and remain unaware of the IP addressing used on those layer-2 segments. These networks carry only traffic internal to the cloud realm, and specific hosts, such as cloudnet and cloudgw, act as the layer-3 routers for them. They are not technically part of the cloud VRF, as no IP interfaces on the switches belong to them, but they are considered part of the cloud realm.
The WMCS cloudnet hosts, implementing the OpenStack Neutron router, act as the gateway for these networks.
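Concretely, such a layer-2-only network might be defined on the switches along the following lines (the VLAN name and ID are hypothetical); the key point is the absence of any l3-interface statement, which is what keeps the switch unaware of the IP addressing:

```
vlans {
    cloud-instances {
        vlan-id 1105;   /* hypothetical ID */
        /* no "l3-interface irb.X" here: the switch only bridges frames,
           and the cloudnet hosts (Neutron) provide the layer-3 gateway */
    }
}
```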