Dell Enterprise Sonic Evaluation: Difference between revisions
Revision as of 16:29, 8 August 2022
For many years Wikimedia have used Juniper equipment for all networking requirements (currently edge/WAN routers, datacenter switches, management firewalls). While we are broadly happy with Juniper, it is also imperative to assess alternatives, ensuring the foundation gets value for money and the best performance possible.
Recent years have seen the cost of datacenter switches in particular increasing. This has partially been driven by a gradual move from 1G to faster connections to end-hosts, with the newer equipment supporting 10G+ speeds being pricier. But there have also been increased costs for software licenses, which in the past were part of the 'base' system, pushing up overall costs. The supply-chain / chip shortage problems that emerged from 2020 onwards have only accelerated this trend.
JunOS, Juniper's operating system, stands out in the foundation as one of the largest closed-source / proprietary software systems in use. In many respects this is standard for network devices. These typically use custom ASICs for packet forwarding, and are not based on the largely open x86/amd64 architecture which server operating systems target. The specialized and proprietary nature of such hardware has seen vendors typically offering "vertically integrated" software/hardware stacks since the dawn of the industry.
In more recent years there has been some movement away from this. Driven initially by the large web-scalers, disaggregated or white box switching has risen to prominence. In this model the switching hardware is provided by one company, and the operating system is sourced elsewhere (much like one buys a Dell server and runs Debian or Windows on it without consulting Dell). Such an approach offers many advantages, such as being able to change vendors but keep the same operating system, or to change the OS in use on existing hardware. "White box" switch hardware is typically available for a substantially lower cost than brand-name alternatives. There can be drawbacks, however, such as not having a "one stop shop" for support.
Another caveat is that a small number of ASIC vendors, notably Broadcom, have the switching market carved up. These vendors often gate access to their designs and SDKs, limiting the scope for independent parties to create software for them. In one famous case Broadcom ceased licensing its SDK to Cumulus Networks after it was acquired by rival hardware manufacturer Nvidia. This left some customers forced to choose another hardware supplier or move to another OS when they had to upgrade. So the reality right now is that it is not possible to produce an operating system for switching hardware without permission from the ASIC vendors.
Nevertheless the space has opened up and there are several "white box" NOSes available, even if things won't ever be as open as for server hardware. Options include commercial offerings such as PicOS, ArcOS and OcNOS, as well as open-source projects such as DANOS and OpenSwitch.
Of the various open-source options SONiC has become one of the most popular, with significant industry support. Initially developed by Microsoft to power their Azure cloud service, it has since been open-sourced and become part of the OCP Networking Project, with software development stewarded by the Linux Foundation.
It leverages the Switch Abstraction Interface (SAI), also defined by OCP, to communicate with switching silicon. Significantly, Broadcom has contributed a lot to its development, providing an SAI implementation for their ASICs and also committing to continued support for future silicon they develop.
SONiC is based on Debian Linux, with the SAI added to provide an interface to the switch hardware. This makes it very easy to get to grips with for SREs who are already familiar with Debian. It is a modular distro in which networking applications (e.g., FRR, LLDP, LACP, NAT etc) run independently in dedicated Docker containers, each of which uses Redis as an information source to share configuration and state info.
The modular, Linux-based nature makes it easy for new applications to be developed or added to the platform, as well as for common Linux automation tooling to be leveraged. It can, for instance, run a standard puppet agent installed from upstream Debian repos, or a Prometheus Node Exporter. It ships with various containerized daemons to provide functionality, most notably employing FRRouting for routing protocols such as OSPF/BGP. While each of these sub-components has its own configuration files and syntax, and various YANG models are defined for specific configuration elements, coverage is inconsistent between the various ways to configure devices. More recently the Management Framework has been introduced to provide a unified way to configure all these elements. It offers an "industry standard" (i.e. Cisco-like) CLI, as well as REST and gRPC endpoints for the current set of supported YANG models.
It supports a dedicated management VRF for connecting a device's management-only network interface. SSH is supported as one would expect, and it supports the standard SNMP MIBs any other Debian system would. Redis is the ultimate store of the full configuration for all elements, and the DB is written to /etc/sonic/config_db.json for persistence. Network state is synced to the Linux kernel, so standard Linux command line tools such as iproute2 can be used to view state. Using such tools to modify state is highly discouraged.
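Because the persisted configuration is a plain JSON file, it can be inspected read-only with ordinary tooling. The following is a minimal Python sketch of that idea, assuming the file is a JSON object mapping table names to keyed entries. The sample data, including the table, key and field names, is a hypothetical fragment made up for illustration and is not taken from the test devices.

```python
import json

# Hypothetical, heavily trimmed sample in the style of /etc/sonic/config_db.json.
# Table, key and field names are illustrative assumptions, not device output.
SAMPLE_CONFIG = """
{
  "DEVICE_METADATA": {"localhost": {"hostname": "lab-leaf1"}},
  "VLAN": {"Vlan100": {"vlanid": "100"}},
  "VLAN_MEMBER": {
    "Vlan100|Ethernet0": {"tagging_mode": "untagged"},
    "Vlan100|Ethernet4": {"tagging_mode": "tagged"}
  }
}
"""


def summarize_tables(config):
    """Return {table_name: number_of_entries} for a parsed config document."""
    return {table: len(entries) for table, entries in config.items()}


def vlan_members(config):
    """Return {vlan_name: [port, ...]} built from 'Vlan|Port' style keys."""
    members = {}
    for key in config.get("VLAN_MEMBER", {}):
        vlan, port = key.split("|", 1)
        members.setdefault(vlan, []).append(port)
    return members


config = json.loads(SAMPLE_CONFIG)
table_summary = summarize_tables(config)
members_by_vlan = vlan_members(config)
print(table_summary)
print(members_by_vlan)
```

The sketch only reads data, in keeping with the advice above that state should not be modified outside the normal configuration framework.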
SONiC's open source nature is in stark contrast with the more traditional network operating systems, which are provided with hardware and software support from the vendor. If you are running SONiC there is no TAC to contact if something does not work as expected or assistance is needed. Certainly if there is a hardware fault with a device you can go back to the HW vendor for replacement, but outside that users are on their own.
Unlike perhaps the situation with server/x86 based platforms, there is a fairly small install-base of SONiC users. This means community support is limited. Many SONiC users, like Microsoft, LinkedIn or Alibaba, operate at massive scale, contribute to SONiC themselves, and have staff internally who can provide support, bug fixes, diagnostics etc. For smaller enterprises, however, the lack of any support or sufficient internal resources to deal with problems is a big issue. Smaller outfits often also require more or different features than the web-scalers, which SONiC lacked in the early days.
This situation has made some smaller enterprises wary of moving to SONiC, forgoing the support they're accustomed to from their existing vendors. While Juniper support has not been stellar in recent years, WMF netops are broadly of the opinion that moving to a completely unsupported new platform would represent an unacceptable risk.
Dell Enterprise SONiC
Dell have been producing network devices for several years now. Anecdotally it is common to hear less than favorable opinions from network engineers about the Force-10 OS they ship with these. So perhaps it is not surprising that Dell have decided to offer SONiC as an OS option for some of their switches, and bridge the support and feature gap to make it more attractive to small and medium sized enterprises.
Dell Enterprise SONiC is the result. This initiative has seen them become one of the largest contributors to the SONiC project over the past few years. They offer two variants of the OS, standard and premium. Standard is their build of the upstream open-source project, built and released on a regular schedule. It may contain Dell contributions not yet merged into the upstream project, but does not contain any closed source elements. The premium variant offers richer analytics and features, such as Mirror on Drop and Inband Flow Analysis. It may also contain closed source features that won't be upstreamed to the open source release.
In terms of WMF requirements and longer-term direction the standard build covers our requirements. Each version is available in either a "cloud bundle" or "enterprise bundle". The enterprise bundle is required by WMF, supporting VXLAN/EVPN which is not available in the cloud offering.
Dell Network Switches
Dell Enterprise SONiC runs on only a small subset of the network devices Dell produce, namely those based on the Broadcom Trident 3 ASIC (similar to the Juniper QFX5120 series).
Initially spurred by a desire to explore more open networking platforms, and then by concerns about cost and lead-time for Juniper equipment, SRE Netops arranged with Dell to get some network devices on test. Specifically they delivered 2 of each of these models:
|Dell S5248F-ON||48xSFP28 + 6xQSFP28 Top-of-Rack / Leaf Switch||QFX5120-48Y|
|Dell S5232F-ON||32xQSFP28 Aggregation / Spine Switch||QFX5120-32C|
These were set up in codfw in a basic Spine/Leaf topology, with all links enabled for IP running BGP to exchange routes (underlay) and running iBGP over the top for the EVPN SAFI. More details here.
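The shape of such a lab can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the device names are hypothetical, the count of two spines and two leaves is inferred from the two units of each model delivered, and the overlay is modelled as a full iBGP mesh between all devices, which the text above does not specify (route reflectors on the spines would be an equally plausible design).

```python
from itertools import combinations, product

# Hypothetical device names; two of each role is an assumption.
SPINES = ["spine1", "spine2"]
LEAVES = ["leaf1", "leaf2"]


def underlay_links(spines, leaves):
    """Spine/leaf underlay: every leaf has one routed link to every spine.

    There are no leaf-to-leaf or spine-to-spine links.
    """
    return [(leaf, spine) for leaf, spine in product(leaves, spines)]


def overlay_full_mesh(devices):
    """iBGP sessions for the EVPN overlay, ASSUMING a full mesh of all devices."""
    return list(combinations(devices, 2))


links = underlay_links(SPINES, LEAVES)
sessions = overlay_full_mesh(SPINES + LEAVES)

for leaf, spine in links:
    print(f"underlay link: {leaf} <-> {spine}")
for a, b in sessions:
    print(f"overlay iBGP (EVPN): {a} <-> {b}")
```

Keeping the underlay and overlay as separate lists mirrors the separation described above: the underlay provides plain IP reachability over the physical links, while the overlay sessions carry the EVPN routes on top of it.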
EVPN was used to allow for stretched layer-2 segments to be created, similar to the design for the 2022 Eqiad expansion. This is the only supported mechanism for stretched layer-2 segments using SONiC, other than a basic Spanning-Tree/trunking configuration, which is not suitable for a variety of reasons. As with the Eqiad EVPN setup, running EVPN/VXLAN for L2 extension necessitates the use of overlay VRFs for L3 networking. A single VRF was created for the testing, to which all external L3 connections were terminated.
Various tests were carried out to validate that the data-plane functions required of our top-of-rack and aggregation switches worked as expected on the Dell platform. Tests were done to validate that the devices could support both our legacy "row-wide/L2" topology and the newer "per-rack/L3 ToR" designs implemented in Drmrs and the Eqiad Expansion.
The main elements that were tested are shown in the below table. All functions were validated locally, i.e. between ports on a single device, as well as across the "fabric" between ports connected to different switches.
|Transceiver Support||Test that the fs.com optic modules we commonly use are supported and work as expected.|
|L2 Segmentation||Ability to define Vlans and place ports into them to create virtual L2 segments.|
|L2 Switching||Ability for end hosts to exchange traffic directly at layer 2, when connected to the same Vlan. Correct learning of MAC addresses and distribution to remote devices / addition to remote device MAC forwarding tables. Correct forwarding of frames for broadcast, unknown or multicast destinations. Failover works as expected if links go down. Jumbo frame support.|
|L3 Routing||Basic routing in the overlay VRF, i.e. reachability to directly connected networks works ok, routes are correctly propagated to all devices, failover works as expected. All tests validated for both IPv4 and IPv6.|
|BGP||BGP peering as required to external elements (i.e. CR routers, end-hosts running BGP for load-balancing, anycast etc.). Correct propagation of externally learnt routes to all devices in EVPN fabric. BFD support in VRF.|
|Anycast Gateway||Use of a distributed anycast-gateway to provide a local IP first-hop on every edge device in a stretched L2 Vlan|
|Required Services||Validate various functions work as needed, DHCP Option 82 insertion, DHCP relay, IPv6 RA generation, SNMP, SSH, User account creation.|
Detailed test results and documentation can be viewed here.
In general all required functionality was supported and the tests were successful. A few elements didn't function exactly as we'd expect, but all are very minor, certainly not "show stoppers".
|DHCP Option 82||The system supports the insertion of DHCP Option 82 information into DHCP requests sourced by end hosts, and will include the source port and switch hostname, which is the info we require. The format is slightly different from the one the Juniper QFX sends, but we can change our DHCPd config on install hosts to accommodate. Medium-term we will likely move away from dependency on top-of-rack switches inserting this information for reimaging hosts, and work towards using DHCP Option 97 information which the hosts themselves include in requests.|
|IPv6 Router Advertisements||This functionality is supported in FRRouting, included in SONiC, however SONiC's data model for YANG or CLI configuration does not include it. So it can only be configured from the FRR "vtysh" shell, outside the normal configuration framework. Dell have committed to adding this to the regular command line / YANG model in an upcoming release.|
|IPv6 Link-Local used for ICMP Messages||The devices default to using an interface's link-local IPv6 address when sourcing ICMPv6 messages. This didn't cause any actual issue other than traceroutes showing the link-local IPs, which isn't much use.|
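To make the DHCP Option 82 row more concrete, the following minimal Python sketch decodes an Option 82 payload into its sub-options, following the code/length/value layout defined in RFC 3046 (sub-option 1 is the Agent Circuit ID, sub-option 2 the Agent Remote ID). Which sub-option carries the port and which carries the hostname on the Dell devices, and the exact string format, are not stated above, so the sample values here are assumptions for illustration only.

```python
CIRCUIT_ID = 1  # RFC 3046 Agent Circuit ID sub-option
REMOTE_ID = 2   # RFC 3046 Agent Remote ID sub-option


def parse_option82(payload: bytes) -> dict:
    """Split an Option 82 payload into {sub-option code: raw value bytes}."""
    subopts = {}
    i = 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated sub-option header")
        code, length = payload[i], payload[i + 1]
        value = payload[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option value")
        subopts[code] = value
        i += 2 + length
    return subopts


def build_subopt(code: int, text: str) -> bytes:
    """Encode one sub-option as code, length, ASCII value (test helper)."""
    data = text.encode("ascii")
    return bytes([code, len(data)]) + data


# Hypothetical sample: port name in the circuit ID, hostname in the remote ID.
sample = build_subopt(CIRCUIT_ID, "Ethernet12") + build_subopt(REMOTE_ID, "lab-leaf1")
parsed = parse_option82(sample)
circuit_id = parsed[CIRCUIT_ID].decode("ascii")
remote_id = parsed[REMOTE_ID].decode("ascii")
print(circuit_id, remote_id)
```

A decoder like this is one way to see exactly how the format differs between vendors before adjusting the DHCPd configuration on the install hosts.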
Overall Dell Enterprise SONiC worked very well, and we did not encounter any significant problems that would cause us to rule out the platform. In general the Linux base made it easy to navigate and get to grips with, and the CLI and configuration was straightforward and easy to use.
- System works well and is easy to get working.
- Dell are keen to make the product a success, and appear willing to provide hands-on assistance / support.
  - That could of course change if it does get more traction, and support is passed to more junior staff.
- Experience / confidence gained by using the Dell-supported version might allow us to eventually transition to using the purely open-source release, and eliminate software and support costs.
- Debian base makes it easy to integrate into our overall stack, opens up new possibilities.
- Dell are the only vendor who have given us short lead-times for datacenter switches in 2022.
- Broadcom Trident 3 is the same ASIC used in the current-gen QFX series, so we can expect similar performance.
- Familiar with Dell procurement, RMA process etc. with server hardware, could leverage that experience for switches too.
- OS lacks "frills" and "nerd knobs" to configure the same variety of features as Juniper.
  - This can be viewed as a positive in that the code base is smaller, and thus potentially has fewer bugs.
  - What we need right now is covered, so no massive problem.
- Relatively small installed base and newness of the OS may mean there are lots of unfound bugs.
  - JunOS by contrast is much older and has a massive install base.
  - Thus there is much more documentation and community resources available for JunOS.
- While Dell and others have done a good job creating a single management interface to the platform, the reality of the multiple underlying components operating independently is still not completely masked.
- There appear to be no barriers to automating the platform, but re-writing Homer to support Netconf/YANG style data models, and adding a new transport module to support their API, will take a considerable amount of effort.
- Lack of familiarity compared to JunOS
- No SNMP MIB support for device-specific / environmental data, so we cannot easily integrate with LibreNMS for that.
  - Can export to Prometheus relatively easily however.
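As a rough illustration of the Prometheus point in the list above, the sketch below renders environmental readings in the Prometheus text exposition format, which could for example be written to a file picked up by the Node Exporter's textfile collector. The metric name, label names and readings are hypothetical, and how the readings would actually be obtained on the switch is not covered here.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline as the text format requires."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) tuples.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical readings; a real exporter would collect these from the device.
readings = [({"sensor": "cpu"}, 41.5), ({"sensor": "asic"}, 55.0)]
output = render_gauge(
    "switch_temperature_celsius",  # hypothetical metric name
    "Hypothetical temperature reading per sensor.",
    readings,
)
print(output, end="")
```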
Given the different sales models used by different vendors for hardware, software and support, it is not always possible to do a direct comparison between vendors. The sheet below does give a break-down based on Juniper list prices, recent Juniper quotes, and quotes from Dell (including discount) for equipment.
The TL;DR is that the costs are quite close in most analyses. It may work out slightly cheaper in some cases to go with Dell, but either way there are not massive savings for the foundation in switching.
Overall, despite the good experience with Dell Enterprise SONiC, netops' preference is to stick with Juniper QFX series. Reasons include:
- Familiarity and real-world production experience gives higher confidence than any lab-testing could.
- JunOS overall seems a more mature and feature-rich system.
- Both platforms come in at a similar price.
- Having the same OS/config across edge routers and datacenter switches results in lower management overhead / SRE time.
- Homer is already built to support JunOS and we have templates for the platform already, adapting for SONiC is a big project.
- Confidence that Juniper will remain in the datacenter market and supply JunOS into the future.
  - While Dell seem committed to SONiC, whether they will continue to supply it is less certain and may depend on its success.
- Given the newness of Dell SONiC there is some fear that adopting it would make us beta-testers for them.
- DC-Ops estimate that we have enough existing capacity in Eqiad/Codfw to wait for delivery of Juniper QFX devices in 2023.
  - Short-term capacity concerns are not forcing us to select a vendor with faster lead times.
That said the exercise has given us reasonable confidence in Dell's offering, and should we need to move to another platform for any reason it can be considered a viable option. The overall recommendation to stick with Juniper is largely down to being conservative, and JunOS being a good platform which we are relatively happy with, as opposed to there being any glaring deficiencies in the Dell product.