Network cheat sheet
This document is about working on the Juniper devices used in the Wikimedia Infrastructure.
SSH access to network equipment
Junipers take SSH keys. Huzzah! WMF routers and switches follow the Infrastructure_naming_conventions.
For example, the hostnames of eqiad core routers are cr1-eqiad.wikimedia.org and cr2-eqiad.wikimedia.org:
ssh cr1-eqiad.wikimedia.org
Access switches are named asw-${rownum}-${dc}.mgmt.${dc}.wmnet. Hence, row B switches in eqiad and codfw can be accessed as follows:
ssh asw-b-eqiad.mgmt.eqiad.wmnet
ssh asw-b-codfw.mgmt.codfw.wmnet
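The naming pattern above can be expressed as a small helper. This is only a sketch: the function names are hypothetical, and the patterns are taken solely from the examples given on this page, so other device types or sites may not fit them.

```python
def asw_mgmt_hostname(row: str, dc: str) -> str:
    """Management hostname of an access switch, following the
    asw-${rownum}-${dc}.mgmt.${dc}.wmnet pattern described above."""
    row, dc = row.lower(), dc.lower()
    return f"asw-{row}-{dc}.mgmt.{dc}.wmnet"


def core_router_hostname(index: int, dc: str) -> str:
    """Hostname of a core router, e.g. cr1-eqiad.wikimedia.org."""
    return f"cr{index}-{dc.lower()}.wikimedia.org"


if __name__ == "__main__":
    # Print the ssh commands for the row B switches in both sites.
    for dc in ("eqiad", "codfw"):
        print("ssh", asw_mgmt_hostname("b", dc))
```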
Operational mode vs Configuration mode
Juniper devices can be used in two ways:
- Operational mode (default when logging in):
{master}
elukey@re0.cr2-eqiad>
- Configuration mode (to apply network configuration changes):
elukey@re0.cr2-eqiad> edit
Entering configuration mode

{master}[edit]
elukey@re0.cr2-eqiad#
Juniper 101 from an IRC session with Faidon:
11:53 <paravoid> there are two modes in the cli
11:53 <paravoid> the operational mode and the configuration mode
11:54 <paravoid> when you first login you enter the operational one
11:54 <paravoid> so "show interfaces ae3" shows you the state of the interface (link speed, physical link etc.)
11:54 <paravoid> and "show bgp summary" shows you the BGP summary etc.
11:55 <paravoid> and a few other commands not in the show hierarchy like "request routing-engine login" and whatnot
11:56 <paravoid> to view the config in the operational mode you do "show configuration ..."
11:56 <paravoid> if you want to edit the config, you enter the config mode
11:56 <paravoid> by typing "edit"
11:56 <paravoid> (and leave it with "quit" or "exit")
11:56 <paravoid> once you're there, "show" does something entirely different
11:57 <paravoid> it basically does what "show configuration" does in the operational mode
11:57 <paravoid> so "show interfaces ae3" will show you the config section for interface ae3
11:57 <paravoid> and "show" will do the same as "show configuration" in the operational mode
11:58 <paravoid> so the config now
11:59 <paravoid> there are two ways of viewing it (and editing it, but that's more complicated)
11:59 <paravoid> one is the hierarchical view, the other one is set
11:59 <paravoid> the hierarchical is the thing you see with "show"
11:59 <paravoid> system { domain-name ...; services { ssh { root-login allow; } } }
12:00 <paravoid> set is the thing you see with "| display set"
12:00 <paravoid> in the config mode, you can navigate the hierarchy with edit
12:00 <paravoid> so while you're in there (having typed "edit")
12:00 <paravoid> you can type
12:00 <paravoid> "edit system"
12:01 <paravoid> and then you're only under the system part of the hierarchy
12:01 <paravoid> so "show", no arguments, will show you only what's under sysetm
12:01 <paravoid> and "show services" will show everything that's under "system { services { ... } }"
12:02 <paravoid> similarly you can go deeper by typing "edit services", or if you're at the root "edit system services"
12:02 <paravoid> same with set
12:02 <paravoid> "set" takes a relative path
12:03 <paravoid> so in your case, you can do
(The chat was about editing the analytics-in4 input filter, which holds the rules for all the ports in the Analytics VLAN.)
12:03 <paravoid> set firewall family inet filter analytics-in4 term mysql from destination-address 10.64.37.14/32
12:03 <paravoid> or
12:03 <paravoid> edit firewall family inet filter analytics-in4
12:03 <paravoid> set term mysql from destination-address 10.64.37.14/32
12:04 <paravoid> (or any other combination)
12:05 <paravoid> oh and you can navigate the other way out of the hierarchy by typing "up" or if you want to go to the root with "top"
12:05 <paravoid> "| display set" shows the set command from the root
12:05 <paravoid> so if you're in "edit firewall family inet filter analytics-in4" and type "show | display set" or "show term mysql | display set" you can paste the output as it is.
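To make the relationship between the hierarchical view and the set view more concrete, here is a minimal Python sketch. It models a configuration as nested dictionaries (an assumption made purely for illustration, not how Junos stores it) and flattens it into set commands. The function name to_set_commands is hypothetical.

```python
from typing import Iterator, Tuple


def to_set_commands(tree: dict, path: Tuple[str, ...] = ()) -> Iterator[str]:
    """Flatten a nested-dict model of the hierarchy into "set" lines.

    `path` plays the role of the current "edit" level: an empty path
    gives commands from the root, a non-empty one prefixes them.
    """
    for key, value in tree.items():
        if isinstance(value, dict):
            # Descend one level, like "edit <key>" would.
            yield from to_set_commands(value, path + (key,))
        else:
            yield " ".join(("set",) + path + (key, value))


# Toy model of the examples from the chat (the elided domain-name is left out).
config = {
    "system": {"services": {"ssh": {"root-login": "allow"}}},
    "firewall": {
        "family inet": {
            "filter analytics-in4": {
                "term mysql": {"from": {"destination-address": "10.64.37.14/32"}}
            }
        }
    },
}

if __name__ == "__main__":
    # From the root, similar in spirit to "show | display set".
    for line in to_set_commands(config):
        print(line)
    # Relative to "edit system": the path is shorter.
    for line in to_set_commands(config["system"]):
        print(line)
```

Passing a subtree mirrors the point that set takes a path relative to the current edit level, while the full tree gives commands that can be pasted at the root.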
Rollbacks
Knowing the basic rollback procedures is useful when operating any service: mistakes happen, and you should be ready to revert a change.
In configuration mode, the rollback ? command lists the most recent commits:
--- JUNOS 13.3R9.13 built 2016-03-01 07:03:30 UTC
{master}
elukey@re0.cr1-eqiad> edit
Entering configuration mode

{master}[edit]
elukey@re0.cr1-eqiad# rollback ?
Possible completions:
  <[Enter]>   Execute this command
  0           2017-02-09 17:03:58 UTC by elukey via cli commit synchronize
  1           2017-02-08 18:54:10 UTC by bblack via cli commit synchronize
  2           2017-02-08 17:40:02 UTC by elukey via cli commit synchronize
  3           2017-02-08 15:24:44 UTC by elukey via cli commit synchronize
  4           2017-02-03 17:57:52 UTC by filippo via cli commit synchronize
  [..]
The most recent commit is number 0, the one right before it is number 1, and so on.
Two of the most common rollback use cases are:
- Changes that are not going to be committed because of some issue (for example, show | compare does not return the expected outcome). In this case, use rollback 0 to discard all the uncommitted changes (a sort of git reset).
- Changes already committed that caused issues and need to be reverted. The faulty commit should be the last one (number 0), and you want to return to the last known good state before it. rollback X (where X is the commit number) loads the configuration of commit X, undoing all the differences between it and the last commit. Please note that you'll need to commit after the rollback!
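The two use cases above can be pictured with a toy model. The sketch below is not Junos code: ConfigHistory is a hypothetical class that only mimics the numbering (0 is the active configuration) and the fact that a rollback changes the candidate configuration until you commit.

```python
class ConfigHistory:
    """Toy model of the commit history: index 0 is the active config."""

    def __init__(self, active: str) -> None:
        self.commits = [active]   # commits[0] is the most recent commit
        self.candidate = active   # what you are editing in config mode

    def edit(self, new_config: str) -> None:
        """Change the candidate only; nothing is applied yet."""
        self.candidate = new_config

    def has_uncommitted_changes(self) -> bool:
        """Rough equivalent of a non-empty "show | compare"."""
        return self.candidate != self.commits[0]

    def rollback(self, number: int = 0) -> None:
        """Load commit `number` into the candidate (still uncommitted)."""
        self.candidate = self.commits[number]

    def commit(self) -> None:
        """Make the candidate the new commit 0; older ones shift by one."""
        self.commits.insert(0, self.candidate)


if __name__ == "__main__":
    history = ConfigHistory("good")

    # Use case 1: discard changes that were never committed.
    history.edit("typo")
    history.rollback(0)
    print(history.has_uncommitted_changes())

    # Use case 2: revert a faulty commit; a new commit is required.
    history.edit("faulty")
    history.commit()
    history.rollback(1)
    history.commit()
    print(history.commits)
```

Note that in this model the revert itself becomes a new commit 0, which matches the reminder above that a rollback must be followed by a commit.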
Edit ACLs for Network ports
We apply ACLs on the router's network ports to filter inbound traffic via Juniper's input filters. Please note that inbound here is from the port's point of view, not from that of whatever is attached to it (like a switch or a host). So every input filter applied to a specific port (or set of ports) filters the traffic arriving at that router port.
Real use case scenario: allow every host in the Analytics VLAN to connect to dbproxy1010.eqiad.wmnet on port 3306.
# Random host belonging to the Analytics VLAN:
# analytics1034.eqiad.wmnet

# Find the port used to reach analytics1034.eqiad.wmnet
elukey@re0.cr1-eqiad> show route analytics1034.eqiad.wmnet
[..]
10.64.36.0/24      *[Direct/0] 22w5d 22:31:05
                    > via ae3.1022

# Check ACLs applied to the port
elukey@re0.cr1-eqiad> show configuration interfaces ae3.1022
description "Subnet analytics1-c-eqiad";
vlan-id 1022;
family inet {
    filter {
        input analytics-in4;
    }
[..]

# Check the input filter
show configuration firewall family inet filter analytics-in4
[..]
term mysql {
    from {
        destination-address {
            10.X.Y.Z/32;
            [..]
        }
        protocol tcp;
        destination-port 3306;
    }
    then accept;
}

# Add dbproxy1010's IP to the mysql term list
# This must be done in "edit" mode
elukey@re0.cr1-eqiad> edit
Entering configuration mode

{master}[edit]
elukey@re0.cr1-eqiad# set firewall family inet filter analytics-in4 term mysql from destination-address 10.64.37.14/32

# Then commit with a meaningful message and quit
commit comment "Added dbproxy1010 to the analytics-in4 input filter"

# Do the same with dbproxy1010's port if necessary, re-applying this procedure.
Another similar use case is removing a "destination-port" from a term in an input filter:
# Check the input filter
show configuration firewall family inet filter analytics-in4
[..]
term mysql {
    from {
        destination-address {
            10.X.Y.Z/32;
            [..]
        }
        protocol tcp;
        destination-port [ 3306 8000 ];
    }
    then accept;
}

# Remove port 8000 from destination-port
delete firewall family inet filter analytics-in4 term mysql from destination-port 8000

If you want to add a comment to an IP:

elukey@re0.cr1-eqiad# edit firewall family inet filter analytics-in4 term aqs from destination-address

{master}[edit firewall family inet filter analytics-in4 term aqs from destination-address]
elukey@re0.cr1-eqiad# show
10.64.0.107/32;
10.64.32.138/32;
10.64.48.146/32;

elukey@re0.cr1-eqiad# annotate 10.64.0.107/32 aqs100{4,5,6}

elukey@re0.cr1-eqiad# show
/* aqs100{4,5,6} */
10.64.0.107/32;
10.64.32.138/32;
10.64.48.146/32;
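The way a single term such as mysql decides whether a packet matches can be sketched in Python. This is only a simplified model under stated assumptions: the Term class is hypothetical, it covers just the three match conditions used in the examples above, and it ignores everything else a real filter does (evaluation of the following terms, the default action when no term matches). Within one condition the listed values are alternatives, while the different conditions must all hold.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Term:
    """Simplified model of one firewall filter term (hypothetical class)."""
    name: str
    destination_addresses: List[str]  # prefixes, e.g. "10.64.37.14/32"
    protocol: str                     # e.g. "tcp"
    destination_ports: Set[int]       # e.g. {3306}

    def matches(self, dst_ip: str, protocol: str, dst_port: int) -> bool:
        ip = ipaddress.ip_address(dst_ip)
        # Any listed prefix may match (values in one condition are ORed)...
        in_prefixes = any(
            ip in ipaddress.ip_network(prefix)
            for prefix in self.destination_addresses
        )
        # ...but every condition must hold (conditions are ANDed).
        return (
            in_prefixes
            and protocol == self.protocol
            and dst_port in self.destination_ports
        )


# The mysql term after adding dbproxy1010's IP (other entries elided).
mysql = Term("mysql", ["10.64.37.14/32"], "tcp", {3306})

if __name__ == "__main__":
    print(mysql.matches("10.64.37.14", "tcp", 3306))
    print(mysql.matches("10.64.37.14", "tcp", 8000))
```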
Matching hosts with rack numbers
To find out which cache hosts are connected on codfw's row c:
ema@asw-c-codfw> show interfaces descriptions | match cp
xe-2/0/3        up    up   cp2013
xe-2/0/4        up    up   cp2014
xe-2/0/5        up    up   cp2015
xe-7/0/3        up    up   cp2016
xe-7/0/4        up    up   cp2017
xe-7/0/5        up    up   cp2018
Interface names, reported in the first column, follow Juniper's interface naming convention. The first part of the interface name, xe in the examples above, is the media type: xe stands for 10 Gigabit Ethernet, while other options would have been ge for Gigabit Ethernet and et for 40 Gigabit Ethernet. The second part is the FPC, which tells us the specific rack to which the host is connected: the first three hosts (cp2013, cp2014 and cp2015) are on c2 (xe-2), while cp2016, cp2017 and cp2018 are on c7 (xe-7). The last number is the port number.
Racktables can also be used to check the mapping between racks and hostnames.
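The mapping from interface name to rack described above can be sketched in Python. The function names are hypothetical, and the sketch assumes, as in the example, that the FPC number equals the rack number within the row. The middle number of the interface name is the PIC in Juniper's type-fpc/pic/port convention.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class Interface(NamedTuple):
    media: str  # e.g. "xe" for 10 Gigabit Ethernet
    fpc: int    # switch member, used here as the rack number
    pic: int
    port: int


_IFACE_RE = re.compile(r"^([a-z]+)-(\d+)/(\d+)/(\d+)$")


def parse_interface(name: str) -> Optional[Interface]:
    """Split e.g. "xe-2/0/3" into its parts; None if it doesn't fit."""
    match = _IFACE_RE.match(name)
    if match is None:
        return None
    media, fpc, pic, port = match.groups()
    return Interface(media, int(fpc), int(pic), int(port))


def hosts_by_rack(output: str, row: str) -> Dict[str, List[str]]:
    """Group the descriptions (hostnames) by rack, e.g. "c2"."""
    racks: Dict[str, List[str]] = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        iface = parse_interface(fields[0])
        if iface is None:
            continue  # prompt line, header, or anything unexpected
        racks.setdefault(f"{row}{iface.fpc}", []).append(fields[3])
    return racks


SAMPLE = """\
xe-2/0/3 up up cp2013
xe-2/0/4 up up cp2014
xe-2/0/5 up up cp2015
xe-7/0/3 up up cp2016
xe-7/0/4 up up cp2017
xe-7/0/5 up up cp2018
"""

if __name__ == "__main__":
    print(hosts_by_rack(SAMPLE, "c"))
```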
Check reboot or downtime
Sometimes many hosts in the same rack lose connectivity at the same time; this might be due to a switch failure or reboot.
T159464 is an example: all the Rack A1 hosts were marked as DOWN in Icinga, so we checked the logs on asw-a-codfw:
elukey@asw-a-codfw> show system uptime
fpc1:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:56 UTC
System booted: 2017-03-02 16:55:13 UTC (00:44:43 ago)
Last configured: 2017-03-02 16:58:04 UTC (00:41:52 ago) by root
5:39PM up 45 mins, 0 users, load averages: 0.06, 0.06, 0.06
fpc2:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:57 UTC
System booted: 2015-08-04 15:05:30 UTC (82w2d 02:34 ago)
[..]
asw-a1-codfw (fpc1) seems to have rebooted around 16:55, so we might want to check system logs:
elukey@asw-a-codfw> show log messages | match 16:5[45]
{master:2}
Logs are rotated, so older entries might no longer be in the main messages logfile:
show log messages.0.gz | match 16:5[45]
Mar 2 16:54:03 asw-a-codfw vccpd[1635]: VCCPD_PROTOCOL_ADJDOWN: Lost adjacency to dc38.e1d4.1b00 on vcp-255/0/48.32768,
Mar 2 16:54:03 asw-a-codfw vccpd[1635]: interface vcp-255/0/48 went down
Mar 2 16:54:03 asw-a-codfw fpc3 [EX-BCM PIC] ex_bcm_linkscan_handler: Link 54 DOWN
Mar 2 16:54:03 asw-a-codfw vccpd[1635]: Member 2, interface vcp-255/0/48.32768 went down
[..]
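The uptime comparison done by eye above can also be scripted. The following Python sketch parses text shaped like the show system uptime output shown earlier and reports the members that booted recently. The function name and the one-hour threshold are assumptions chosen for illustration, and the parsing relies only on the line formats visible in the example.

```python
import re
from datetime import datetime, timedelta
from typing import Dict

_MEMBER_RE = re.compile(r"^(fpc\d+):$")
_TIME_RE = re.compile(
    r"^(Current time|System booted):\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC"
)


def recently_booted(
    output: str, max_uptime: timedelta = timedelta(hours=1)
) -> Dict[str, timedelta]:
    """Return {member: uptime} for members up for at most max_uptime."""
    result: Dict[str, timedelta] = {}
    member = None
    times: Dict[str, datetime] = {}
    for raw in output.splitlines():
        line = raw.strip()
        header = _MEMBER_RE.match(line)
        if header:
            member, times = header.group(1), {}
            continue
        stamp = _TIME_RE.match(line)
        if stamp and member:
            times[stamp.group(1)] = datetime.strptime(
                stamp.group(2), "%Y-%m-%d %H:%M:%S"
            )
            if len(times) == 2:
                uptime = times["Current time"] - times["System booted"]
                if uptime <= max_uptime:
                    result[member] = uptime
    return result


SAMPLE = """\
fpc1:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:56 UTC
System booted: 2017-03-02 16:55:13 UTC (00:44:43 ago)
Last configured: 2017-03-02 16:58:04 UTC (00:41:52 ago) by root
fpc2:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:57 UTC
System booted: 2015-08-04 15:05:30 UTC (82w2d 02:34 ago)
"""

if __name__ == "__main__":
    for name, uptime in recently_booted(SAMPLE).items():
        print(name, "rebooted", uptime, "ago")
```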
Check what host owns a specific IPv6 address
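This is useful when the address has no PTR DNS record. You can combine the NDP and ARP tables: look up the MAC address of the IPv6 address in the neighbor table, then search for that MAC address in the ARP table to find the corresponding IPv4 address: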
elukey@re0.cr1-eqiad> show ipv6 neighbors
[..search the IP that you want..]
2620:0:861:103:92b1:1cff:fe28:d448
90:b1:1c:28:d4:48 stale 254 no no ae3.1019
elukey@re0.cr1-eqiad> show arp no-resolve | match 90:b1:1c:28:d4:48
90:b1:1c:28:d4:48 10.64.32.64 ae3.1019 none
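As a side note, in the example above the MAC address is also visible inside the IPv6 address itself, because that address was built with the modified EUI-64 scheme (ff:fe inserted in the middle of the MAC and the universal/local bit flipped). The Python sketch below reverses that transformation. The function name is hypothetical, and this only works for addresses generated this way; for any other address the neighbor table lookup shown above is still needed.

```python
import ipaddress
from typing import Optional


def mac_from_eui64(address: str) -> Optional[str]:
    """Recover the MAC from a modified EUI-64 IPv6 address, if possible."""
    # The interface identifier is the last 8 bytes of the address.
    iid = ipaddress.IPv6Address(address).packed[8:]
    # Modified EUI-64 identifiers carry ff:fe in the middle.
    if iid[3:5] != b"\xff\xfe":
        return None
    # Flip the universal/local bit back and drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{byte:02x}" for byte in mac)


if __name__ == "__main__":
    print(mac_from_eui64("2620:0:861:103:92b1:1cff:fe28:d448"))
```

For the address used in this section the result is the same MAC address reported by show ipv6 neighbors.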
Operational mode commands
show ethernet-switching table # shows mac addresses
show ethernet-switching table interface # shows mac addresses for that interface
show ethernet-switching table vlan # shows mac addresses for vlan
show interfaces descriptions
Interface       Admin Link Description
ge-1/0/0        up    up   ms1001
show interfaces terse # shows interfaces with their IPs in a very short format
show interfaces ge-1/0/0 (extensive) # shows the interface in more detail
monitor interface xe-1/1/0 # shows the interface in a real-time updating mode (errors, bits, etc.)
show log messages | last 20 # shows the last 20 lines of the messages log
Config commands
Junipers apply configuration changes only after you commit them, so you can make your changes and then double-check them before committing.
- configure - puts you in config mode
- exit - takes you up one level of (or out of) config mode
- top - takes you to the top level of config mode
- show - shows you configuration below that level