
Network cheat sheet

From Wikitech-static
Revision as of 17:49, 3 March 2017 by imported>Elukey (→‎SSH access to network equipment)

This document is about working on the Juniper devices used in the Wikimedia Infrastructure.

SSH access to network equipment

Junipers take ssh keys. Huzzah! WMF routers and switches follow the Infrastructure_naming_conventions.

For example, the hostnames of eqiad core routers are cr1-eqiad.wikimedia.org and cr2-eqiad.wikimedia.org:

ssh cr1-eqiad.wikimedia.org

Access switches are named asw-${rownum}-${dc}.mgmt.${dc}.wmnet. Hence, row B switches in eqiad and codfw can be accessed as follows:

ssh asw-b-eqiad.mgmt.eqiad.wmnet
ssh asw-b-codfw.mgmt.codfw.wmnet
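
The naming patterns above can be captured in a couple of small helpers. This is only a sketch based on the two patterns shown on this page; the function names are made up for illustration, and other device types may follow different patterns.

```python
def core_router_hostname(index: int, dc: str) -> str:
    """Build a core router hostname, e.g. cr1-eqiad.wikimedia.org."""
    return f"cr{index}-{dc}.wikimedia.org"


def access_switch_hostname(row: str, dc: str) -> str:
    """Build an access switch management hostname for a given row and datacenter."""
    return f"asw-{row.lower()}-{dc}.mgmt.{dc}.wmnet"


if __name__ == "__main__":
    # Print the ssh commands for the row B switches in both datacenters.
    for dc in ("eqiad", "codfw"):
        print("ssh", access_switch_hostname("b", dc))
```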

Operational mode vs Configuration mode

Juniper devices can be used in two ways:

  • Operational mode (default when logging in):
{master}
elukey@re0.cr2-eqiad>
  • Configuration mode (to apply network configuration changes):
elukey@re0.cr2-eqiad> edit
Entering configuration mode

{master}[edit]
elukey@re0.cr2-eqiad#

Juniper 101 from an IRC session with Faidon:

11:53 <paravoid> there are two modes in the cli
11:53 <paravoid> the operational mode and the configuration mode
11:54 <paravoid> when you first login you enter the operational one
11:54 <paravoid> so "show interfaces ae3" shows you the state of the interface (link speed, physical link etc.)
11:54 <paravoid> and "show bgp summary" shows you the BGP summary etc.
11:55 <paravoid> and a few other commands not in the show hierarchy like "request routing-engine login" and whatnot
11:56 <paravoid> to view the config in the operational mode you do "show configuration ..."
11:56 <paravoid> if you want to edit the config, you enter the config mode
11:56 <paravoid> by typing "edit"
11:56 <paravoid> (and leave it with "quit" or "exit")
11:56 <paravoid> once you're there, "show" does something entirely different
11:57 <paravoid> it basically does what "show configuration" does in the operational mode
11:57 <paravoid> so "show interfaces ae3" will show you the config section for interface ae3
11:57 <paravoid> and "show" will do the same as "show configuration" in the operational mode
11:58 <paravoid> so the config now
11:59 <paravoid> there are two ways of viewing it (and editing it, but that's more complicated)
11:59 <paravoid> one is the hierarchical view, the other one is set
11:59 <paravoid> the hierarchical is the thing you see with "show"
11:59 <paravoid> system { domain-name ...; services { ssh { root-login allow; } } }
12:00 <paravoid> set is the thing you see with "| display set"
12:00 <paravoid> in the config mode, you can navigate the hierarchy with edit
12:00 <paravoid> so while you're in there (having typed "edit")
12:00 <paravoid> you can type
12:00 <paravoid> "edit system"
12:01 <paravoid> and then you're only under the system part of the hierarchy
12:01 <paravoid> so "show", no arguments, will show you only what's under sysetm
12:01 <paravoid> and "show services" will show everything that's under "system { services { ... } }"
12:02 <paravoid> similarly you can go deeper by typing "edit services", or if you're at the root "edit system services"
12:02 <paravoid> same with set
12:02 <paravoid> "set" takes a relative path
12:03 <paravoid> so in your case, you can do

# The chat was about editing the analytics-in4 input filter
# (rules for all the ports in the Analytics VLAN)

12:03 <paravoid> set firewall family inet filter analytics-in4 term mysql from destination-address 10.64.37.14/32
12:03 <paravoid> or
12:03 <paravoid> edit firewall family inet filter analytics-in4
12:03 <paravoid> set term mysql from destination-address 10.64.37.14/32
12:04 <paravoid> (or any other combination)
12:05 <paravoid> oh and you can navigate the other way out of the hierarchy by typing "up" or if you want to go to the root with "top"
12:05 <paravoid> "| display set" shows the set command from the root
12:05 <paravoid> so if you're in "edit firewall family inet filter analytics-in4" and type "show | display set" or "show term mysql | display set" you can paste the output as it is.
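
The point about "set" taking a relative path can be illustrated with a small string helper. This is only a sketch of the idea, not anything the Juniper CLI exposes: the function name is hypothetical, and it simply prepends the current "edit" location to a relative "set" statement to obtain the equivalent command from the root.

```python
def root_set_command(edit_path: str, relative_set: str) -> str:
    """Rewrite a 'set' typed under 'edit <edit_path>' as a root-level 'set'.

    edit_path is the hierarchy you navigated to (empty string for the root).
    """
    keyword, _, statement = relative_set.strip().partition(" ")
    if keyword != "set" or not statement.strip():
        raise ValueError("expected a command of the form 'set <statement>'")
    return " ".join(["set", *edit_path.split(), statement.strip()])


if __name__ == "__main__":
    # The two equivalent forms from the chat above.
    print(root_set_command(
        "firewall family inet filter analytics-in4",
        "set term mysql from destination-address 10.64.37.14/32",
    ))
```

Both forms from the chat end up as the same root-level statement, which is why "(or any other combination)" works.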

Rollbacks

It is always useful to know the basic rollback procedures when operating any service: mistakes happen, and being ready to revert a change is valuable know-how.

In configuration mode, the rollback ? command lists the most recent commits:

--- JUNOS 13.3R9.13 built 2016-03-01 07:03:30 UTC
{master}
elukey@re0.cr1-eqiad> edit
Entering configuration mode

{master}[edit]
elukey@re0.cr1-eqiad# rollback ?
Possible completions:
  <[Enter]>            Execute this command
  0                    2017-02-09 17:03:58 UTC by elukey via cli commit synchronize
  1                    2017-02-08 18:54:10 UTC by bblack via cli commit synchronize
  2                    2017-02-08 17:40:02 UTC by elukey via cli commit synchronize
  3                    2017-02-08 15:24:44 UTC by elukey via cli commit synchronize
  4                    2017-02-03 17:57:52 UTC by filippo via cli commit synchronize
  [..]

The most recent commit is numbered 0, the one right before it is 1, and so on.

Two of the most common rollback use cases are:

  • Changes that are not going to be committed because of some issue (for example, show | compare does not return the expected outcome). In this case, you'd want to discard all the uncommitted changes with rollback 0 (a sort of git reset).
  • Changes already committed that caused issues and need to be reverted. In this case the faulty commit should be the last one (number 0) and you'd want to roll back to the last known good state before it. Here rollback X (where X is the commit number) loads the configuration as it was at commit X, undoing all the differences between the last commit and commit X. Please note that you'll need to commit after the rollback!
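
The two use cases can be illustrated with a toy model. This is a deliberately simplified sketch and not how Junos is implemented: the ConfigHistory class, its method names and the "mysql-port" key are all invented for illustration. It only models a candidate configuration plus a numbered list of commits, where 0 is the most recent one.

```python
import copy


class ConfigHistory:
    """Toy model: a candidate configuration plus a numbered commit history."""

    def __init__(self, initial):
        self.commits = [copy.deepcopy(initial)]  # index 0 = most recent commit
        self.candidate = copy.deepcopy(initial)  # what you edit in config mode

    @property
    def active(self):
        """The running configuration is always the most recent commit."""
        return self.commits[0]

    def set(self, key, value):
        """Change the candidate only; nothing is applied yet."""
        self.candidate[key] = value

    def rollback(self, number=0):
        """Load commit <number> into the candidate; still needs a commit."""
        self.candidate = copy.deepcopy(self.commits[number])

    def commit(self):
        """Apply the candidate; it becomes commit 0, older commits shift by one."""
        self.commits.insert(0, copy.deepcopy(self.candidate))


if __name__ == "__main__":
    history = ConfigHistory({"mysql-port": 3306})

    # Use case 1: discard uncommitted changes.
    history.set("mysql-port", 8000)
    history.rollback(0)
    print(history.candidate)

    # Use case 2: revert a faulty commit, then commit again.
    history.set("mysql-port", 8000)
    history.commit()
    history.rollback(1)
    print(history.active)  # unchanged until the rollback is committed
    history.commit()
    print(history.active)
```

In this model, as on the router, rollback alone never changes the active configuration, which is why the commit after the rollback matters.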

Edit ACLs for Network ports

We apply ACLs on the router's network ports to filter inbound traffic via Juniper's input filters. Please note that in this case inbound traffic is from the port's point of view, not from the point of view of what is attached to it (like a switch or a host). So every input filter that we apply to a specific port (or set of ports) filters traffic arriving at the router's port.

Real use case scenario: allow every host in the Analytics VLAN to connect to dbproxy1010.eqiad.wmnet on port 3306.

# Random host belonging to the Analytics VLAN:
# analytics1034.eqiad.wmnet

# Find the port used to reach analytics1034.eqiad.wmnet
elukey@re0.cr1-eqiad> show route analytics1034.eqiad.wmnet
[..]
10.64.36.0/24      *[Direct/0] 22w5d 22:31:05
                    > via ae3.1022
                    
# Check ACLs applied to the port
elukey@re0.cr1-eqiad> show configuration interfaces ae3.1022
description "Subnet analytics1-c-eqiad";
vlan-id 1022;
family inet {
    filter {
        input analytics-in4;
    }
[..]

# Check the input filter
show configuration firewall family inet filter analytics-in4
[..]
term mysql {
    from {
        destination-address {
            10.X.Y.Z/32;
            [..]
        }
        protocol tcp;
        destination-port 3306;
    }
    then accept;
}

# Add dbproxy1010's IP to the mysql term list
# This must be done in "edit" mode
elukey@re0.cr1-eqiad> edit
Entering configuration mode

{master}[edit]
elukey@re0.cr1-eqiad# set firewall family inet filter analytics-in4 term mysql from destination-address 10.64.37.14/32

# Then commit with a meaningful message and quit
commit comment "Added dbproxy1010 to the analytics-in4 input filter"

# Do the same with dbproxy1010's port if necessary, reapplying this procedure.
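
To reason about what the mysql term will accept before committing, it can help to model the term in a few lines of Python. This is a sketch under stated assumptions, not the router's implementation: the Term class and its field names are invented, and it assumes that the values listed inside one match condition are alternatives while the different conditions (address, protocol, port) must all match.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Term:
    """Simplified stand-in for one 'term' of an input filter."""
    name: str
    destination_addresses: list
    protocol: str
    destination_ports: list

    def matches(self, dst_ip: str, protocol: str, dst_port: int) -> bool:
        """True if a packet satisfies every 'from' condition of the term."""
        address_ok = any(
            ipaddress.ip_address(dst_ip) in ipaddress.ip_network(prefix)
            for prefix in self.destination_addresses
        )
        return (
            address_ok
            and protocol == self.protocol
            and dst_port in self.destination_ports
        )


if __name__ == "__main__":
    # The term after adding dbproxy1010 (10.64.37.14/32, from the example above).
    mysql = Term("mysql", ["10.64.37.14/32"], "tcp", [3306])
    print(mysql.matches("10.64.37.14", "tcp", 3306))
    print(mysql.matches("10.64.37.14", "tcp", 8000))
```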

Another similar use case is removing a "destination-port" from a term in an input filter:

# Check the input filter
show configuration firewall family inet filter analytics-in4
[..]
term mysql {
    from {
        destination-address {
            10.X.Y.Z/32;
            [..]
        }
        protocol tcp;
        destination-port [ 3306 8000 ];
    }
    then accept;
}

# Remove port 8000 from destination-port
delete firewall family inet filter analytics-in4 term mysql from destination-port 8000

If you want to add a comment to an IP:

elukey@re0.cr1-eqiad# edit firewall family inet filter analytics-in4 term aqs from destination-address
{master}[edit firewall family inet filter analytics-in4 term aqs from destination-address]
elukey@re0.cr1-eqiad# show
10.64.0.107/32;
10.64.32.138/32;
10.64.48.146/32;
elukey@re0.cr1-eqiad# annotate 10.64.0.107/32 aqs100{4,5,6}
elukey@re0.cr1-eqiad# show
/* aqs100{4,5,6} */
10.64.0.107/32;
10.64.32.138/32;
10.64.48.146/32;

Matching hosts with rack numbers

To find out which cache hosts are connected on codfw's row c:

ema@asw-c-codfw> show interfaces descriptions | match cp 
xe-2/0/3        up    up   cp2013
xe-2/0/4        up    up   cp2014
xe-2/0/5        up    up   cp2015
xe-7/0/3        up    up   cp2016
xe-7/0/4        up    up   cp2017
xe-7/0/5        up    up   cp2018

Interface names, reported in the first column, follow Juniper's interface naming convention. The first part of the interface name, xe in the examples above, is the media type. xe stands for 10 Gigabit Ethernet interface; other options would have been ge for Gigabit Ethernet and et for 40 Gigabit Ethernet. The second part is the FPC, which allows us to find out the specific rack number to which the host is connected. The first three hosts (cp2013, cp2014 and cp2015) are on c2 (xe-2), while cp2016, cp2017 and cp2018 are on c7 (xe-7). The last number represents the port number.
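
The mapping described above can be sketched in Python. The helper names are hypothetical, the middle number of the interface name is treated as the PIC (Juniper's type-fpc/pic/port convention), and the "rack = row letter + FPC number" rule is only the one implied by the example above, so it may not hold for every switch stack.

```python
import re
from collections import defaultdict

# media type, FPC, PIC and port, e.g. xe-2/0/3
INTERFACE_RE = re.compile(r"^(?P<media>[a-z]+)-(?P<fpc>\d+)/(?P<pic>\d+)/(?P<port>\d+)$")

SAMPLE = """\
xe-2/0/3        up    up   cp2013
xe-2/0/4        up    up   cp2014
xe-2/0/5        up    up   cp2015
xe-7/0/3        up    up   cp2016
xe-7/0/4        up    up   cp2017
xe-7/0/5        up    up   cp2018
"""


def parse_interface(name):
    """Split an interface name into (media type, fpc, pic, port)."""
    match = INTERFACE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised interface name: {name!r}")
    return match["media"], int(match["fpc"]), int(match["pic"]), int(match["port"])


def hosts_by_rack(output, row):
    """Group the descriptions of 'show interfaces descriptions' by rack."""
    racks = defaultdict(list)
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not an "interface admin link description" line
        try:
            _, fpc, _, _ = parse_interface(fields[0])
        except ValueError:
            continue  # e.g. aggregated interfaces such as ae3
        racks[f"{row}{fpc}"].append(fields[3])
    return dict(racks)


if __name__ == "__main__":
    for rack, hosts in hosts_by_rack(SAMPLE, "c").items():
        print(rack, ", ".join(hosts))
```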

Racktables can also be used to check the mapping between racks and hostnames.

Check reboot or downtime

Sometimes a lot of hosts in the same rack lose connectivity at the same time; this might be due to a switch failure or reboot.

T159464 is an example: all the Rack A1 hosts were marked as DOWN in Icinga, so we checked the logs on asw-a-codfw:

elukey@asw-a-codfw> show system uptime
fpc1:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:56 UTC
System booted: 2017-03-02 16:55:13 UTC (00:44:43 ago)
Last configured: 2017-03-02 16:58:04 UTC (00:41:52 ago) by root
 5:39PM  up 45 mins, 0 users, load averages: 0.06, 0.06, 0.06

fpc2:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:57 UTC
System booted: 2015-08-04 15:05:30 UTC (82w2d 02:34 ago)
[..]

asw-a1-codfw (fpc1) seems to have rebooted around 16:55, so we might want to check system logs:

elukey@asw-a-codfw> show log messages | match 16:5[45]

{master:2}

Logs can rotate and might not be displayed in the main messages logfile:

show log messages.0.gz | match 16:5[45]

Mar  2 16:54:03  asw-a-codfw vccpd[1635]: VCCPD_PROTOCOL_ADJDOWN: Lost adjacency to dc38.e1d4.1b00 on vcp-255/0/48.32768,
Mar  2 16:54:03  asw-a-codfw vccpd[1635]: interface vcp-255/0/48 went down
Mar  2 16:54:03  asw-a-codfw fpc3 [EX-BCM PIC] ex_bcm_linkscan_handler: Link 54 DOWN
Mar  2 16:54:03  asw-a-codfw vccpd[1635]: Member 2, interface vcp-255/0/48.32768 went down
[..]
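
When a stack has many members, spotting the one that rebooted in the show system uptime output can be scripted. The following is a rough sketch that assumes the output format shown above (a "fpcN:" header followed by a "System booted:" line in UTC); the function names and the one hour window are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

MEMBER_RE = re.compile(r"^(fpc\d+):$")
BOOTED_RE = re.compile(r"^System booted: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC")

SAMPLE = """\
fpc1:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:56 UTC
System booted: 2017-03-02 16:55:13 UTC (00:44:43 ago)
Last configured: 2017-03-02 16:58:04 UTC (00:41:52 ago) by root

fpc2:
--------------------------------------------------------------------------
Current time: 2017-03-02 17:39:57 UTC
System booted: 2015-08-04 15:05:30 UTC (82w2d 02:34 ago)
"""


def boot_times(output):
    """Map each member (fpcN) to the time it was booted."""
    boots = {}
    member = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        header = MEMBER_RE.match(line)
        if header:
            member = header.group(1)
            continue
        booted = BOOTED_RE.match(line)
        if booted and member is not None:
            boots[member] = datetime.strptime(booted.group(1), "%Y-%m-%d %H:%M:%S")
    return boots


def recently_booted(boots, now, window=timedelta(hours=1)):
    """Return the members that booted within the given window before 'now'."""
    return sorted(member for member, booted in boots.items() if now - booted <= window)


if __name__ == "__main__":
    now = datetime(2017, 3, 2, 17, 39, 56)  # "Current time" from the sample
    print(recently_booted(boot_times(SAMPLE), now))
```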

Check what host owns a specific IPv6 address

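This is necessary if the PTR DNS record is not present. You can combine the IPv6 neighbor (NDP) table with the ARP table: first find the MAC address associated with the IPv6 address, then look up the IPv4 address that has the same MAC address: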

elukey@re0.cr1-eqiad> show ipv6 neighbors
[..search the IP that you want..]
2620:0:861:103:92b1:1cff:fe28:d448
                             90:b1:1c:28:d4:48  stale       254 no  no      ae3.1019

elukey@re0.cr1-eqiad> show arp no-resolve | match 90:b1:1c:28:d4:48
90:b1:1c:28:d4:48 10.64.32.64     ae3.1019                 none
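
In this example the last 64 bits of the IPv6 address appear to be derived from the MAC address itself (modified EUI-64, as used by SLAAC): the seventh bit of the first octet is flipped and ff:fe is inserted in the middle. The sketch below shows that conversion in both directions. The function names are made up, and it only helps for addresses generated this way; statically assigned or privacy addresses still need the neighbor table lookup.

```python
import ipaddress
from typing import Optional


def eui64_interface_id(mac: str) -> str:
    """Return the modified EUI-64 interface identifier for a MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    groups = [(eui64[i] << 8) | eui64[i + 1] for i in range(0, 8, 2)]
    return ":".join(f"{group:x}" for group in groups)


def mac_from_eui64_address(address: str) -> Optional[str]:
    """Recover the MAC from an EUI-64 based IPv6 address, or None if it is not one."""
    interface_id = ipaddress.IPv6Address(address).packed[-8:]
    if interface_id[3:5] != b"\xff\xfe":
        return None
    octets = [interface_id[0] ^ 0x02, *interface_id[1:3], *interface_id[5:8]]
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    print(eui64_interface_id("90:b1:1c:28:d4:48"))
    print(mac_from_eui64_address("2620:0:861:103:92b1:1cff:fe28:d448"))
```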

Operational mode commands

show ethernet-switching table           # shows mac addresses
show ethernet-switching table interface # shows mac addresses for that interface
show ethernet-switching table vlan      # shows mac addresses for vlan
show interfaces descriptions            # shows the description of each interface, for example:
    Interface       Admin Link Description
    ge-1/0/0        up    up   ms1001
show interfaces terse                   # shows interfaces with IPs in a very short format
show interfaces ge-1/0/0 (extensive)    # shows the interface in more detail
monitor interface xe-1/1/0              # shows interface in a real-time updating mode (errors, bits, etc)
show log messages | last 20             # shows the last 20 lines of the log

Config commands

Junipers apply configuration changes only after you commit them, so you can make your changes and then double check them before they take effect.

  • configure - puts you in config mode
  • exit - takes you up one level in (or out of) config mode
  • top - takes you to the top level of config mode
  • show - shows you configuration below that level