Difference between revisions of "Netflow"
Revision as of 20:20, 13 September 2019
A high-level description is available at https://en.wikipedia.org/wiki/NetFlow
Goal
Gather network-level (Layer 4) traffic flow metadata to assist with traffic engineering and DoS mitigation.
How does it work?
On the routers:
- 1 out of every 1000 flows crossing the routers' external interfaces (both inbound and outbound) has its metadata sent to a configured collector once the flow timeout (here 10s) is reached
- Example metadata: source/destination IP, port and AS number, IP protocol, TCP flags...
- The routers share their full BGP view with the collector
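As a rough illustration of the router-side behaviour described above, the following Python sketch models a flow record and a deterministic 1-in-1000 sampler. The `FlowRecord` class, its field names and the `sample` function are hypothetical and only mirror the metadata listed above; real routers implement sampling in hardware or firmware, and their exact sampling algorithm is not described on this page.

```python
from dataclasses import dataclass

SAMPLING_RATE = 1000  # 1 out of 1000, as stated above


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical container for the example metadata fields."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    src_as: int
    dst_as: int
    ip_protocol: int   # e.g. 6 = TCP, 17 = UDP
    tcp_flags: int     # bitmask of the TCP flags seen on the flow
    byte_count: int
    packet_count: int


def sample(flows, rate=SAMPLING_RATE):
    """Yield every `rate`-th flow (assumed deterministic 1-in-N sampling)."""
    for index, flow in enumerate(flows, start=1):
        if index % rate == 0:
            yield flow
```

With this sketch, 5000 observed flows would produce 5 exported records, which is why the collector later has to scale counters back up.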
On the collectors:
- Samplicator duplicates the IPFIX packets to Fastnetmon and Pmacct, while spoofing the source IP (so they still seem to come from the routers)
- Pmacct (nfacct) extrapolates the flow size and packet count based on the sampling rate (e.g. multiplies them by 1000)
- Pmacct uses a prefix list (exported from Puppet) to enrich the collected flows with traffic direction
- Pmacct uses the BGP data provided by the routers to enrich the collected flow metadata (adds peer src/dst AS#, AS path, src/dst AS#)
- Pmacct uses an IP-to-location database to enrich the collected flow metadata (adds source and destination country) - NOT PROD YET
- Pmacct exports the enriched flow data to Druid via Kafka
- Fastnetmon monitors inbound traffic for both known attack patterns and traffic level thresholds, and sends a notification email if any condition is met, including a traffic signature when it is able to produce one
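Two of the collector-side enrichment steps above (extrapolation by the sampling rate and tagging traffic direction from a prefix list) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Pmacct implements it: the function names are hypothetical, and `OWN_PREFIXES` uses documentation address ranges as a stand-in for the real prefix list exported from Puppet.

```python
import ipaddress

SAMPLING_RATE = 1000  # must match the rate configured on the routers

# Hypothetical prefix list (documentation ranges), standing in for the
# list exported from Puppet.
OWN_PREFIXES = [
    ipaddress.ip_network(prefix)
    for prefix in ("192.0.2.0/24", "2001:db8::/32")
]


def extrapolate(sampled_bytes, sampled_packets, rate=SAMPLING_RATE):
    """Scale sampled counters up to an estimate of the real traffic."""
    return sampled_bytes * rate, sampled_packets * rate


def is_own(address, prefixes=OWN_PREFIXES):
    """Return True if the address falls inside one of our prefixes."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in prefixes)


def direction(src_ip, dst_ip, prefixes=OWN_PREFIXES):
    """Classify a flow as inbound, outbound or unknown."""
    src_own = is_own(src_ip, prefixes)
    dst_own = is_own(dst_ip, prefixes)
    if dst_own and not src_own:
        return "inbound"
    if src_own and not dst_own:
        return "outbound"
    return "unknown"  # both or neither endpoint is in our prefixes
```

For example, a sampled flow of 1500 bytes in 1 packet is estimated at 1,500,000 bytes in 1000 packets, and a flow from 203.0.113.7 to 192.0.2.10 is tagged as inbound under the assumed prefix list.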
How to deploy?
- Apply role::netinsights to a server
- Configure sampling on the router
- Add a BGP session from router to collector
Check if Pmacct is sending data to Kafka
$ kafkacat -b kafka-jumbo1001.eqiad.wmnet -t netflow -C
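If the messages need a quick sanity check beyond eyeballing them, the consumer output can be summarised with a short Python helper. This sketch assumes each message is a single JSON object per line and that it carries `ip_src` and `bytes` keys; both the format and the key names are assumptions to verify against the actual topic content, and `top_sources` is a hypothetical helper, not part of any deployed tooling.

```python
import json
from collections import Counter


def top_sources(lines, limit=5):
    """Sum the (assumed) `bytes` field per (assumed) `ip_src` value."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if not isinstance(record, dict):
            continue
        totals[record.get("ip_src", "unknown")] += int(record.get("bytes", 0))
    return totals.most_common(limit)
```

One way to use it is to pipe the kafkacat output into a small script that calls `top_sources(sys.stdin)` and prints the result; an empty result after a reasonable wait suggests that no flow data is reaching the topic.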
Real time Fastnetmon dashboard
Check the logs
Both Pmacct and Fastnetmon log to syslog, grep for
- Real time
- Turnilo is the easiest way to drill down through the data. Example dashboard: https://w.wiki/7N6
- Dashboards can also be made with Superset.
- Spark POC: https://gist.github.com/ottomata/58b3712a1d247a9575772b942e3d5ff3
Deploy to more POPs as we deploy Ganeti clusters.