Anycast syslog

Main CR: https://gerrit.wikimedia.org/r/c/operations/puppet/+/524037/

Limitation of a non-anycast setup

Some appliances, such as network devices or PDUs, can only send syslog to UDP endpoints and cannot use the regular logging pipeline.

The previous setup relied on two endpoints, syslog.codfw.wmnet and syslog.eqiad.wmnet, both CNAMEs.

Some of those devices resolve the configured endpoint FQDN only once, when the configuration is applied. This causes two issues:

  • Changing the CNAME doesn't make the device send logs to the new endpoint
  • Using DNS round robin is not possible
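The resolve-once behaviour can be illustrated with a small sketch. It is not taken from any device firmware: the class names, the injectable `resolver` argument and the example addresses (from the 192.0.2.0/24 documentation range) are all assumptions made for illustration.

```python
import socket
from typing import Callable

# A resolver maps an FQDN to an IP address string.
Resolver = Callable[[str], str]


class ResolveOnceSender:
    """Mimics a device that resolves the endpoint when its config is applied."""

    def __init__(self, fqdn: str, resolver: Resolver = socket.gethostbyname) -> None:
        self.fqdn = fqdn
        # Resolved a single time; later DNS changes are never seen.
        self.address = resolver(fqdn)

    def destination(self) -> str:
        return self.address


class ResolvePerMessageSender:
    """Mimics a sender that looks the endpoint up again for every message."""

    def __init__(self, fqdn: str, resolver: Resolver = socket.gethostbyname) -> None:
        self.fqdn = fqdn
        self.resolver = resolver

    def destination(self) -> str:
        return self.resolver(self.fqdn)
```

If the record behind the FQDN changes after both senders are created, `ResolveOnceSender.destination()` keeps returning the old address, which is why repointing a CNAME or rotating records with round robin has no effect on such devices.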

This left us with two options:

  • Configure the devices with only one endpoint (e.g. the geographically closest one), which creates a single point of failure (SPOF)
  • Configure the devices with both endpoints (not always supported), which duplicates data in syslog

With anycast, if an endpoint goes down, logs are automatically routed to one of the remaining endpoints.
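As a rough mental model of that failover, the sketch below picks the lowest-cost host among those still announcing the address. It is a toy model, not how the routers are configured: the `Advertiser` type, the `cost` values and the `pick_destination` helper are hypothetical, and only the two hostnames come from this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Advertiser:
    name: str
    cost: int        # hypothetical routing cost as seen from one device
    announcing: bool # whether the host currently advertises the anycast address


def pick_destination(advertisers: List[Advertiser]) -> Optional[str]:
    """Return the closest host that still announces the address, if any."""
    candidates = [a for a in advertisers if a.announcing]
    if not candidates:
        return None
    return min(candidates, key=lambda a: a.cost).name


# Example from the point of view of a device that is closer to codfw.
hosts = [
    Advertiser("wezen.codfw.wmnet", cost=10, announcing=True),
    Advertiser("centrallog1001.eqiad.wmnet", cost=50, announcing=True),
]
```

With both hosts announcing, `pick_destination(hosts)` selects the closer one. Once the closer host stops announcing, the same call returns the other host without any change on the sending device.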

Configuration

The VIP 10.3.0.4 (syslog.anycast.wmnet) is advertised by wezen.codfw.wmnet and centrallog1001.eqiad.wmnet.

The anycast healthchecker checks whether a process is listening on port udp/10514.
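One way to test for a UDP listener by hand is to try binding the port and see whether the kernel reports it as already in use. This is a minimal sketch of that idea, not the healthchecker's actual implementation. The function name `udp_port_in_use` is hypothetical, and the probe assumes the listener did not set socket options that allow the port to be shared.

```python
import errno
import socket


def udp_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if another socket already holds this UDP port on host."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True  # something else is bound to the port
        raise  # any other error (e.g. permissions) is not a health signal
    else:
        return False  # the bind succeeded, so nothing was listening
    finally:
        probe.close()
```

On a centrallog host, `udp_port_in_use(10514)` would be expected to return True while rsyslog is running.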

The rsyslog daemon on the centrallog hosts binds port 10514 UDP with a plaintext syslog listener, and forwards any logs received on this port to the kafka logging/logstash pipeline using a config called "netdev_kafka_relay" or "netdev-kafka-relay".
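To check the listener end to end, a test message can be sent over plain UDP. The sketch below builds a BSD-style (RFC 3164) message and sends it. The helper names are hypothetical, and it assumes senders address port 10514 directly; this page does not say whether the standard port 514 is also accepted or redirected.

```python
import socket
from datetime import datetime

SYSLOG_HOST = "syslog.anycast.wmnet"  # anycast name from this page
SYSLOG_PORT = 10514                   # assumption: send straight to the listener port

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def syslog_priority(facility: int, severity: int) -> int:
    """PRI value: facility (0-23) times 8, plus severity (0-7)."""
    if not 0 <= facility <= 23 or not 0 <= severity <= 7:
        raise ValueError("facility must be 0-23 and severity 0-7")
    return facility * 8 + severity


def build_message(facility: int, severity: int, now: datetime,
                  hostname: str, tag: str, text: str) -> bytes:
    """Format '<PRI>Mmm dd hh:mm:ss HOSTNAME TAG: text' as bytes."""
    pri = syslog_priority(facility, severity)
    # RFC 3164 pads single-digit days with a space, not a zero.
    stamp = f"{MONTHS[now.month - 1]} {now.day:2d} {now:%H:%M:%S}"
    return f"<{pri}>{stamp} {hostname} {tag}: {text}".encode("utf-8")


def send_syslog(payload: bytes, host: str = SYSLOG_HOST,
                port: int = SYSLOG_PORT) -> None:
    """Send one datagram; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

For example, `send_syslog(build_message(23, 5, datetime.now(), "testhost", "anycast-test", "hello"))` sends a local7.notice message. Because UDP is fire-and-forget, delivery has to be confirmed on the receiving side, for instance by searching for the message in logstash.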