Anycast syslog

Main CR: https://gerrit.wikimedia.org/r/c/operations/puppet/+/524037/

Limitation of a non-anycast setup

Some appliances, such as network devices or PDUs, can only send syslog to UDP endpoints and cannot use the regular logging pipeline.

The previous setup relied on two endpoints, syslog.codfw.wmnet and syslog.eqiad.wmnet, both of which were CNAMEs.

Some of those devices resolve the configured endpoint FQDN only when the configuration is applied, and then keep using the resulting address. This causes two issues:

  • Changing the CNAME doesn't make the device send logs to the new endpoint
  • Using DNS round robin is not possible
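
The first issue can be illustrated with a toy model. This is only a sketch: the record table, the `Device` class, the `old-host.example` and `new-host.example` names, and the 192.0.2.x addresses are all made up for illustration and do not describe any real device or the real DNS records.

```python
def resolve(name, records):
    """Follow CNAME-like entries in a toy record table until an address is reached."""
    seen = set()
    while name in records:
        if name in seen:
            raise ValueError(f"CNAME loop at {name}")
        seen.add(name)
        name = records[name]
    return name


class Device:
    """Hypothetical appliance that resolves its syslog target once, at config time."""

    def __init__(self):
        self.syslog_target = None

    def apply_config(self, fqdn, records):
        # The FQDN is resolved here and never again.
        self.syslog_target = resolve(fqdn, records)


if __name__ == "__main__":
    # Made-up records: the CNAME initially points at the old host.
    records = {
        "syslog.eqiad.wmnet": "old-host.example",
        "old-host.example": "192.0.2.10",
        "new-host.example": "192.0.2.20",
    }
    device = Device()
    device.apply_config("syslog.eqiad.wmnet", records)

    # Repoint the CNAME: DNS now answers differently, the device does not follow.
    records["syslog.eqiad.wmnet"] = "new-host.example"
    print("DNS now resolves to:", resolve("syslog.eqiad.wmnet", records))
    print("Device still sends to:", device.syslog_target)
```

Until its configuration is re-applied, the modelled device keeps sending to the old address.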

This left us with two options:

  • Configure the devices with only one endpoint (e.g. the geographically closest one), which creates a single point of failure (SPOF)
  • Configure the devices with both endpoints (not always supported), which duplicates data in syslog

With anycast, if an endpoint goes down, logs are automatically routed to one of the remaining endpoints.

Configuration

The VIP 10.3.0.4 (syslog.anycast.wmnet) is advertised by hosts with the role syslog::centralserver (centrallog2002.codfw.wmnet and centrallog1001.eqiad.wmnet as of Dec 2021).

The anycast healthchecker checks whether a process is listening on port udp/10514.
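
This page does not describe how the healthchecker performs that check. As a rough sketch of the idea only, the following Python reads the Linux socket tables in /proc/net/udp and /proc/net/udp6 and reports whether any socket is bound to a given UDP port. The function names are hypothetical and this is not the actual healthchecker code.

```python
from pathlib import Path


def udp_ports_in_use(proc_net_udp):
    """Return the set of local UDP ports listed in /proc/net/udp-style text."""
    ports = set()
    # The first line is a column header; each other line describes one socket.
    for line in proc_net_udp.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[1] is "HEXADDR:HEXPORT", for example "00000000:2912".
        port_hex = fields[1].rpartition(":")[2]
        ports.add(int(port_hex, 16))
    return ports


def is_udp_port_bound(port, tables=("/proc/net/udp", "/proc/net/udp6")):
    """Check the IPv4 and IPv6 UDP socket tables for a socket bound to `port`."""
    for table in tables:
        path = Path(table)
        if path.exists() and port in udp_ports_in_use(path.read_text()):
            return True
    return False


if __name__ == "__main__":
    print("udp/10514 bound:", is_udp_port_bound(10514))
```

A check like this only confirms that some socket is bound to the port. It does not prove that the listener is processing messages.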

The rsyslog daemon on the centrallog hosts binds UDP port 10514 with a plaintext syslog listener and forwards any logs received on this port to the kafka logging/logstash pipeline, using a config called "netdev_kafka_relay" or "netdev-kafka-relay".
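
To exercise the listener from a host inside the network, a plaintext message can be sent over UDP. The following is a minimal sketch that builds a BSD-style (RFC 3164) syslog line and sends it as a single datagram. The function names, the tag, and the facility and severity values are assumptions for illustration. Only the hostname syslog.anycast.wmnet and port 10514 come from this page.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_syslog_message(facility, severity, hostname, tag, text, when):
    """Format an RFC 3164-style line: <PRI>Mmm dd hh:mm:ss HOST TAG: TEXT."""
    priority = facility * 8 + severity
    # RFC 3164 pads single-digit days with a space rather than a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{priority}>{timestamp} {hostname} {tag}: {text}"


def send_udp(message, host="syslog.anycast.wmnet", port=10514):
    """Send one syslog line as a single UDP datagram (no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Facility 23 (local7) and severity 6 (informational) are arbitrary choices.
    line = build_syslog_message(23, 6, socket.gethostname(), "anycast-test",
                                "hello from a test sender", datetime.now())
    send_udp(line)
```

Because UDP is connectionless, a successful send says nothing about whether the message was received. Confirming delivery means looking for the message downstream in the logging pipeline.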

Netconsole

The syslog hosts also run a Linux netconsole server to receive urgent kernel messages over UDP. The syslog anycast IP address is used by default. See Netconsole for more information.