You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org
Rsyslog
rsyslog is the default Debian logging daemon and is deployed fleet-wide at the Wikimedia Foundation.
Packaging
We currently manage several rsyslog versions/packages for different reasons, all using the gbp build flow:
- Rebuilds of the Debian buster upstream packages (8.1901.0), including the mmkubernetes plugin our Kubernetes/Logging pipeline is built on
- A backport of rsyslog 8.2008.0, used on centrallog hosts, to address issues with the Debian upstream version (task T259780, task T199406)
Branches
- debian/buster-wikimedia-k8s: Published to component/rsyslog-k8s for buster-wikimedia; Used on Kubernetes nodes running buster.
- debian/bullseye-wikimedia-k8s: Published to component/rsyslog-k8s for bullseye-wikimedia; Used on Kubernetes nodes running bullseye.
- debian/stretch-wikimedia: Published to main for stretch-wikimedia; Used on Kubernetes nodes running stretch.
- UNKNOWN: Published to component/rsyslog for buster-wikimedia; Used on centrallog nodes.
Build
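# Adapt DIST and the --git-debian-branch argument (for example debian/buster-wikimedia-k8s) to the package you want to build
# DIST is set on its own line so that $DIST is already defined when the command line is expanded
export DIST=stretch
BACKPORTS=yes gbp buildpackage --git-pbuilder --git-no-pbuilder-autoconf --git-dist=$DIST -sa -uc -us --git-debian-branch=debian/$DIST-wikimedia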
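The branch list above can be turned into a small lookup so that the distribution and the packaging branch stay in sync when building. This is only a sketch: the target names (buster-k8s, bullseye-k8s, stretch) and the function names are hypothetical and not part of any existing tooling, and the centrallog backport is left out because its branch is not recorded above.

```shell
# Hypothetical helper: map a build target to the packaging branch listed
# in the Branches section. Target names are made up for this sketch.
branch_for() {
    case "$1" in
        buster-k8s)   echo "debian/buster-wikimedia-k8s" ;;
        bullseye-k8s) echo "debian/bullseye-wikimedia-k8s" ;;
        stretch)      echo "debian/stretch-wikimedia" ;;
        *)            echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: derive the distribution name from the target
# by dropping everything from the first "-" (buster-k8s -> buster).
dist_for() {
    echo "${1%%-*}"
}
```

For example, branch_for bullseye-k8s prints debian/bullseye-wikimedia-k8s and dist_for bullseye-k8s prints bullseye; both values can then be substituted into the build command above.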
Troubleshooting
rsyslog "stuck"
Servers to look for:
- Puppet: syslog::centralserver
- Currently: centrallog1001.eqiad.wmnet and centrallog2001.codfw.wmnet (Aug 2020)
rsyslog has been observed to get stuck from time to time (its TLS listener stops responding). In these situations a restart "fixes" the problem. Before restarting, however, it is important to capture the daemon's state so the hang can be investigated afterwards:
cd
timeout 30s strace -f -p $(pidof rsyslogd) -s 65535 -o rsyslog_$(date -Im).strace
lsof -p $(pidof rsyslogd) > rsyslog_$(date -Im).lsof
gdb -p $(pidof rsyslogd) --batch -ex gcore
gdb -p $(pidof rsyslogd) --batch -ex 'thread apply all bt full' > rsyslog_$(date -Im).threaddump
systemctl restart rsyslog
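Because the commands above call pidof and date separately for each step, the artifacts can end up with different timestamps if the minute rolls over during the capture. A possible way to avoid that is to resolve the PID and timestamp once and reuse them. The following is a minimal sketch under that assumption; the function names artifact_name and capture_and_restart are hypothetical, and the steps themselves are the same as above.

```shell
# Build an artifact file name from a timestamp and a suffix,
# e.g. rsyslog_2020-08-01T12:00+00:00.lsof
artifact_name() {
    printf 'rsyslog_%s.%s\n' "$1" "$2"
}

# Capture the state of the running rsyslogd, then restart it.
# Needs root, plus strace, lsof and gdb installed on the host.
capture_and_restart() {
    pid=$(pidof rsyslogd) || { echo "rsyslogd is not running" >&2; return 1; }
    ts=$(date -Im)
    cd || return 1
    # timeout stops strace after 30s, so a non-zero exit here is expected
    timeout 30s strace -f -p "$pid" -s 65535 -o "$(artifact_name "$ts" strace)"
    lsof -p "$pid" > "$(artifact_name "$ts" lsof)"
    # gcore without a file name writes core.<pid> into the current directory
    gdb -p "$pid" --batch -ex gcore
    gdb -p "$pid" --batch -ex 'thread apply all bt full' > "$(artifact_name "$ts" threaddump)"
    systemctl restart rsyslog
}
```

Defining the functions has no side effects; nothing is captured or restarted until capture_and_restart is called explicitly on the affected host.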