Logs
- This page is about server log files. For IRC channel logs, see e.g. https://wm-bot.wmcloud.org/
Logs of several sorts are generated across the cluster and collected in a single location replicated on some machines. Privileged users can explore most logs through the OpenSearch Dashboards front-end at https://logstash.wikimedia.org/.
The SRE Observability team is working on a common log format called ECS, see the linked doc and intro slides. ECS documentation can be found at https://doc.wikimedia.org/ecs/
For a quick reference of debugging techniques, see Logs/Runbook.
mwlog1002:/srv/mw-log/
These record wfDebugLog() and similar calls in MediaWiki (see especially mw:Structured logging). All cluster-wide logs are aggregated here (configured through $wmgUdp2logDest, see also wmgMonologChannels). There are dozens of log files, which amounted to around 15 GB compressed per day as of April 2015. Some are not sent to logstash (settings) and some are sampled; log archives are stored for a variable amount of time, up to 90 days (per data retention guideline). Note that logstash also records the context data for structured logging, so it might contain significantly more information than the files.
Source: All appserver clusters.
Directories:
- archive/: Directory holding a limited number of previous days of the same logs (compressed once a day).
General channels:
- exception.log: Fatal exceptions that receive either a localised "Internal error" page, or a Wikimedia Error page rendered by php-wmerrors.
  - Error pages report a request ID, e.g. "[d84af39036] 2011-04-01: Fatal exception of type MWException".
  - To find the complete stack trace, search for d84af39036 in exception.log, or search for reqId:"d84af39036" in Logstash on the "mediawiki" dashboard.
- apache2.log: aggregated Apache error logs, see #syslog
- api.log: API requests and their parameters (including redacted POST payloads, and temporary PII). This used to be sampled, but is no longer (during 2014-2015) and is flushed every 30 days as of Nov 2015.
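The request-ID lookup described for exception.log can be sketched as a small script. This is only a minimal sketch, assuming the log is plain text and the request ID appears literally on each relevant line; the helper name find_request_lines and the sample lines are hypothetical, and a real stack trace may continue on lines that do not repeat the ID.

```python
from typing import Iterable, Iterator


def find_request_lines(lines: Iterable[str], req_id: str) -> Iterator[str]:
    """Yield every line that mentions the given request ID (hypothetical helper)."""
    for line in lines:
        if req_id in line:
            yield line.rstrip("\n")


# Made-up sample lines for illustration only; not real log output.
sample = [
    "2011-04-01 12:00:00 [d84af39036] /wiki/Foo MWException: example\n",
    "2011-04-01 12:00:01 [aaaaaaaaaa] /wiki/Bar OtherException: example\n",
]
matches = list(find_request_lines(sample, "d84af39036"))
print(len(matches))
```

For a real file, an open file handle can be passed in place of the sample list, since the function only iterates over lines.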
Specific components:
- antispoof.log: Collision check passes and failures from the AntiSpoof extension. This checks for strings that look the same using different Unicode characters (such as spoofed usernames).
- badpass.log: Failed login attempts to wikis.
- captcha.log: Captcha attempts (both failed and successful attempts).
- centralauth.log (2013-05-09–), centralauth-bug39996.log, centralauthrename.log (2014-07-14–): (temporary) debug logs for bugzilla:35707, bugzilla:39996, bugzilla:67875. In theory, rare events; can include username and page visited/request made.
- CirrusSearch.log: Logs various info concerning Cirrus (update/query failures and various debug info). Cirrus now uses the analytics platform to log search requests (Analytics/Data/Cirrus).
- CirrusSearchSlowRequests.log: Logs slow requests.
- CirrusSearchChangeFailed.log: Logs update failures.
- external.log: ExternalStore blob fetch failures (see External storage).
- imagemove.log: Page renames in the File namespace that take place (both failed and successful renames).
- memcached.log: Memcached for MediaWiki (WANObjectCache, misc ephemeral data, rate limiting counters, advisory locks).
- poolcounter.log: PoolCounter failures (connection problems, excess queue size, wait timeouts).
- redis.log: Redis query and connection failures (might involve sessions, job queues, and some other assorted features).
- resourceloader.log: Exceptions related to ResourceLoader.
- JobExecutor.log: Tracks job queue activity, including errors (both failed and successful runs).
  - Can be used to produce stats on jobs run on the various wikis, e.g. with Tim's perl ~/job-stats.pl runJobs.log.
- swift-backend.log: Errors in the SwiftFileBackend class (timeouts and HTTP 500 type errors for file and listing reads/writes).
- slow-parse.log (since May 2012; 6 months archive)
- spam.log: SimpleAntiSpam honeypot hits from bots (attempted user actions are discarded).
- XWikimediaDebug.log: see X-Wikimedia-Debug#Debug logging.
The syslog for all application servers can be found in apache2.log on mwlog1001 or in /srv/syslog/apache.log on centrallog1001. This includes things like segmentation faults.
5xx errors
5xx errors are available at centrallog1001.eqiad.wmnet:/srv/weblog/webrequest/5xx.json, and in Logstash on the Varnish 5xx dashboard.
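As a rough illustration of summarising such a file, the sketch below counts records per status code. It assumes one JSON object per line and a status key named http_status; both the layout and the key name are assumptions, not confirmed by this page, so check a few real lines first. The helper name count_statuses and the sample records are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def count_statuses(lines: Iterable[str], field: str = "http_status") -> Counter:
    """Count records per status value; `field` is an assumed key name."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the whole scan
        if isinstance(record, dict) and field in record:
            counts[str(record[field])] += 1
    return counts


# Made-up sample records for illustration only; not real log output.
sample_lines = [
    '{"http_status": "503", "uri_host": "example.org"}',
    '{"http_status": "500", "uri_host": "example.org"}',
    '{"http_status": "503", "uri_host": "example.org"}',
    "not json",
]
status_counts = count_statuses(sample_lines)
print(status_counts.most_common())
```

Malformed lines are skipped silently here so that a partially written line at the end of a live file does not stop the count.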
mwmaint Maintenance scripts
See Maintenance server#Access recent runs.
deploy1002:/var/log/l10updatelog/l10update.log
Source: scap
- l10update.log: Error log for LocalisationUpdate runs.
vanadium:/var/log/eventlogging/
- various: Logs of EventLogging entries. Potentially useful in case their transformation into SQL records fails.
Request logs
Logs of any kind of request, e.g. viewing a wiki page, editing, using the API, loading an image.
- Analytics/Data/Webrequest: "wmf.webrequest" is the name of an unsampled request archive in Hive. We started deleting older wmf.webrequest data in March 2015. We currently keep 62 days.
  - Used for ad-hoc queries.
  - Used to generate dumps for pagecounts-all-sites.
centrallog1002:/srv/weblog/webrequest
The cache (outer layer) request logs; see Squid logging#Log files.
The 1:1000 sampled logs are used for about 15 monthly and quarterly reports and day to day operations (source).
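When reading counts from the 1:1000 sampled logs, a count has to be scaled back up to estimate the real total. The sketch below shows that arithmetic. The function names are hypothetical, and the uncertainty figure assumes the sample behaves like a random sample, which this page does not confirm, so treat it as a rough guide only.

```python
import math

SAMPLING_RATE = 1000  # 1:1000 sampling, as stated above


def estimate_total(sampled_count: int, rate: int = SAMPLING_RATE) -> int:
    """Scale a count observed in a 1:rate sample to an estimated total."""
    return sampled_count * rate


def relative_uncertainty(sampled_count: int) -> float:
    """Rough relative standard error, 1/sqrt(n), assuming random sampling."""
    if sampled_count <= 0:
        raise ValueError("need at least one sampled record")
    return 1.0 / math.sqrt(sampled_count)


# Example: 400 matching lines seen in the sampled log.
estimated = estimate_total(400)
uncertainty = relative_uncertainty(400)
print(estimated, uncertainty)
```

Small sampled counts give large relative uncertainty, so rare events are better investigated in an unsampled source.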
Beta cluster
The mw:Beta cluster has a similar logging configuration to production. Various server logs are written to the remote syslog server deployment-mwlog02.deployment-prep.eqiad1.wikimedia.cloud in /srv/mw-log.
Apache access logs are written to /var/log/apache2/other_vhosts_access.log on each beta cluster host.
See mw:Beta_Cluster#Testing_changes_on_Beta_Cluster for information on how to access the beta logstash web UI.
Mailservers
exim logs are retained for 90 days (see phabricator:T167333).
Dead
Lucene (search)
Each host logs at /a/search/log/log (now less noisy); see Search#Trouble on how to identify which host serves what pool etc.
fenari:/home/wikipedia/syslog
Source: All apaches
- apache.log: Error log of all apaches (includes stderr of PHP, so PHP Notices, PHP Warnings etc.)
  - Use fatalmonitor to aggregate this into a (tailing) report.
  - This has been deprecated in favor of fluorine:/a/mw-log/apache2.log and logstash.
fenari:/var/log/
Source: Machine-specific logs
- l10nupdatelog/l10nupdate.log: Used by LocalisationUpdate.