This is a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Logs

This page is about server log files. For IRC channel logs, see e.g. https://wm-bot.wmcloud.org/

Logs of several sorts are generated across the cluster and collected in a single location replicated on some machines. Privileged users can explore most logs through the OpenSearch Dashboards front-end at https://logstash.wikimedia.org/ .

The SRE Observability team is working on a common log format called ECS; see the linked doc and intro slides. ECS documentation can be found at https://doc.wikimedia.org/ecs/

For a quick reference of debugging techniques, see Logs/Runbook .

mwlog1002 :/srv/mw-log/

These record wfDebugLog() and similar calls in MediaWiki (see especially mw:Structured logging). All cluster-wide logs are aggregated here (configured through $wmgUdp2logDest; see also wmgMonologChannels). There are dozens of log files, which amounted to around 15 GB compressed per day as of April 2015. Some are not sent to logstash (settings) and some are sampled; log archives are stored for a variable amount of time, up to 90 days (per the data retention guideline). Note that logstash also records the context data for structured logging, so it might contain significantly more information than the files.

Source: All appserver clusters.

Directories:

  • archive/ : Directory holding a limited number of previous days of the same logs (compressed once a day).
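Archived files are compressed, so they cannot be read line by line with a plain file open. The following is a minimal sketch of a reader that handles both current and archived files. It assumes the archives are gzip files with a `.gz` suffix, which this page does not state; the file name used in the demo is hypothetical.

```python
import gzip
import tempfile
from pathlib import Path
from typing import Iterator


def read_log_lines(path: Path) -> Iterator[str]:
    """Yield lines from a log file, decompressing it if it ends in .gz."""
    # Assumption: archived logs are gzip-compressed and carry a .gz suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Demo on a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    archived = Path(tmp) / "exception.log-20150401.gz"  # hypothetical name
    with gzip.open(archived, "wt", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")
    archived_lines = list(read_log_lines(archived))
```

Using `errors="replace"` keeps the reader from aborting on a stray invalid byte, which seems a reasonable default for log files.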

General channels:

  • exception.log : Fatal exceptions that receive either a localised "Internal error" page, or a Wikimedia Error page rendered by php-wmerrors .
    • Error pages report a request ID, e.g. [d84af39036] 2011-04-01: Fatal exception of type "MWException".
    • To find the complete stack trace, search for d84af39036 in exception.log, or search for reqId:"d84af39036" in Logstash on the "mediawiki" dashboard.
  • apache2.log : aggregated Apache error logs, see #syslog
  • api.log : API requests and their parameters (including redacted POST payloads, and temporary PII). This used to be sampled but no longer is (changed during 2014-2015); as of Nov 2015 it is flushed every 30 days.
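The request ID lookup described for exception.log can be sketched as a small filter over log lines. This is an illustration only: the sample lines below are made up and do not reflect the real layout of exception.log, and the function name is hypothetical.

```python
import re
from typing import Iterable, Iterator


def lines_for_request(lines: Iterable[str], req_id: str) -> Iterator[str]:
    """Yield every line that contains req_id as a whole token."""
    # Word boundaries avoid matching a longer string that merely contains the ID.
    pattern = re.compile(r"\b" + re.escape(req_id) + r"\b")
    for line in lines:
        if pattern.search(line):
            yield line.rstrip("\n")


# Hypothetical sample lines, not real log output.
sample = [
    "2011-04-01 12:00:00 [d84af39036] /wiki/Foo MWException: something broke\n",
    "2011-04-01 12:00:01 [aaaa111122] /wiki/Bar MWException: something else\n",
    "2011-04-01 12:00:02 [xd84af39036f] /wiki/Baz unrelated longer token\n",
]
matches = list(lines_for_request(sample, "d84af39036"))
```

A filter like this only returns lines that carry the ID itself. If stack trace continuation lines do not repeat the ID, they would need separate handling.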

Specific components:

  • antispoof.log : Collision check passes and failures from the AntiSpoof extension. This checks for strings that look the same using different Unicode characters (such as spoofed usernames).
  • badpass.log : Failed login attempts to wikis.
  • captcha.log : Captcha attempts (both failed and successful attempts).
  • centralauth.log (2013-05-09–), centralauth-bug39996.log , centralauthrename.log (2014-07-14–): (temporary) debug logs for bugzilla:35707 , bugzilla:39996 , bugzilla:67875 . In theory, rare events; can include username and page visited/request made.
  • CirrusSearch.log : Logs various info concerning Cirrus (update/query failures and various debug info). Cirrus now uses the analytics platform to log search requests (Analytics/Data/Cirrus).
  • CirrusSearchSlowRequests.log : Logs slow requests
  • CirrusSearchChangeFailed.log : Logs update failures
  • external.log : ExternalStore blob fetch failures (see External storage )
  • imagemove.log : Page renames in the File namespace that take place (both failed and successful renames).
  • memcached.log : Memcached for MediaWiki (WANObjectCache, misc ephemeral data, rate limiting counters, advisory locks).
  • poolcounter.log : PoolCounter failures (connection problems, excess queue size, wait timeouts).
  • redis.log : Redis query and connection failures (might involve sessions, job queues, and some other assorted features).
  • resourceloader.log : Exceptions related to ResourceLoader .
  • JobExecutor.log : Tracks job queue activity, including errors (both failed and successful runs).
    • Can be used to produce stats on jobs run on the various wikis, e.g. with Tim's perl ~/job-stats.pl runJobs.log .
  • swift-backend.log : Errors in the SwiftFileBackend class (timeouts and HTTP 500 type errors for file and listing reads/writes).
  • slow-parse.log (since May 2012; 6 months archive)
  • spam.log : SimpleAntiSpam honeypot hits from bots (attempted user actions are discarded).
  • XWikimediaDebug.log : see X-Wikimedia-Debug#Debug logging .

The syslog for all application servers can be found in apache2.log on mwlog1001 or in /srv/syslog/apache.log on centrallog1001. This includes things like segmentation faults.

5xx errors

5xx errors are available on centrallog1001.eqiad.wmnet:/srv/weblog/webrequest/5xx.json, and in logstash on the Varnish 5xx Logstash dashboard.
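A quick tally of 5xx responses by status code might look like the sketch below. It assumes that 5xx.json holds one JSON object per line and that the status is stored under a field named `http_status`; neither is confirmed by this page, so the field name is a parameter. The sample records are invented.

```python
import json
from collections import Counter
from typing import Iterable


def count_5xx(lines: Iterable[str], status_field: str = "http_status") -> Counter:
    """Count 5xx responses per status code in newline-delimited JSON."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if not isinstance(record, dict):
            continue
        # Assumption: the status lives in `status_field` as a string or number.
        status = str(record.get(status_field, ""))
        if status.startswith("5"):
            counts[status] += 1
    return counts


# Hypothetical records, not real webrequest data.
sample_5xx = [
    '{"http_status": "503", "uri_host": "en.wikipedia.org"}',
    '{"http_status": 500, "uri_host": "commons.wikimedia.org"}',
    "not json at all",
    '{"http_status": "503", "uri_host": "de.wikipedia.org"}',
]
status_counts = count_5xx(sample_5xx)
```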

mwmaint Maintenance scripts

See Maintenance server#Access recent runs .

deploy1002 :/var/log/l10updatelog/l10update.log

Source: scap

  • l10update.log : Error log for LocalisationUpdate runs.

vanadium :/var/log/eventlogging/

  • various : Logs of EventLogging entries. Potentially useful, in case their transformation into SQL records fails.

Request logs

Logs of any kind of request, e.g. viewing a wiki page, editing, using the API, loading an image.

centrallog1002 :/srv/weblog/webrequest

The cache (outer layer) request logs; see Squid logging#Log files .

The 1:1000 sampled logs are used for about 15 monthly and quarterly reports and day to day operations ( source ).
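Counts taken from a 1:1000 sample have to be scaled up to estimate real traffic, and small counts carry a large uncertainty. The sketch below shows the arithmetic. It assumes the sampling behaves like independent random selection, so the Poisson approximation applies; how the sampling is actually done is not described here. The function names are hypothetical.

```python
import math


def estimate_total(sampled_count: int, sample_rate: int = 1000) -> int:
    """Estimate the full request count from a 1:sample_rate sample."""
    if sampled_count < 0 or sample_rate < 1:
        raise ValueError("sampled_count must be >= 0 and sample_rate >= 1")
    return sampled_count * sample_rate


def approx_relative_error(sampled_count: int) -> float:
    """Rough relative standard error of the estimate.

    Assumption: sampled counts are approximately Poisson distributed,
    so the standard error of a count n is sqrt(n), i.e. 1/sqrt(n) relative.
    """
    if sampled_count <= 0:
        raise ValueError("need at least one sampled request")
    return 1.0 / math.sqrt(sampled_count)


estimated = estimate_total(250)           # 250 sampled hits
uncertainty = approx_relative_error(400)  # 400 sampled hits
```

With these inputs, 250 sampled hits correspond to an estimated 250,000 requests, and a count of 400 sampled hits carries a relative error of about 5%.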

Beta cluster

The mw:Beta cluster has a similar logging configuration to production. Various server logs are written to the remote syslog server deployment-mwlog02.deployment-prep.eqiad1.wikimedia.cloud in /srv/mw-log .

Apache access logs are written to /var/log/apache2/other_vhosts_access.log on each beta cluster host.

See mw:Beta_Cluster#Testing_changes_on_Beta_Cluster for information on how to access the beta logstash web UI.

Mailservers

exim logs are retained for 90 days (see phabricator:T167333 ).

Dead

Lucene ( search )

Each host logs at /a/search/log/log ( now less noisy ), see Search#Trouble on how to identify which host serves what pool etc.

fenari :/home/wikipedia/syslog

Source: All apaches

  • apache.log : Error log of all apaches (includes stderr of PHP, so PHP Notices, PHP Warnings etc.)
    • Use fatalmonitor to aggregate this into a (tailing) report
    • This has been deprecated in favor of fluorine :/a/mw-log/apache2.log and logstash.

fenari :/var/log/

Source: Machine-specific logs