
Logstash: Difference between revisions

From Wikitech-static
imported>Alex Monk
imported>Spage
m (→‎Overview ("ELK"): bold and link the ELK, minor tweaks)
Line 1: Line 1:
'''Logstash''' is a tool for managing events and logs. When used generically the term encompasses a larger system of log collection, processing, storage and searching activities.
'''Logstash''' is a tool for managing events and logs. When used generically the term encompasses a larger system of log collection, processing, storage and searching activities.


== Overview ==
== Overview ("ELK") ==


Logstash is used to gather logging messages, convert them into json documents and store them in an Elasticsearch cluster. Kibana is used as a frontend client to search for and display messages from Elasticsearch cluster.
[[File:ELK_Tech_Talk_2015-08-20.pdf|thumb|Slides from TechTalk on ELK by Bryan Davis]]
Various Wikimedia applications send log events to '''[[Logstash]]''', which gathers the messages, converts them into json documents, and stores them in an '''[[Elasticsearch]]''' cluster. Wikimedia uses '''Kibana''' as a front-end client to filter and display messages from the Elasticsearch cluster.


=== Logstash ===
=== Logstash ===


[http://logstash.net/ Logstash] is a tool that can be used to collect, process and forward events and log messages. Collection is accomplished via number of configurable input plugins including raw socket/packet communication, file tailing and several message bus clients. Once an input plugin has collected data it can be processed by any number of filters which modify and annotate the event data. Finally events are routed to output plugins which can forward the events to a variety of external programs including Elasticsearch, local files and several message bus implementations.
[http://logstash.net/ Logstash] is a tool to collect, process, and forward events and log messages. Collection is accomplished via configurable input plugins including raw socket/packet communication, file tailing, and several message bus clients. Once an input plugin has collected data it can be processed by any number of filters which modify and annotate the event data. Finally logstash routes events to output plugins which can forward the events to a variety of external programs including Elasticsearch, local files and several message bus implementations.


=== Elasticsearch ===
=== Elasticsearch ===
Line 15: Line 16:
=== Kibana ===
=== Kibana ===


[http://www.elasticsearch.org/overview/kibana/ Kibana] is a browser based analytics and search interface for Elasticsearch that was developed primarily to view Logstash event data.
[http://www.elasticsearch.org/overview/kibana/ Kibana] is a browser-based analytics and search interface for Elasticsearch that was developed primarily to view Logstash event data.
 
== Systems feeding into logstash ==
See 2015-08 Tech talk slides
 
Writing new filters is easy.
 
=== Systems not feeding into logstash ===
* [[mw:Extension:EventLogging|EventLogging]] of programmatically-defined events, despite the name, has a different pipeline
* [[Varnish]] logs of the billions of requests to WMF wikis would require a lot more hardware. Instead it uses Kafka to feed into Hadoop.


== Production Logstash ==
== Production Logstash ==
Line 38: Line 48:
; Configuration
; Configuration
: It hosts a functional Logstash + Elasticsearch + Kibana stack at [https://logstash-beta.wmflabs.org/ logstash-beta.wmflabs.org] that aggregates log data produced by the [[Nova_Resource:Deployment-prep|beta cluster]].
: It hosts a functional Logstash + Elasticsearch + Kibana stack at [https://logstash-beta.wmflabs.org/ logstash-beta.wmflabs.org] that aggregates log data produced by the [[Nova_Resource:Deployment-prep|beta cluster]].
== Gotchas ==
=== GELF transport ===
Make sure logging events sent to the GELF input don't have a "type" or "_type" field set, or if set, that it contains the value "gelf". The gelf/logstash config discards any events that have a different value set for "type" or "_type".


== Documents ==
== Documents ==
{{Special:Prefixindex/Logstash/|hideredirects=1|stripprefix=1}}
{{Special:Prefixindex/Logstash/|hideredirects=1|stripprefix=1}}


[[Category:Services]]
[[Category:Services]]
== Gotchas ==
=== GELF transport ===
Make sure logging events sent to the GELF input don't have a "type" or "_type" field set, or if set, that it contains the value "gelf". The gelf/logstash config discards any events that have a different value set for "type" or "_type".

Revision as of 19:21, 20 August 2015

Logstash is a tool for managing events and logs. When used generically, the term encompasses a larger system of log collection, processing, storage, and searching activities.

Overview ("ELK")

File:ELK Tech Talk 2015-08-20.pdf (slides from the Tech Talk on ELK by Bryan Davis)

Various Wikimedia applications send log events to Logstash, which gathers the messages, converts them into JSON documents, and stores them in an Elasticsearch cluster. Wikimedia uses Kibana as a front-end client to filter and display messages from the Elasticsearch cluster.

Logstash

Logstash is a tool to collect, process, and forward events and log messages. Collection is accomplished via configurable input plugins, including raw socket/packet communication, file tailing, and several message bus clients. Once an input plugin has collected data, it can be processed by any number of filters, which modify and annotate the event data. Finally, Logstash routes events to output plugins, which can forward them to a variety of external programs, including Elasticsearch, local files, and several message bus implementations.
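The input, filter, and output stages described above can be pictured with a small Python sketch. This is only an illustration of the flow, not Logstash's own code or configuration syntax: the names run_pipeline and add_level, the "level" field, and the sample log lines are all hypothetical.

```python
import json


def add_level(event):
    """Hypothetical filter: annotate the event with a severity level."""
    level = "ERROR" if "error" in event["message"].lower() else "INFO"
    return {**event, "level": level}


def run_pipeline(lines, filters, outputs):
    """Turn each collected line into an event, filter it, and emit it as JSON."""
    for line in lines:                        # input stage: one event per line
        event = {"message": line.rstrip("\n")}
        for apply_filter in filters:          # filter stage: modify and annotate
            event = apply_filter(event)
        document = json.dumps(event, sort_keys=True)
        for emit in outputs:                  # output stage: forward the document
            emit(document)


# Usage: a list stands in for an output such as Elasticsearch or a local file.
collected = []
run_pipeline(
    ["Disk error on db1\n", "Request served\n"],  # made-up sample input
    [add_level],
    [collected.append],
)
```

Each stage is passed in as a plain list here, which mirrors how inputs, filters, and outputs are configured independently of one another.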

Elasticsearch

Elasticsearch is a distributed (multi-node) search engine built on Lucene. The same technology powers CirrusSearch on WMF wikis.

Kibana

Kibana is a browser-based analytics and search interface for Elasticsearch that was developed primarily to view Logstash event data.

Systems feeding into logstash

See the 2015-08 Tech Talk slides.

Writing new filters is easy.

Systems not feeding into logstash

  • EventLogging, despite its name, handles programmatically defined events through a different pipeline.
  • Varnish logs of the billions of requests to WMF wikis would require a lot more hardware, so they are fed through Kafka into Hadoop instead.

Production Logstash

Web interface
logstash.wikimedia.org
Authentication
wikitech LDAP username and password and membership in one of the following LDAP groups: nda, ops, wmf
Hosts
logstash100[1-3] servers in Eqiad.
Configuration
Each host provides a Logstash instance, an Elasticsearch node, a Redis server, and an Apache vhost serving the Kibana application. The Apache vhosts also act as reverse proxies to the Elasticsearch cluster and perform LDAP-based authentication to restrict access to the potentially sensitive log information. The misc Varnish cluster provides SSL termination and load balancing.

Wmf-elk-cluster-2014-10.svg

Prototype (Beta) Logstash

Web interface
logstash-beta.wmflabs.org
Authentication
Limited access; the username and password can be found on deployment-bastion.eqiad.wmflabs in the /root/secrets.txt file.
Hosts
deployment-logstash1.eqiad.wmflabs
Configuration
It hosts a functional Logstash + Elasticsearch + Kibana stack at logstash-beta.wmflabs.org that aggregates log data produced by the beta cluster.


Gotchas

GELF transport

Make sure logging events sent to the GELF input don't have a "type" or "_type" field set, or if set, that it contains the value "gelf". The gelf/logstash config discards any events that have a different value set for "type" or "_type".
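A sender can check an event against this rule before shipping it. The following Python sketch only restates the rule above as code and is not the actual gelf/logstash configuration. The function names are hypothetical, and treating a field whose value is None as "not set" is an assumption.

```python
def gelf_input_keeps(event):
    """Return True if the event passes the "type"/"_type" rule described above."""
    for key in ("type", "_type"):
        value = event.get(key)
        # Assumption: a missing field or a None value counts as "not set".
        if value is not None and value != "gelf":
            return False
    return True


def strip_type_fields(event):
    """Return a copy of the event without the "type" and "_type" fields."""
    return {k: v for k, v in event.items() if k not in ("type", "_type")}
```

Calling strip_type_fields on an event before sending it is one way to guarantee that gelf_input_keeps would accept it.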


Documents