Event Platform/EventGate

EventGate is an HTTP service for ingestion of events, written in Node.js. It takes JSON messages over HTTP POST requests, optionally validates them against a JSONSchema, and then produces them to a backend. The default backend (and the one used at WMF) is Kafka.

For more information about the EventGate codebase, see the README at github.com/wikimedia/eventgate. EventGate is meant to be generic and not WMF specific. It can be used standalone as a library, or it can be used with the built-in support for running Express (the Node.js HTTP server provided via mediawiki-service-template). WMF operates EventGate using the Express Node.js HTTP server.

This page documents the WMF deployments of EventGate.

Service

Main article: Event Platform/EventGate/Administration

Implementation

EventGate code is primarily hosted on GitHub. There is plenty of tooling for using Kafka with Avro, but not much for Kafka with JSONSchemas. By hosting on GitHub we hope to gain more visibility and participation from non-WMF developers.

The eventgate-wikimedia repository contains the WMF deployment of EventGate. It uses the eventgate package as an npm dependency, and has additional utilities, configuration and deployment pipeline code for WMF's instances of EventGate.

To modify the behavior of the eventgate library or service itself, work on EventGate directly at https://github.com/wikimedia/eventgate.

Operation

There are multiple clusters of EventGate services in Wikimedia production.

EventGate is hosted in production as a containerized service running on Kubernetes. As such, there are no Puppet roles or persistent backend hostnames.

As of Feb 2020, there are the following clusters (per deployment-charts):

  • eventgate-main
  • eventgate-analytics
  • eventgate-analytics-external
  • eventgate-logging-external

More details on how each cluster is used are given further below.

Wikimedia EventGate configuration

Wikimedia's EventGate wrapper implements custom validate and produce behaviours for our use in WMF production. This includes configuration to look up JSON Schemas from either the local filesystem (for eventgate-main) or a remote schema registry URL (for eventgate-analytics).

EventGate expects that specific implementations know how to map from an individual event to its JSONSchema. We use the $schema field in each of our JSON events to do this. This field contains a relative and versioned URI to the event's JSONSchema. EventGate fetches and caches this schema and uses it to validate each event with the same $schema.

Producer types: Guaranteed and Hasty

Wikimedia's EventGate configuration offers two different Kafka producer connections, named guaranteed and hasty.

The guaranteed producer is intended for batch processing, and will block the HTTP response until the event has been validated and sent to the Kafka brokers with either an ACK response or a known failure to ACK. Note that "guaranteed" does not mean that the event is guaranteed to be persisted in Kafka (there is no indefinite retry or other buffer). Rather, it means that the HTTP response status can be trusted: a 2xx status code guarantees that the event has been persisted.

The hasty producer is optimised for high throughput, and will not block the HTTP response. Instead, it returns a 202 status response as soon as EventGate has received the JSON message from the HTTP request body. The event is validated and produced to Kafka afterward. The HTTP client that submitted the event will not know whether the event was valid or whether it was successfully persisted in Kafka. However, if the event fails validation or fails to be produced to Kafka, an error is logged to Logstash.

The guaranteed producer type is the default for the /v1/events endpoint. To POST an event in hasty mode, set hasty=true in the request query parameters.
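As a rough illustration of the two modes, the following Python sketch builds (without sending) a POST request to /v1/events, optionally adding hasty=true. The base URL, port, and function name are hypothetical; only the endpoint path and the hasty query parameter come from the text above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; substitute the address of a real EventGate instance.
EVENTGATE_BASE_URL = "http://localhost:8192"


def build_event_request(event, hasty=False, base_url=EVENTGATE_BASE_URL):
    """Build (but do not send) a POST request to the /v1/events endpoint."""
    url = base_url + "/v1/events"
    if hasty:
        # hasty=true asks EventGate to respond before validating and producing.
        url += "?" + urlencode({"hasty": "true"})
    body = json.dumps(event).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request could then be sent with urllib.request.urlopen. In hasty mode, a 202 response only confirms that the message was received, so a client that needs to know whether the event was persisted should leave hasty unset.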

EventGate clusters

At WMF, EventGate is deployed as multiple separate clusters, each with its own defined purpose.

All EventGate clusters are open to receive events from any internal production server. MediaWiki produces to EventGate using the EventBus extension. (Apologies if this is confusing! See Event* for a disambiguation page for related terms.)

eventgate-main

  • Visibility: internal, submissions restricted.
  • Schemas: bundled, from local filesystem.

The eventgate-main cluster produces events to the Kafka "main" clusters in both Eqiad and Codfw. It is used for low(ish) volume but high-priority events. These events are necessary for functioning of Wikimedia core services, like the MediaWiki Job Queue and change-propagation.

Events are submitted here by:

eventgate-analytics

  • Visibility: internal, submissions restricted.
  • Schemas: bundled, from local filesystem.
  • Stream config: static, from helmfile values.

The eventgate-analytics cluster produces events to the Kafka "jumbo" cluster, and is intended for high-volume but low-priority events. Events produced to eventgate-analytics should not be required for functional production services. The Kafka jumbo cluster only exists in Eqiad, and has neither cross-dc replication nor a cross-dc failover mode. Events originating from Codfw are produced directly to Kafka "jumbo-eqiad".

Events are submitted here by:

eventgate-analytics-external

The eventgate-analytics-external cluster produces events to the Kafka jumbo cluster. This replaces EventLogging Analytics, and can receive and validate events from external clients (like the EventLogging service before it).

Events are submitted here by:

eventgate-logging-external

The eventgate-logging-external cluster produces events to the Kafka "logging" cluster. It accepts mediawiki/client/error events from external clients.

Events are submitted here by:

Event Stream Config

EventGate in production will request stream configuration from the EventStreamConfig MediaWiki API. Each service cluster restricts the stream configuration it uses via the destination_event_service setting; only streams that have destination_event_service matching the EventGate service cluster name (e.g. eventgate-main) will be used by that EventGate service cluster.
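The filtering rule above can be sketched in a few lines of Python. The shape of the stream configuration (a mapping from stream name to settings) and the stream names are assumptions for illustration; only the destination_event_service setting and the cluster names come from this page.

```python
def streams_for_service(stream_configs, service_name):
    """Keep only streams whose destination_event_service matches service_name."""
    return {
        stream: settings
        for stream, settings in stream_configs.items()
        if settings.get("destination_event_service") == service_name
    }


# Hypothetical stream configuration, keyed by stream name.
example_stream_configs = {
    "example.stream_a": {"destination_event_service": "eventgate-main"},
    "example.stream_b": {"destination_event_service": "eventgate-analytics"},
    "example.stream_c": {},  # no destination set, so it matches no cluster here
}

main_streams = streams_for_service(example_stream_configs, "eventgate-main")
```

In this sketch, main_streams contains only example.stream_a, mirroring how eventgate-main would ignore streams destined for other clusters.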

Validation Errors

All EventGate clusters at WMF are configured to emit validation error events. These errors are routed via Kafka and then ingested both into Hive (in the various event.eventgate_*_error_validation tables) and into Logstash, where they are viewable in Kibana.

Kibana dashboard: eventgate-validation

Local development

eventgate-wikimedia-dev.js

The eventgate-wikimedia codebase comes with an EventGate development implementation in eventgate-wikimedia-dev.js. To use it, npm install and then run ./eventgate-wikimedia-dev.js. Its default EventGate config is in ./config.dev.yaml.

MediaWiki Vagrant

Installation of an EventGate service in MediaWiki-Vagrant is included via role::eventbus. To install it, first edit puppet/hieradata/common.yaml and add:

npm::node_version: 10

EventGate requires Node.js >= 10, but other Node.js MediaWiki services are stuck on Node.js 6 (for now). Installing Node.js 10 is therefore required, but it may cause other Node.js services to break.

Once done, you can enable the role and provision vagrant:

$ vagrant roles enable eventbus
$ vagrant provision

This will configure MediaWiki with the EventBus extension producing events to EventGate. The eventgate-wikimedia code repository will be cloned, and the service will be launched from it using service-runner and a Puppet-installed config.vagrant.yaml. If you want to run the service manually (instead of letting systemd do it), you may stop it and run it with:

 nodejs /vagrant/srv/eventgate-wikimedia/node_modules/eventgate/server.js -c /vagrant/srv/eventgate-wikimedia/config.vagrant.yaml

You might want to do this if you are developing EventGate code and want to run the process in the foreground.

External links