
Event*: Difference between revisions

imported>Ottomata
 
imported>Quiddity
(add navbox)
 
(13 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{Navigation Event Platform}}
EventBlabla what?!  So Many Event* services.  This page disambiguates, and links to their more extensive documentation.
EventBlabla what?!  So Many Event* services.  This page disambiguates, and links to their more extensive documentation.


In all of the following, an 'event' is defined as a  structured and schema-ed data object.  The data object is usually serialized as JSON with the schema specified in and validated with JSONSchema.  WMF does use some [https://github.com/wikimedia/mediawiki-event-schemas/tree/master/avro/mediawiki Avro based events], but none of these pass through any of the Event* services described below.
In all of the following, an 'event' is defined as a  structured and schema-ed data object.  The data object is usually serialized as JSON with the schema specified in and validated with JSONSchema.   


== [[Analytics/EventLogging|EventLogging]] ==
== [[Event Platform]] ==
 
EventLogging is the original event framework used by WMF.  It consists of 3 main parts.
 
=== EventLogging Services ===
The [https://github.com/wikimedia/eventlogging EventLogging python codebase] consists of various services that can be mixed and matched to consume, validate, transform, and produce streams of JSON events between different endpoints.  EventLogging was originally created for Mediawiki analytics purposes, but is now also used to support production streams of events via an HTTP eventlogging-service used for the EventBus service.
 
=== EventLogging Schemas ===


For the original analytics EventLogging use case, all schemas are maintained and stored on [https://meta.wikimedia.org/w/index.php?title=Special%3AAllPages&from=&to=&namespace=470 meta wiki].  The default EventLogging validation logic
The [https://office.wikimedia.org/wiki/Tech_program_proposals/TP2_-_Modern_Event_Platform Modern Event Platform Program] is about unifying our analytics and production event systems into a generic event stream intake, validation, processing and consuming system.
looks for matching schemas in this online http/wiki repository.


=== EventLogging Extension ===
== [[Metrics Platform]] ==
The [https://www.mediawiki.org/wiki/Extension:EventLogging EventLogging Mediawiki extension] is used by Mediawiki to send JSON data to EventLogging by sending a <tt>GET</tt> request with query parameter encoded JSON data to a special Varnish endpoint.  Requests to this Varnish endpoint are logged (via varnishkafka), and make there way into downstream EventLogging services.
Metrics Platform is a suite of services, standard libraries, and APIs for producing and consuming instrumentation data of all kinds from Wikimedia Foundation products.  It mainly consists of standard product metrics schemas and client library implementations. It is built on top of Event Platform components.


== [[EventBus]] ==
== [[EventBus]] ==
Like EventLogging, 'EventBus' refers to a few different pieces of infrastructure.  More detail can be found on the [[EventBus]] documentation page.
Like EventLogging, 'EventBus' refers to a few different pieces of infrastructure.  More detail can be found on the [[EventBus]] documentation page.
=== [https://www.mediawiki.org/wiki/Extension:EventBus EventBus Mediawiki extension] ===
This is the Mediawiki extension responsible for sending events from WMF Mediawiki installations to the backend event intake service (EventGate).


=== [[EventBus#RESTful_Service_.28Production.29|eventlogging-service-eventbus]] ===
=== [[EventBus#RESTful_Service_.28Production.29|eventlogging-service-eventbus]] ===
{{Note|As of 2019-09 eventlogging-service-eventbus has been decommissioned in favor of EventGate}}


This is a particular deployment of <tt>eventlogging-service</tt> that is used to accept internally produced events over HTTP.  It is configured to use JSONSchemas from the [https://github.com/wikimedia/mediawiki-event-schemas/tree/master/jsonschema mediawiki/event-schemas] repository to validate incoming events.
This is a particular deployment of <tt>eventlogging-service</tt> that is used to accept internally produced events over HTTP.  It is configured to use JSONSchemas from the [https://github.com/wikimedia/mediawiki-event-schemas/tree/master/jsonschema mediawiki/event-schemas] repository to validate incoming events.


=== [[EventBus#Event_Schemas|Event Schemas]] [https://github.com/wikimedia/mediawiki-event-schemas/tree/master/jsonschema Repository] ===
== Event Schema Repositories and HTTP Service ==
Our event schemas are available via an HTTP API at https://schema.wikimedia.org/. This site simply hosts up-to-date schemas stored in git repositories.
 
WMF has 2 event schema git repositories.  [https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/primary/+/master schemas/event/primary] is for production/tier 1 events.  Schemas here are more tightly controlled (and bikeshed).  [https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/secondary/+/master schemas/event/secondary] is for non-critical and analytics-focused events.
 
Previously, schemas also lived in the mediawiki/event-schemas repository, but this has been deprecated in favor of schemas/event/primary.
 
You can learn more about WMF's Event Schemas at [[Event_Platform/Schemas]].
 
== [[Event Platform/EventGate|EventGate]] ([https://github.com/wikimedia/EventGate repository]) ==
EventGate is the event service replacement for eventlogging-service-eventbus, a component of the Modern Event Platform program.  This service has the same API as eventlogging-service-eventbus, with a few updated expectations for the event schemas it can process.  Over 2019-2020, we will be replacing all eventlogging intake services with EventGate deployments.


This repository is the main source of truth for production event schemas.  All new production events should be schemaed, and should exist in this repository.  <tt>eventlogging-service-eventbus</tt> validates incoming events against schemas in this repository.
== EventStreamConfig ==
While planning the Modern Event Platform program, product managers and engineers expressed a desire to be able to more dynamically configure the behavior of clients producing (and potentially consuming) event streams.  EventGate also needs some dynamic configuration in order to enforce that specific event streams only include events of the same JSONSchema lineage.  The EventBus extension (as well as other event producing clients) needs a way to configure the destination event service (EventGate) instance where it should produce an event.


== EventStreams ==
The [[mw:Extension:EventStreamConfig|EventStreamConfig]] MediaWiki extension was built to support these use cases.  It uses MediaWiki's configuration to centralize the event stream configuration for all of these clients.  It exposes both a PHP MediaWiki interface and an HTTP API to look up configs for event streams by name.  The EventLogging extension uses the PHP API to load stream configuration to browser clients via ResourceLoader.
 
== EventStreams ([https://github.com/wikimedia/mediawiki-services-eventstreams repository]) ==
 
[[EventStreams]] is a public-facing service.  It exposes streams of events via HTTP [https://en.wikipedia.org/wiki/Server-sent_events SSE/EventSource].  This service replaced the Mediawiki RecentChange-specific service [[Obsolete:RCStream|RCStream]], and perhaps eventually will also deprecate [[mw:API:Recent_changes_stream#How_it_works_on_Wikimedia_wikis|irc.wikimedia.org]].
 
 
== [[Analytics/EventLogging|EventLogging]] ==
 
EventLogging is the original event framework used by WMF.  It consists of 3 main parts.
 
=== EventLogging Service ===
The [https://github.com/wikimedia/eventlogging EventLogging python codebase] consists of various services that can be mixed and matched to consume, validate, transform, and produce streams of JSON events between different endpoints.  EventLogging was originally created for Mediawiki analytics purposes, but was also used to support production streams of events via an HTTP eventlogging-service used for the EventBus service.
 
This service is [https://phabricator.wikimedia.org/T238230 being deprecated] in favor of EventGate.
 
=== EventLogging Schemas ===
 
For the original analytics EventLogging use case, all schemas are maintained and stored on [https://meta.wikimedia.org/w/index.php?title=Special%3AAllPages&from=&to=&namespace=470 meta wiki].  The default EventLogging validation logic
looks for matching schemas in this online http/wiki repository.
 
=== EventLogging Extension ===
The [https://www.mediawiki.org/wiki/Extension:EventLogging EventLogging Mediawiki extension] is used by Mediawiki to send JSON data to EventLogging by sending a <tt>GET</tt> request with query parameter encoded JSON data to a special Varnish endpoint.  Requests to this Varnish endpoint are logged (via varnishkafka), and make their way into downstream EventLogging services.


[https://github.com/wikimedia/mediawiki-services-eventstreams EventStreams] is a public facing service.  It exposes streams of events via HTTP [https://en.wikipedia.org/wiki/Server-sent_events SSE/EventSource].  This service replaces the Mediawiki RecentChange specific service [[RCStream]], and perhaps eventually will also deprecate [https://www.mediawiki.org/wiki/API:Recent_changes_stream#How_it_works_on_Wikimedia_wikis irc.wikimedia.org].
[[Category:Event Platform]]

Latest revision as of 04:01, 21 September 2021

EventBlabla what?! So Many Event* services. This page disambiguates, and links to their more extensive documentation.

In all of the following, an 'event' is defined as a structured and schema-ed data object. The data object is usually serialized as JSON with the schema specified in and validated with JSONSchema.
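
As a rough illustration of what "structured and schema-ed" means in practice, the sketch below checks a JSON event against a tiny subset of JSONSchema (`required` and per-property `type`) using only the Python standard library. The schema, the event fields, and the `validate_event` helper are all hypothetical; a real deployment would rely on a full JSONSchema validator rather than this hand-rolled check.

```python
import json

# Hypothetical schema for illustration; not a real WMF schema.
EXAMPLE_SCHEMA = {
    "type": "object",
    "required": ["page_title", "rev_id"],
    "properties": {
        "page_title": {"type": "string"},
        "rev_id": {"type": "integer"},
    },
}

# Maps JSONSchema primitive type names to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_event(event, schema):
    """Return a list of error strings; an empty list means the event passed.

    Only 'required' and per-property 'type' are checked, which is a small
    subset of what JSONSchema can express.
    """
    if not isinstance(event, dict):
        return ["event is not a JSON object"]
    errors = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in event:
            continue
        expected = _JSON_TYPES[spec["type"]]
        value = event[name]
        # bool is a subclass of int in Python, so reject it for other types.
        is_bool_mismatch = isinstance(value, bool) and spec["type"] != "boolean"
        if is_bool_mismatch or not isinstance(value, expected):
            errors.append(f"field {name} is not of type {spec['type']}")
    return errors


# Events arrive serialized as JSON, so parse before validating.
good = json.loads('{"page_title": "Main_Page", "rev_id": 123}')
bad = json.loads('{"page_title": "Main_Page", "rev_id": "123"}')
print(validate_event(good, EXAMPLE_SCHEMA))
print(validate_event(bad, EXAMPLE_SCHEMA))
```

The first event passes with no errors, while the second is rejected because `rev_id` is a string rather than an integer.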

Event Platform

The Modern Event Platform Program is about unifying our analytics and production event systems into a generic event stream intake, validation, processing and consuming system.

Metrics Platform

Metrics Platform is a suite of services, standard libraries, and APIs for producing and consuming instrumentation data of all kinds from Wikimedia Foundation products. It mainly consists of standard product metrics schemas and client library implementations. It is built on top of Event Platform components.

EventBus

Like EventLogging, 'EventBus' refers to a few different pieces of infrastructure. More detail can be found on the EventBus documentation page.

EventBus Mediawiki extension

This is the Mediawiki extension responsible for sending events from WMF Mediawiki installations to the backend event intake service (EventGate).

eventlogging-service-eventbus

As of 2019-09, eventlogging-service-eventbus has been decommissioned in favor of EventGate. It was a particular deployment of eventlogging-service that accepted internally produced events over HTTP. It was configured to use JSONSchemas from the mediawiki/event-schemas repository to validate incoming events.

Event Schema Repositories and HTTP Service

Our event schemas are available via an HTTP API at https://schema.wikimedia.org/. This site simply hosts up-to-date schemas stored in git repositories.

WMF has 2 event schema git repositories. schemas/event/primary is for production/tier 1 events. Schemas here are more tightly controlled (and bikeshed). schemas/event/secondary is for non-critical and analytics-focused events.

Previously, schemas also lived in the mediawiki/event-schemas repository, but this has been deprecated in favor of schemas/event/primary.

You can learn more about WMF's Event Schemas at Event_Platform/Schemas.

EventGate (repository)

EventGate is the event service replacement for eventlogging-service-eventbus, a component of the Modern Event Platform program. This service has the same API as eventlogging-service-eventbus, with a few updated expectations for the event schemas it can process. Over 2019-2020, we will be replacing all eventlogging intake services with EventGate deployments.

EventStreamConfig

While planning the Modern Event Platform program, product managers and engineers expressed a desire to be able to more dynamically configure the behavior of clients producing (and potentially consuming) event streams. EventGate also needs some dynamic configuration in order to enforce that specific event streams only include events of the same JSONSchema lineage. The EventBus extension (as well as other event producing clients) needs a way to configure the destination event service (EventGate) instance where it should produce an event.

The EventStreamConfig MediaWiki extension was built to support these use cases. It uses MediaWiki's configuration to centralize the event stream configuration for all of these clients. It exposes both a PHP MediaWiki interface and an HTTP API to look up configs for event streams by name. The EventLogging extension uses the PHP API to load stream configuration to browser clients via ResourceLoader.
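
The lookup-by-name idea, together with the "same JSONSchema lineage" check described above, can be sketched roughly as follows. The stream name, the `schema_title` and `destination_event_service` keys, the `$schema` and `meta.stream` event fields, and all three helper functions are assumptions made for this illustration; they are not a confirmed description of the extension's configuration format or of EventGate's logic.

```python
# Hypothetical stream configuration, keyed by stream name.
STREAM_CONFIGS = {
    "example.page_view": {
        "schema_title": "example/page_view",
        "destination_event_service": "eventgate-example",
    },
}


def get_stream_config(stream_name):
    """Look up the config for one stream by name, or None if unconfigured."""
    return STREAM_CONFIGS.get(stream_name)


def schema_title_of(schema_uri):
    """Strip the trailing version from a URI like '/example/page_view/1.0.0'."""
    return schema_uri.strip("/").rsplit("/", 1)[0]


def event_allowed_in_stream(event):
    """Accept an event only if its schema title matches its stream's config."""
    config = get_stream_config(event["meta"]["stream"])
    if config is None:
        return False
    return schema_title_of(event["$schema"]) == config["schema_title"]


event = {
    "$schema": "/example/page_view/1.1.0",
    "meta": {"stream": "example.page_view"},
}
print(event_allowed_in_stream(event))
```

Comparing only the schema title, and not the version, is what lets a stream accept newer versions of the same schema while rejecting events from an unrelated one.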

EventStreams (repository)

EventStreams is a public-facing service. It exposes streams of events via HTTP SSE/EventSource. This service replaced the Mediawiki RecentChange-specific service RCStream, and perhaps eventually will also deprecate irc.wikimedia.org.
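
For readers unfamiliar with SSE, the sketch below shows how a client might split a server-sent event stream into individual events. It handles only the `event`, `id`, and `data` fields, so it is a simplified subset of the SSE format. The `parse_sse` helper and the sample payload are hypothetical, and no real EventStreams endpoint is contacted.

```python
import json


def parse_sse(lines):
    """Yield one dict per server-sent event from an iterable of text lines.

    Handles the 'event', 'id' and 'data' fields; a blank line ends an event.
    Comment lines (starting with ':') and other fields are ignored.
    """
    fields = {}
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the event if it carried any data.
            if data_lines:
                fields["data"] = "\n".join(data_lines)
                yield fields
            fields, data_lines = {}, []
            continue
        if line.startswith(":"):
            continue
        name, _, value = line.partition(":")
        # A single leading space after the colon is not part of the value.
        if value.startswith(" "):
            value = value[1:]
        if name == "data":
            data_lines.append(value)
        elif name in ("event", "id"):
            fields[name] = value


# Hypothetical stream content for illustration.
SAMPLE_STREAM = [
    ": keep-alive comment\n",
    "event: message\n",
    "id: 42\n",
    'data: {"title": "Main_Page"}\n',
    "\n",
]

events = list(parse_sse(SAMPLE_STREAM))
payload = json.loads(events[0]["data"])
print(events[0]["event"], payload["title"])
```

In a browser, the built-in EventSource API does this parsing automatically; the sketch only makes the wire format visible.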


EventLogging

EventLogging is the original event framework used by WMF. It consists of 3 main parts.

EventLogging Service

The EventLogging python codebase consists of various services that can be mixed and matched to consume, validate, transform, and produce streams of JSON events between different endpoints. EventLogging was originally created for Mediawiki analytics purposes, but was also used to support production streams of events via an HTTP eventlogging-service used for the EventBus service.

This service is being deprecated in favor of EventGate.

EventLogging Schemas

For the original analytics EventLogging use case, all schemas are maintained and stored on meta wiki. The default EventLogging validation logic looks for matching schemas in this online http/wiki repository.

EventLogging Extension

The EventLogging Mediawiki extension is used by Mediawiki to send JSON data to EventLogging by sending a GET request with query parameter encoded JSON data to a special Varnish endpoint. Requests to this Varnish endpoint are logged (via varnishkafka), and make their way into downstream EventLogging services.
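
The "query parameter encoded JSON" step can be illustrated with a minimal sketch. The endpoint URL, the event fields, and both helper functions are hypothetical, and it is an assumption here that the encoded JSON forms the entire query string; the actual Varnish endpoint and request layout are not specified above.

```python
import json
from urllib.parse import quote, unquote

# Hypothetical endpoint; the real Varnish endpoint is not named here.
BEACON_ENDPOINT = "https://example.org/beacon/event"


def build_beacon_url(event):
    """Serialize the event as JSON and percent-encode it as the query string."""
    payload = json.dumps(event, separators=(",", ":"))
    return BEACON_ENDPOINT + "?" + quote(payload, safe="")


def decode_beacon_url(url):
    """Recover the event from a URL built by build_beacon_url."""
    _, _, query = url.partition("?")
    return json.loads(unquote(query))


event = {"schema": "ExampleSchema", "event": {"action": "click"}}
url = build_beacon_url(event)
print(url)
print(decode_beacon_url(url) == event)
```

Because the event travels in the URL of a GET request, it ends up in the request log, which is how a log shipper such as varnishkafka can forward it downstream without the endpoint serving any real content.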