Event Platform/Producer Requirements
To ensure that Event Platform event streams can be integrated automatically with all consumers and downstream data systems, producers must ensure that events satisfy specific requirements before producing them. If you are using a supported WMF Event Platform producer service or library, these requirements should already be satisfied. However, if you are working in a language or area that does not have an Event Platform client library, then you will be producing events directly to Kafka yourself.
This page describes the requirements that any Event Platform producer should satisfy.
Requirements
Events
- All events must have an event schema in a WMF Schema Repository.
- All event streams must be declared in WMF Event Stream Config.
- All events must set required event fields.
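As an illustration, the check below looks for the three event fields that this page names explicitly ($schema, meta.stream and dt). It is a minimal sketch: the helper name missing_fields and the sample event values are hypothetical, and the event's own schema may require further fields.

```python
# Field paths named on this page; a real schema may require more.
REQUIRED_PATHS = ("$schema", "meta.stream", "dt")


def missing_fields(event, required_paths=REQUIRED_PATHS):
    """Return the dotted paths from required_paths that are absent from event."""
    missing = []
    for path in required_paths:
        node = event
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(path)
                break
            node = node[part]
    return missing


# Hypothetical event, for illustration only.
sample_event = {
    "$schema": "/test/event/1.0.0",
    "meta": {"stream": "test.event"},
    "dt": "2024-01-01T00:00:00Z",
}
sample_missing = missing_fields(sample_event)
incomplete_missing = missing_fields({"$schema": "/test/event/1.0.0", "meta": {}})
```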
Producer libraries
A producer will need to interact with and look up Event Stream Configuration and WMF Schema Repositories. The URI locations from which to look up event stream configuration and event schemas should be configurable. For example, a user should be able to give your producer library a local schema repository base path for development.
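One way to make the schema location configurable is to treat the event's $schema value as a path relative to an ordered list of base URIs. The sketch below assumes that convention; the function name candidate_schema_urls and both base URIs are hypothetical and are not part of any existing library.

```python
from urllib.parse import urlparse


def candidate_schema_urls(schema_uri, base_uris):
    """Return the URLs to try, in order, when looking up a $schema URI."""
    if urlparse(schema_uri).scheme:
        # An absolute URI is used as-is.
        return [schema_uri]
    # A relative URI is appended to each configured base URI in turn.
    return [base.rstrip("/") + "/" + schema_uri.lstrip("/") for base in base_uris]


# Hypothetical configuration: a local checkout for development first,
# then a remote repository.
base_uris = [
    "file:///srv/schemas/primary/jsonschema/",
    "https://schema.example.org/primary/jsonschema",
]
urls = candidate_schema_urls("/test/event/1.0.0", base_uris)
```

Here urls lists the local file URL first and the remote URL second, so a developer can override a schema locally without changing the producer code.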
All producer clients and libraries must be able to:
- Look up the event's schema via its $schema URI field in the configured base schema repository URIs.
- Look for any event stream configuration defined for the stream the event will be produced to. The stream the event will be produced to should be in the event's meta.stream field.
- Ensure that the event has a dt field set specifying its ISO-8601 event time.
- Optionally set the meta.dt field to the ISO-8601 time at which the library received the event (its 'system ingestion time').
- Ensure the event is allowed in its destination stream name, as specified in the meta.stream field. This can be determined by checking that the stream's configured schema_title matches the event's schema's title.
- Ensure the event is valid according to the schema it declares in its $schema field.
- Use the key_fields event stream config setting to hoist event field values into a JSON message key object. (Example EventGate JavaScript code that does this).
- Add datacenter prefixes to the stream name to make the destination Kafka topic name.
- Produce to the datacenter-specific Kafka topic. If key_fields is configured for this stream, the Kafka message key should be a serialized JSON object of the key, and the Kafka producer should consistently partition by this key.
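Several of the steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the behaviour of EventGate or any other existing library. It assumes that key_fields maps each key name to a dotted path in the event, that the topic name has the form "<datacenter>.<stream>", and that the schema's title has already been looked up. The names prepare and get_path, and all sample values, are hypothetical. Validation against the schema and the Kafka produce call itself are left out.

```python
import copy
import json
from datetime import datetime, timezone


def get_path(obj, dotted_path):
    """Follow a dotted path such as 'page.page_id' through nested dicts."""
    for part in dotted_path.split("."):
        obj = obj[part]
    return obj


def prepare(event, stream_config, schema_title, datacenter, now=None):
    """Return (topic, key_bytes, value_bytes) for an event, or raise ValueError."""
    event = copy.deepcopy(event)  # leave the caller's event untouched
    now = now or datetime.now(timezone.utc)
    stream = event["meta"]["stream"]

    # This sketch rejects events without dt; a library could fill it in instead.
    if "dt" not in event:
        raise ValueError("event has no dt (event time) field")
    # Optional: record when this library received the event.
    event["meta"].setdefault("dt", now.strftime("%Y-%m-%dT%H:%M:%SZ"))

    # The event is allowed in the stream only if the schema titles match.
    if stream_config["schema_title"] != schema_title:
        raise ValueError(f"schema {schema_title!r} is not allowed in stream {stream!r}")

    key = None
    key_fields = stream_config.get("key_fields")  # assumed shape: {key name: dotted path}
    if key_fields:
        key_obj = {name: get_path(event, path) for name, path in key_fields.items()}
        # Sorted keys and fixed separators give identical bytes for identical keys.
        key = json.dumps(key_obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

    topic = f"{datacenter}.{stream}"  # assumed prefix format
    return topic, key, json.dumps(event).encode("utf-8")


# Hypothetical event, stream config and datacenter name.
event = {
    "$schema": "/test/event/1.0.0",
    "meta": {"stream": "test.event"},
    "dt": "2024-01-01T00:00:00Z",
    "page": {"page_id": 7},
}
config = {"schema_title": "test/event", "key_fields": {"page_id": "page.page_id"}}
topic, key, value = prepare(
    event, config, "test/event", "dc1", now=datetime(2024, 1, 2, tzinfo=timezone.utc)
)
```

Serializing the key deterministically matters because a hash-based Kafka partitioner works on the key bytes: the same key object must always produce the same bytes to land in the same partition.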
Supported Event Platform producer clients
- EventGate - An HTTP producer proxy. At WMF, this takes events to produce over HTTP and produces them to Kafka.
- wikimedia-event-utilities Java library - A Java library for working with Event Platform events and streams. Has APIs for preparing events for production to Kafka.
As new clients and libraries are implemented, please link to them here.