This is a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Data Platform/Data Lake/Events

From Wikitech

Data from Event Platform data streams is available in the Data Lake in the event and event_sanitized databases.

The event database stores the past 90 days of raw (unsanitized) events for all streams. The event_sanitized database stores events that have been sanitized to protect privacy, but only for streams where sanitization has been manually configured in the sanitization allowlist.
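As a rough illustration of the retention rules described above, the following Python sketch picks which database would be expected to hold a stream's data for a given day. The function name, the stream name, and the use of a plain set to stand in for the sanitization allowlist are all hypothetical, and the exact boundary of the 90-day window is an assumption. It is not part of any Wikimedia tooling.

```python
from datetime import date, timedelta

# Retention window for raw events, as stated in the text above.
RAW_RETENTION_DAYS = 90


def choose_database(stream, day, today, sanitized_streams):
    """Return the database expected to hold `stream` data for `day`.

    `sanitized_streams` is a hypothetical stand-in for the sanitization
    allowlist: a set of stream names with sanitization configured.
    Returns None when no database is expected to hold the data.
    """
    if day > today:
        return None  # no data exists for future dates
    # Assumption: a day exactly 90 days old is still inside the raw window.
    if today - day <= timedelta(days=RAW_RETENTION_DAYS):
        return "event"
    # Older data survives only for streams that have been sanitized.
    if stream in sanitized_streams:
        return "event_sanitized"
    return None


if __name__ == "__main__":
    allowlist = {"example_stream"}  # hypothetical stream name
    today = date(2024, 6, 30)
    print(choose_database("example_stream", date(2024, 6, 1), today, allowlist))
    print(choose_database("example_stream", date(2023, 6, 1), today, allowlist))
    print(choose_database("other_stream", date(2023, 6, 1), today, allowlist))
```

Recent data is looked up in event first because it is available for every stream. Older data can only be found in event_sanitized, and only if the stream was allowlisted.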

Schemas for these data streams are stored in one of the two schema repositories:

The old EventLogging system used schemas stored on Meta-Wiki. Any schemas still in use were migrated to the secondary schema repository, but the pages on Meta-Wiki may be useful for historical information.