
Metrics Platform

From Wikitech-static
Revision as of 21:59, 21 September 2021 by imported>Kate Zimmerman (added information from https://www.mediawiki.org/wiki/Wikimedia_Product/Better_use_of_data/Event_Platform_Clients #raddocs)

The Metrics Platform is a suite of services, standard libraries, and APIs for producing and consuming data of all kinds from Wikimedia Foundation products. For data producers, it aims to simplify the task of designing, implementing, and maintaining data-producing code (also called instrumentation), while offering better guarantees of quality, rigor, and safety. For data consumers, it aims to streamline the process of defining a new dataset and offer a rich set of tools to answer questions and generate insights with data.

The Metrics Platform:

  • specifies how clients work with the Event Platform
  • provides standardized algorithms, behaviors, and basic necessities for web and app tracking, including:
    • standardized session ID generation, consistent across MediaWiki, Android, and iOS
    • attaching the necessary metadata to logged events, such as a client-side timestamp recording when the event was generated
    • determining which events are in-sample or out-of-sample based on a specific identifier (pageview, session, device)
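
The three behaviors listed above can be illustrated with a minimal Python sketch. It is not the Metrics Platform's actual algorithm: the function names, the 80-bit hex session ID format, the SHA-256-based sampling, and the `client_dt` and `session_id` field names are all assumptions chosen for illustration.

```python
import hashlib
import secrets
from datetime import datetime, timezone


def generate_session_id() -> str:
    """Return a random 80-bit identifier as 20 hex characters (hypothetical format)."""
    return secrets.token_hex(10)


def in_sample(identifier: str, rate: float) -> bool:
    """Deterministically decide whether an identifier is in-sample.

    The identifier (pageview, session, or device ID) is hashed and the
    first 32 bits of the digest are mapped to [0, 1). The same identifier
    therefore always produces the same decision for a given rate.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:4], "big") / 2**32
    return fraction < rate


def add_metadata(event: dict, session_id: str) -> dict:
    """Return a copy of the event with a client-side timestamp and session ID.

    The field names 'client_dt' and 'session_id' are hypothetical.
    """
    enriched = dict(event)
    enriched["client_dt"] = datetime.now(timezone.utc).isoformat()
    enriched["session_id"] = session_id
    return enriched


if __name__ == "__main__":
    session_id = generate_session_id()
    # Log the event only if this session falls inside a 10% sample.
    if in_sample(session_id, rate=0.1):
        print(add_metadata({"action": "click"}, session_id))
```

Hashing the identifier, rather than drawing a fresh random number per event, keeps all events that share a pageview, session, or device either entirely in-sample or entirely out-of-sample.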

Topics Directory

Development Status

Current product platform support
Version Release Date MediaWiki JS MediaWiki PHP Wikipedia Android Wikipedia iOS Wikipedia KaiOS
1.0 (in development) 05-30-2021
0.2 12-30-2020 x x x x
0.1 06-30-2020 x x

History and Rationale

Previously, different teams implemented their own EventLogging-based analytics solutions in isolation from each other. The Metrics Platform is an effort to unify that previous work and to establish consistency across platforms. That uniformity and consistency make it possible to combine data from multiple platforms to yield insights into how our users use our whole ecosystem of products in unison.

It also makes analysts more portable, enabling them to support teams other than their primary ones. The legacy system, in which every instrumentation has its own quirks and naming is inconsistent, places a heavy burden on each analyst to learn and remember the specifics of their assigned teams' data. If another analyst had to step in as backup, they too would need to learn those specifics.