You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

User:Phuedx/Metrics Platform: Difference between revisions


Revision as of 15:27, 6 October 2022

The Metrics Platform is <section begin="excerpt" />a suite of services, standard libraries, and APIs for producing and consuming data of all kinds from Wikimedia Foundation products.<section end="excerpt" /> For data producers, it aims to simplify the task of designing, implementing, and maintaining data-producing code (also called instrumentation), while offering better guarantees of quality, rigor, and safety. For data consumers, it aims to streamline the process of defining a new dataset and offer a rich set of tools to answer questions and generate insights with data.

The Metrics Platform:

  • specifies how clients work with the Event Platform
  • provides standardized algorithms, behaviors, and basic necessities for web and app tracking, including:
    • standardized session ID generation, consistent across MediaWiki, Android, and iOS
    • standardized session expiry
    • enriching events with contextual data
    • determining which events are in-sample or out-of-sample based on a specific identifier (currently: pageview, session, or device ID)
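The sampling idea in the last bullet can be illustrated with a minimal sketch. It is not the Metrics Platform's actual algorithm: the function names `fnv1a32` and `isInSample`, the choice of an FNV-1a-style hash, and the `rate` parameter are all assumptions made for illustration. The point it shows is that hashing a stable identifier (pageview, session, or device ID) gives the same in-sample or out-of-sample answer every time for that identifier.

```javascript
// Hypothetical sketch only: NOT the Metrics Platform's real sampling code.

// FNV-1a-style 32-bit hash computed over the string's UTF-16 code units.
function fnv1a32( str ) {
    let hash = 0x811c9dc5; // FNV offset basis
    for ( let i = 0; i < str.length; i++ ) {
        hash ^= str.charCodeAt( i );
        // Multiply by the FNV prime and keep the result as an unsigned 32-bit integer.
        hash = Math.imul( hash, 0x01000193 ) >>> 0;
    }
    return hash;
}

// Returns true if the identifier falls inside the sample for the given
// rate, where rate is a fraction between 0 (nobody) and 1 (everybody).
function isInSample( identifier, rate ) {
    if ( rate <= 0 ) {
        return false;
    }
    if ( rate >= 1 ) {
        return true;
    }
    // Map the hash onto [0, 1) and compare it with the rate.
    return fnv1a32( identifier ) / 0x100000000 < rate;
}
```

Because the decision depends only on the identifier and the rate, sampling by session ID keeps a whole session either in or out of the sample, and sampling by device ID does the same for a device.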

Getting Started

Currently, the JS and PHP Metrics Platform Clients only work within the EventLogging extension. Before we get started defining streams and creating instruments, you will need to have an up-to-date MediaWiki development environment. If you do not have one, then please consider using MediaWiki-Docker.

Setup and Hello, World!

See this Event Platform configuration recipe for MediaWiki-Docker for detailed instructions about setting up an Event Platform (ish) environment alongside your MediaWiki development environment. Afterwards, add this to LocalSettings.php:

$wgEventStreams = [
    'test.all' => [
        'schema_title' => '/analytics/mediawiki/client/metrics_event',
        'producers' => [
            'metrics_platform_client' => [
                'events' => [ '' ], // Matches all event names
            ],
        ],
    ],
];
$wgEventLoggingStreamNames = [ 'test.all' ];
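The comment on the `events` entry says that the empty string matches all event names, which suggests that entries are compared against the start of the event name. Assuming that prefix-style matching (an assumption, not confirmed here), the following sketch shows how an event name could be resolved to stream names. The function `matchingStreams` and the JavaScript object `exampleStreamConfigs` are hypothetical; the object only mirrors the PHP configuration above.

```javascript
// Hypothetical sketch: assumes entries in 'events' are prefixes of event names.
// The object below mirrors the $wgEventStreams example in JavaScript form.
const exampleStreamConfigs = {
    'test.all': {
        schema_title: '/analytics/mediawiki/client/metrics_event',
        producers: {
            metrics_platform_client: {
                events: [ '' ] // The empty prefix matches every event name
            }
        }
    }
};

// Returns the names of all streams whose 'events' list contains a prefix
// of the given event name.
function matchingStreams( streamConfigs, eventName ) {
    return Object.keys( streamConfigs ).filter( ( streamName ) => {
        const producers = streamConfigs[ streamName ].producers || {};
        const client = producers.metrics_platform_client || {};
        const prefixes = client.events || [];
        return prefixes.some( ( prefix ) => eventName.startsWith( prefix ) );
    } );
}
```

Under that assumption, replacing `''` with a narrower entry such as `'test.'` would restrict the stream to events whose names begin with `test.`.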

Navigate to http://localhost:8080/wiki/Main_Page and run this in the console:

mw.eventLog.dispatch( 'test.hello_world', {
    hello: 'World!'
} );
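If `mw.eventLog` is undefined when you run the snippet, the client code may not have been loaded on the page yet. A minimal sketch of a guard, assuming the client is delivered by the `ext.eventLogging` ResourceLoader module; the helper name `dispatchWhenReady` and its `mwLike` parameter are hypothetical and exist only so the function can be exercised outside a browser:

```javascript
// Hypothetical helper: waits for the EventLogging module before dispatching.
// mwLike is the global mw object in a browser, or a stub in tests.
function dispatchWhenReady( mwLike, eventName, customData ) {
    // mw.loader.using() returns a promise that resolves once the module is loaded.
    return mwLike.loader.using( 'ext.eventLogging' ).then( () => {
        mwLike.eventLog.dispatch( eventName, customData );
    } );
}

// Usage in the browser console:
// dispatchWhenReady( mw, 'test.hello_world', { hello: 'World!' } );
```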

TODO

  1. Instrumentation end to end
  2. JS, PHP, Java, Swift
  3. How to guides
    1. Hacking on the Metrics Platform
    2. Create an integration
  4. Reference guides – Client and Client/Implementations pages?
  5. FAQ
    1. How does Metrics Platform relate to the Event Platform?
    2. Why a monoschema?

History and Rationale

Previously, different teams implemented their own analytics solutions in isolation from one another. Those solutions were typically based on the Legacy EventLogging pipeline and, more recently, the Event Platform. The Metrics Platform is an effort to unify that previous work and to establish consistency across platforms. That uniformity and consistency make it possible to combine data from multiple platforms and learn how our users use our whole ecosystem of products together.

It also makes analysts more portable: they can support teams other than their primary ones. In the legacy system, every instrument has its own quirks and naming is inconsistent, which places a heavy burden on each analyst to learn and remember the specifics of their assigned teams' data. If another analyst has to step in as back-up, they too need to learn those specifics.

Topics Directory