Analytics/Data Lake
Revision as of 14:49, 7 April 2017 by imported>Joal (Add general Data Lake information.)
The Analytics Data Lake (ADL) is a large, analytics-oriented repository of data, both raw and aggregated, about Wikimedia projects (in industry terms, a data lake).
It contains:
- Traffic data -- webrequest, pageviews, unique devices ...
- Edits data -- historical data about revisions, pages, and users [in beta as of 2017-04-07].
As the Data Lake matures, we will add any and all data to it, and we will try to make as much of that data public as can safely be released.
For technical aspects of the Data Lake pipelines, see Analytics/Systems/Data Lake.