
Analytics/EventLogging/Publishing: Difference between revisions

imported>Halfak
(Removed reports. No consensus for that.)
 
imported>Milimetric
(Milimetric moved page Analytics/EventLogging/Publishing to Analytics/Systems/EventLogging/Publishing: Reorganizing documentation)
 
{{draft}}
#REDIRECT [[Analytics/Systems/EventLogging/Publishing]]
WMF's EventLogging database is private, because it may hold sensitive information during a certain time window. To access it, one must be an employee of the Wikimedia Foundation or have signed an NDA. Hence, any data set based on EventLogging data is potentially harmful and must be reviewed before it can be published.
 
== Publishing data sets ==
 
We consider a data set to be a collection of (whole or partial) records extracted from the database for the purpose of enabling future analyses.
 
The preferred option is NOT to release any such data sets publicly. To request an exception, please contact both the Legal team AND the Community Advocacy team so that they can review your data set and ensure that it contains no sensitive data. If you have other questions, please ask the Analytics team or the Research team.
 
=== Which data will be vetted? ===
 
* Personally identifiable information (PII), such as clientIp, userAgent, userName, userId, editCount, and in general, any piece of information that can uniquely identify a physical or virtual person.
* User-entered text fields, such as pageTitle, imageTitle, summary, userName, userText, etc. Schemas containing this kind of data are marked as such on the schema talk page.
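
As a rough aid before requesting a review, the two categories above can be sketched as a field-name check. This is only an illustration: the field names are the examples listed above, while the set names, the function name, and the sample record are hypothetical. Because both categories are open-ended ("in general", "etc."), a check like this can flag obvious fields but cannot replace the review itself.

```python
# Field names copied from the two categories listed above. Grouping them into
# Python sets is an illustrative assumption, not an official or complete list.
PII_FIELDS = {"clientIp", "userAgent", "userName", "userId", "editCount"}
USER_TEXT_FIELDS = {"pageTitle", "imageTitle", "summary", "userName", "userText"}


def fields_needing_review(record):
    """Return the sorted names of fields in `record` that match either category."""
    sensitive = PII_FIELDS | USER_TEXT_FIELDS
    return sorted(name for name in record if name in sensitive)


# Hypothetical record, for illustration only.
example = {"userId": 42, "pageTitle": "Main Page", "action": "view"}
flagged = fields_needing_review(example)
```

A record with no matching field names yields an empty list, which does not mean the record is safe to publish; it only means none of the example names appeared.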

Latest revision as of 14:13, 7 April 2017