This page is currently a draft.
More information and discussion about changes to this draft can be found on the talk page.
WMF's EventLogging database is private because it may hold sensitive information for a certain period of time. To access it, one must be an employee of the Wikimedia Foundation or have signed an NDA. Consequently, any data set based on EventLogging data is potentially harmful and must be reviewed before it can be published.
Publishing data sets
We consider a data set to be a collection of (whole or partial) records extracted from the database for the purpose of enabling future analyses.
The preferred option is NOT to release any such data sets publicly. To request an exception, please contact both the Legal team and the Community Advocacy team so that they can review your data set and ensure that it contains no sensitive data. If you have other questions, please ask the Analytics team or the Research team.
Which data will be vetted?
- PII (Personally identifiable information), like clientIp, userAgent, userName, userId, editCount, and in general, any piece of information that can uniquely identify a physical or virtual person.
- User-input text fields, like pageTitle, imageTitle, summary, userName, userText, etc. Schemas containing this kind of data are marked as such on the schema talk page.
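
Before requesting a review, it may help to check which fields in a candidate data set fall into the two categories above. The following is a minimal sketch in Python, not an official tool. The field names come from the lists above, which are examples rather than a complete inventory. The names `PII_FIELDS`, `USER_INPUT_FIELDS` and `flag_fields`, and the `timestamp` field in the usage example, are hypothetical and introduced only for illustration.

```python
# Field names copied from the two lists above. Both sets are illustrative,
# not exhaustive: a schema may contain other sensitive fields.
PII_FIELDS = {"clientIp", "userAgent", "userName", "userId", "editCount"}
USER_INPUT_FIELDS = {"pageTitle", "imageTitle", "summary", "userName", "userText"}


def flag_fields(field_names):
    """Map each listed field name to the reasons it would need vetting.

    Matching is exact and case-sensitive, so a renamed or derived field
    (for example a hashed user name) would not be caught.
    """
    flagged = {}
    for name in field_names:
        reasons = []
        if name in PII_FIELDS:
            reasons.append("PII")
        if name in USER_INPUT_FIELDS:
            reasons.append("user input")
        if reasons:
            flagged[name] = reasons
    return flagged


if __name__ == "__main__":
    # Hypothetical column list of a data set proposed for publication.
    columns = ["userId", "summary", "timestamp", "userName"]
    for field, reasons in flag_fields(columns).items():
        print(field, "->", ", ".join(reasons))
```

A check like this only catches fields whose names match the lists exactly. It does not replace the review by the Legal and Community Advocacy teams, since fields that are not listed here can still identify a person.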