Analytics/Web publication: Difference between revisions
imported>Nuria No edit summary |
Revision as of 15:48, 4 February 2020
This page describes how to make safe, non-identifying datasets, notebooks, or other research products public on the web in the analytics.wikimedia.org/published directory. For guidelines on how to formally release an open dataset (with metadata and persistent identifiers), please refer to Data releases. For regular, structured, and maintained datasets, please see Analytics#Datasets.
If you're looking for data here, some of it may not be maintained or documented. If possible, please reach out to the authors of the data for help, or to Analytics/Team. If you're publishing data here, there are some guidelines in the README on the server:
- Please name your folders in a friendly way, think of strangers browsing through this data
- Take a look at https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater for ongoing reports
- Always Remember: be careful what you share here
To share data via this server, copy safe, non-identifying data to /srv/published/ on any of the Analytics clients. For example, reportupdater jobs copy their output to /srv/published/datasets/periodic/reports. As another example, on stat1007 the directories under /srv/published/datasets are synced to https://analytics.wikimedia.org/published/datasets/
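The copy step described above can be sketched in Python. This is a minimal illustration, not an official tool: the helper names publish and public_url, the folder datasets/example-project, and the file example.tsv are hypothetical. The sketch also assumes that every path under /srv/published/ maps to the same relative path under https://analytics.wikimedia.org/published/, which this page only states for the datasets directory.

```python
import shutil
from pathlib import Path, PurePosixPath

# Paths taken from this page; the general path-to-URL mapping is an assumption.
PUBLISHED_ROOT = "/srv/published"
PUBLIC_BASE_URL = "https://analytics.wikimedia.org/published"


def _checked_relative(path: str) -> PurePosixPath:
    """Reject absolute paths and '..' so nothing lands outside the published root."""
    rel = PurePosixPath(path)
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"expected a path relative to the published root: {path}")
    return rel


def public_url(relative_path: str) -> str:
    """Return the URL where a published file is assumed to appear after the sync."""
    rel = _checked_relative(relative_path)
    return f"{PUBLIC_BASE_URL}/{rel.as_posix()}"


def publish(source: str, subdir: str, root: str = PUBLISHED_ROOT) -> str:
    """Copy one file into <root>/<subdir>/ and return its expected public URL.

    Only call this with safe, non-identifying data.
    """
    rel_dir = _checked_relative(subdir)
    target_dir = Path(root, *rel_dir.parts)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(source).name
    shutil.copy2(source, target)  # copies file contents and metadata
    return public_url((rel_dir / target.name).as_posix())


if __name__ == "__main__":
    # Hypothetical file and folder name; choose a folder name that is
    # friendly to strangers browsing the data.
    print(publish("example.tsv", "datasets/example-project"))
```

The root parameter exists only so the copy can be tried against a scratch directory before touching /srv/published/. The returned URL is where the file is expected to appear once the sync has run, not a confirmation that it is already live.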