You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Portal:Data Services: Difference between revisions

From Wikitech-static
imported>BryanDavis
(Use mbox to highlight intro materials)
imported>Quiddity
(add direct quarry link)
Line 1: Line 1:
{{Mbox|image=[[File:Ambox PR.svg|40x40px|link=|alt=]]|text=Please read the [[Help:Cloud_Services_Introduction|Wikimedia Cloud Services Introduction]] and the [[Help:Getting_Started|Getting Started guide]].}}
{{Mbox|image=[[File:Ambox PR.svg|40x40px|link=|alt=]]|text=Please read the [[Help:Cloud_Services_Introduction|Wikimedia Cloud Services Introduction]] and the [[Help:Getting_Started|Getting Started guide]].}}
[[File:WMCS data services.svg|right|120px|alt=WMCS data services|link=]]
[[File:WMCS data services.svg|right|120px|alt=WMCS data services]]


'''Data Services''' include services that allow for direct access to databases and dumps, and web interfaces for querying and programmatic access to data stores. The Data Services currently offered are Wiki Replicas (naming [[User talk:BryanDavis/Rebranding Cloud Services products#Labsdb needs a new name|under discussion]]), ToolsDB, Wikimedia Dumps, Shared Storage, Quarry and PAWS.
'''Data Services''' include services that allow for direct access to databases and dumps, as well as web interfaces for querying and programmatic access to data stores. Services currently offered are: Wiki Replicas, ToolsDB, Wikimedia Dumps, Shared Storage, Quarry and PAWS.


== Wiki Replicas ==
== Wiki Replicas ==
Wiki Replicas are the sanitized public replicas of production MySQL Mediawiki databases. Access to the Wiki Replicas is granted for users with a ToolForge account automatically. See [[Help:Tool Labs/Database]] for how to access the Wiki Replicas.
Wiki Replicas are the sanitized public replicas of the production Wikimedia MediaWiki wiki databases. Access to the Wiki Replicas is granted for users with a Toolforge account automatically. See [[Help:Toolforge/Database]] for how to access the Wiki Replicas.


== ToolsDB ==
== ToolsDB ==
ToolsDB is a service that allows a Tool shared user to create and maintain a Tool specific database. See [[Help:Tool Labs/Database#User databases]] for help on ToolsDB.
ToolsDB is a service that allows a Tool shared user to create and maintain a Tool specific database. See [[Help:Toolforge/Database#User databases]] for help on ToolsDB.


== Wikimedia Dumps ==
== Wikimedia Dumps ==
[https://dumps.wikimedia.org/ Wikimedia Dumps] offers a range of data downloads including full text dumps, and other datasets. ToolForge users can directly access dumps data through their Tool account, see [[Help:Tool Labs#Dumps]]. VPS users can request to have the share available, see [[Help:Shared storage#.2Fpublic.2Fdumps]]
[https://dumps.wikimedia.org/ Wikimedia Dumps] offers a range of data downloads including full text dumps, and other datasets. Toolforge users can directly access dumps data through their Tool account, see [[Help:Toolforge#Dumps]]. Cloud VPS users can request to have the share available, see [[Help:Shared storage#.2Fpublic.2Fdumps]]


== Shared Storage ==
== Shared Storage ==
Shared Storage is offered via [[wikipedia:Network_File_System|NFS]] for ToolForge and VPS users. Shares currently offered are described at [[Help:Shared storage]]. The ToolForge environment is setup for access by default, and VPS Projects can access some resources on special request.  
Shared Storage is offered via [[w:Network_File_System|NFS]] for Toolforge and Cloud VPS users. Shares currently offered are described at [[Help:Shared storage]]. The Toolforge environment is setup for access by default, and other Cloud VPS projects can access some resources on special request.  


[https://dumps.wikimedia.org/ Wikimedia Dumps] are also offered via the Shared Storage services, but called out separately as a Data Service because of their wide use.
[https://dumps.wikimedia.org/ Wikimedia Dumps] are also offered via the Shared Storage services, but treated as a Data Service because of their wide use.


== Quarry ==
== Quarry ==
Quarry is a graphical web interface that allows users to write SQL to query the Wiki Replicas. It only needs a Wikimedia (Meta) account to login, and is extensively used by analysts, researchers, and people of all experience levels to easily access the databases. See [[m:Research:Quarry]] for help.
[https://quarry.wmflabs.org/ Quarry] is a graphical web interface that allows users to write SQL to query the Wiki Replicas. It only needs a Wikimedia (Meta) account to login, and is extensively used by analysts, researchers, and people of all experience levels to easily access the databases. See [[m:Research:Quarry]] for help.


== PAWS ==
== PAWS ==
[https://paws.wmflabs.org PAWS] is a [https://jupyter.org Juypter] notebooks on the cloud service that hosts python notebooks and a terminal accessible through a web browser. It also only needs Meta account to login, and allows for access to the Wiki Replicas, ToolsDB and Dumps. See [[PAWS]] for help.
[https://paws.wmflabs.org PAWS] is a [https://jupyter.org Juypter] notebooks on the cloud service that hosts python notebooks and a terminal accessible through a web browser. It also only requires a Wikimedia (Meta) account to login, and allows for access to the Wiki Replicas, ToolsDB and Dumps. See [[PAWS]] for help.




[[Portal:Data_Services/Admin|Administration]]
== See also ==
* [[Portal:Data_Services/Admin|Data Services administrative documentation]]


[[Category:Portals|DaaS]]
[[Category:Portals|DaaS]]

Revision as of 18:10, 7 September 2017


Data Services include services that allow for direct access to databases and dumps, as well as web interfaces for querying and programmatic access to data stores. Services currently offered are: Wiki Replicas, ToolsDB, Wikimedia Dumps, Shared Storage, Quarry and PAWS.

Wiki Replicas

Wiki Replicas are sanitized public replicas of the production Wikimedia MediaWiki wiki databases. Users with a Toolforge account are automatically granted access to the Wiki Replicas. See Help:Toolforge/Database for how to access the Wiki Replicas.

ToolsDB

ToolsDB is a service that allows a Tool shared user to create and maintain a Tool-specific database. See Help:Toolforge/Database#User databases for help on ToolsDB.

Wikimedia Dumps

Wikimedia Dumps offers a range of data downloads, including full-text dumps and other datasets. Toolforge users can directly access dumps data through their Tool account; see Help:Toolforge#Dumps. Cloud VPS users can request to have the share made available; see Help:Shared storage#.2Fpublic.2Fdumps.
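As a rough sketch of working with a full-text dump once it is reachable on disk, the Python below streams a bz2-compressed XML dump and counts its page elements without loading the whole file into memory. The function name count_pages and the dump_path parameter are hypothetical names chosen for this example; the actual file locations are documented at Help:Shared storage.

```python
import bz2


def count_pages(dump_path):
    """Count <page> elements in a bz2-compressed XML dump, line by line."""
    count = 0
    # Text mode decompresses the file as a stream, so even a very large
    # dump is never held in memory all at once.
    with bz2.open(dump_path, mode="rt", encoding="utf-8") as dump:
        for line in dump:
            # Literal "<page>" inside article text is XML-escaped, so a
            # plain substring count only matches real page elements.
            count += line.count("<page>")
    return count
```

For anything beyond counting, a streaming XML parser would be a more robust choice than line matching.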

Shared Storage

Shared Storage is offered via NFS for Toolforge and Cloud VPS users. Shares currently offered are described at Help:Shared storage. The Toolforge environment is set up for access by default, and other Cloud VPS projects can access some resources on special request.

Wikimedia Dumps are also offered via the Shared Storage services, but are treated as a separate Data Service because of their wide use.

Quarry

Quarry is a graphical web interface that allows users to write SQL to query the Wiki Replicas. It requires only a Wikimedia (Meta) account to log in, and is used extensively by analysts, researchers, and people of all experience levels to access the databases easily. See m:Research:Quarry for help.
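To give a feel for the kind of SQL one might write in Quarry, the sketch below runs a small query against an in-memory SQLite stand-in for MediaWiki's page table. This is only an illustration: the real Wiki Replicas are not SQLite and hold far more columns and rows, and the sample rows and the helper names make_sample_db and count_articles are invented for this example.

```python
import sqlite3

# A query of the kind one might paste into Quarry: count non-redirect
# pages in the main (article) namespace.
ARTICLE_COUNT_SQL = """
SELECT COUNT(*)
FROM page
WHERE page_namespace = 0
  AND page_is_redirect = 0
"""


def make_sample_db():
    """Build an in-memory stand-in holding a few columns of the page table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        "page_id INTEGER PRIMARY KEY, "
        "page_namespace INTEGER, "
        "page_title TEXT, "
        "page_is_redirect INTEGER)"
    )
    rows = [
        (1, 0, "Main_Page", 0),
        (2, 0, "Example", 0),
        (3, 0, "Old_name", 1),  # a redirect, excluded by the query
        (4, 1, "Example", 0),   # a talk page, excluded by the query
    ]
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", rows)
    return conn


def count_articles(conn):
    """Run the sample query and return the single count it produces."""
    return conn.execute(ARTICLE_COUNT_SQL).fetchone()[0]
```

In Quarry itself only the SQL text would be entered; the Python wrapper exists here so the query can be tried locally against made-up data.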

PAWS

PAWS is a Jupyter notebooks-in-the-cloud service that hosts Python notebooks and a terminal accessible through a web browser. It also requires only a Wikimedia (Meta) account to log in, and allows access to the Wiki Replicas, ToolsDB and Dumps. See PAWS for help.


See also