
PAWS/PAWS examples and recipes: Difference between revisions

imported>SRodlund
m (SRodlund moved page User:SRodlund/PAWS (staging)/pagedesign/PAWS examples and recipes to PAWS/PAWS examples and recipes without leaving a redirect: Moving staged pages to Main)
 
imported>Slavina Stefanova
m (→‎Tutorials and How-tos: Edit tutorial link for the n-th time, losing hope)
 
(14 intermediate revisions by 8 users not shown)
{{Ptag|paws}}


This page is a work in progress and will be developed further.
== Overview ==
This is a list of existing PAWS notebooks created by users that can serve as examples for others. The list includes notebooks that employ database connections and API connections and are useful to individuals wishing to complete research and on-wiki tasks.  


=Overview=
If you want to download and re-use these examples, see the instructions on how to [[PAWS/Getting_started_with_PAWS#Fork|quickstart with PAWS]].
This page offers a growing number of recipes, how-tos, and example notebooks that you may find useful while learning and exploring PAWS. This page is not meant to be an exhaustive list. There are many examples of public notebooks available in many places. To see ''all'' of the notebooks currently hosted on PAWS, check out [https://public.paws.wmcloud.org/36582847/?C=M&O=D the public index].  


There is currently no way to search the public index for specific types of notebooks, but it can be useful to explore them to see what others have done with PAWS.
The notebooks are marked with topic tags (see key below) to help you understand what they cover and which tasks they are best suited for.


This page is a work in progress, and more will be added over time. If you find a public notebook of interest or a how-to or tutorial you would like to share, feel free to add it here.
=== Key ===
{{Colored box
|title = Color key
|content = A visual key to help keep track of what examples and tutorials are available
{{Topic|Example|#ec7063}} {{Topic|Tutorial or How-to|#aed6f1}} {{Topic|API|#00af89}} {{Topic|Wikireplicas|#fc3}} {{Topic|Datadumps|#eaf3ff}} {{Topic|Research & Analysis|#fee7e6}} {{Topic|On-Wiki tasks|#fef6e7}} {{Topic|Pywikibot|#d5fdf4}} {{Topic|Wikidata|#d0ece7}} {{Topic|SPARQL|#ebdef0}}
}}


== How-tos ==
== Wiki replicas and datasets ==


=== MediaWiki REST API ===
=== Connecting to Wiki replicas ===
* [https://public.paws.wmcloud.org/User:APaskulin_(WMF)/en-wikipedia-search.ipynb Search Wikipedia articles] - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API search endpoints to search for articles about the Solar System on English Wikipedia.
'''Note:''' As of April 2021, there is a new method for connecting to the Wiki replicas. Please see the following notebooks for examples of how to connect. If you plan to follow the other tutorials or how-tos, or to fork the examples below, make sure to use the '''most current''' method of connecting to the Wiki replicas.
* [https://public.paws.wmcloud.org/User:APaskulin_(WMF)/en-wikipedia-page-history.ipynb Exploring page history] - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API page history endpoints to explore the history of articles on English Wikipedia.


=== Wikidata ===
* [https://public.paws.wmcloud.org/User:SRodlund_(WMF)/%2A2021%20UPDATED%2A%20Replica%20Helper%20%26%20Database%20connections%20with%20PAWS.ipynb Working with Wiki replicas and datasets]
* [https://www.wikidata.org/wiki/Category:Pywikibot_tutorial Pywikibot Tutorials] - an index of Pywikibot tutorials related to Wikidata.
* [https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20Wikireplicas%20from%20PAWS.ipynb Using Wikireplicas with PAWS and Python]
* [https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20the%20new%20replicas,%20changes%20from%20the%20previous%20cluster.ipynb Accessing the new replicas & changes from the previous cluster]
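
As a rough illustration of the kind of connection these notebooks describe, the following Python sketch opens a connection to a single wiki's replica and reads a few page titles. It rests on several assumptions that are not stated on this page: the host naming pattern <code>&lt;wiki&gt;.analytics.db.svc.wikimedia.cloud</code>, the <code>_p</code> database suffix, a credentials file at <code>~/.my.cnf</code>, and the third-party <code>pymysql</code> package. Treat it as a starting point and check the linked notebooks for the current method.

```python
import os


def replica_host(wiki: str, cluster: str = "analytics") -> str:
    """Build the assumed per-wiki replica host name, e.g. for 'enwiki'."""
    return f"{wiki}.{cluster}.db.svc.wikimedia.cloud"


def replica_database(wiki: str) -> str:
    """Public replica databases are assumed to carry a '_p' suffix."""
    return f"{wiki}_p"


def connect(wiki: str):
    """Open a connection to one wiki's replica (expected to work only inside PAWS)."""
    import pymysql  # third-party; imported lazily so the helpers above work without it

    return pymysql.connect(
        host=replica_host(wiki),
        database=replica_database(wiki),
        # Assumed location of the replica credentials file.
        read_default_file=os.path.expanduser("~/.my.cnf"),
        charset="utf8mb4",
    )


def first_page_titles(wiki: str, limit: int = 5) -> list:
    """Return a few main-namespace page titles as a simple smoke test."""
    conn = connect(wiki)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT page_title FROM page WHERE page_namespace = 0 LIMIT %s",
                (limit,),
            )
            # Title columns may come back as bytes, so decode when needed.
            return [
                t.decode("utf-8") if isinstance(t, bytes) else t
                for (t,) in cur.fetchall()
            ]
    finally:
        conn.close()
```

Inside a PAWS notebook, <code>first_page_titles("enwiki")</code> would be the entry point. Because each connection targets one wiki's database, a query in this style stays within that wiki.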


=== Pywikibot ===
=== Tutorials and How-tos ===
* [https://public.paws.wmcloud.org/36582847/Using-Pywikibot-with-Paws.ipynb Using Pywikibot with PAWS] - A basic introduction to using Pywikibot with PAWS. This tutorial gives you the information you need to get started using a Python 3 notebook or the PAWS terminal.
{{Topic|Tutorial or How-to|#aed6f1}} {{Topic|Research & Analysis|#fee7e6}} {{Topic|Wikireplicas|#fc3}} {{Topic|Wikidata|#d0ece7}} {{Topic|SPARQL|#ebdef0}} {{Topic|Datadumps|#eaf3ff}}
* [https://public.paws.wmcloud.org/47948701/02%20-%20Intro%20to%20pywikibot.ipynb An intro to Pywikibot] - A notebook-based Pywikibot tutorial.
* [https://www.wikidata.org/wiki/Category:Pywikibot_tutorial Pywikibot Tutorials] - an index of Pywikibot tutorials related to Wikidata.


== Example notebooks ==
* [https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20Wikireplicas%20from%20PAWS.ipynb Using Wikireplicas from PAWS with Python] - A quick tutorial that explores how to connect to the Wikireplica databases and make a query.
* [https://public.paws.wmcloud.org/11901271/Lab%201%20-%20Revision%20Histories.ipynb Revision histories] - A lab that explores how to do some introductory data extraction and analysis from Wikipedia data
* [https://public.paws.wmcloud.org/15811/Lab%202%20-%20Hyperlink%20Networks.ipynb Hyperlink networks (APIs)]
* [https://public.paws.wmcloud.org/15811/Lab%203%20-%20Collaboration%20Networks.ipynb Collaboration networks] - A lab that explores how to analyze the structure of collaborations in Wikipedia data about users' revisions across multiple articles
* [https://public.paws.wmcloud.org/17431941/Lab%204%20-%20Pageviews.ipynb Pageviews] - A lab that explores how to analyze the structure of collaborations in Wikipedia data about users' revisions across multiple articles
* [[paws:User:Slst2020/How to explore Wikimedia data using Python – XML, SQL dumps and APIs .ipynb|Wikimedia public tools for researchers]] - In this notebook you will find a set of working examples that connect Wikimedia research resources to Jupyter notebooks and import the data into Pandas DataFrames.
* [[paws:User:Slst2020/mwsql-tutorial.ipynb|How to explore Wikimedia data using Python – XML, SQL dumps, and APIs]] - A Jupyter notebook tutorial that explores, analyzes, and visualizes data from the Simple English Wikipedia using Python, [[mw:Mediawiki-utilities|mediawiki-utilities]], and Pandas.
* [https://public.paws.wmcloud.org/43975115/geeq-sql_demo.ipynb SQL Demo and examples] - A variety of examples for working with SQL and PAWS
* [https://public.paws.wmcloud.org/49603154/Wikipedia%20Topics.ipynb How-to - Visualizing Wikipedia topics] - Connect to the database and use several Python libraries to create visualizations of data from Wikipedia
* [https://public.paws.wmcloud.org/47536603/THQ-archive-builder.ipynb How-to - Teahouse question archive builder] - This notebook will build a queryable data object out of a parsed thread dataset
* [https://public.paws.wmcloud.org/54798472/DataLoading.ipynb How-to - Event Stream, API, Database connections] - A variety of methods for accessing data about revisions
* [https://public.paws.wmcloud.org//58332189//Querying_Wikidata_with_SPARQL_sarasua.ipynb Querying Wikidata with SPARQL]


=== Notebooks that use datasets ===
==== Wikidata dumps tutorials ====
* [https://public.paws.wmcloud.org/56657513/Paper_covid.ipynb A notebook that explores Covid-19 data. Includes really interesting visualizations].
{{Topic|Tutorial or How-to|#aed6f1}} {{Topic|Research & Analysis|#fee7e6}} {{Topic|Wikidata|#d0ece7}} {{Topic|Datadumps|#eaf3ff}}


=== Notebooks that use replicas and databases ===
* [https://public.paws.wmcloud.org/User:Isaac_(WMF)/Outreachy%20Dec%202020/Wikidata_Data_Example.ipynb Wikidata JSON dump / API]
* [https://public.paws.wmcloud.org/59131898/USPlacenames.ipynb A notebook that uses SQL to compile a list of US place names]
* [https://public.paws.wmcloud.org/User:Isaac_(WMF)/Kiswahili-English%20Parallel%20Translation%20Data.ipynb Content Translation dumps / API]
* [https://public.paws.wmcloud.org/User:Isaac_(WMF)/isiZulu%20Topic%20Classification%20Data.ipynb Wikitext dumps]


=== Notebooks that use multiple Python libraries ===
=== Example notebooks ===
* [https://public.paws.wmcloud.org/54410566/machine%20learning/Untitled.ipynb A machine learning notebook with visualizations]. Multiple Python libraries are imported to create this notebook.
Note: This list only includes notebooks whose SQL queries do not use JOINs; queries with JOINs may not work correctly after planned changes to the Wiki replicas. For a list of notebooks that include JOINs by USER-ID, see this list: https://wikitech.wikimedia.org/wiki/User:SRodlund/PAWS_examples_lists/notebooks_with_joins
* [https://public.paws.wmcloud.org/54410566/Painter%20test.ipynb A notebook using Wikidata to list painters in multiple languages].


=== Notebooks that use Pywikibot ===
==== Wiki replicas ====
{{Topic|Wiki replicas|#fc3}} {{Topic|Research & Analysis|#fee7e6}} {{Topic|Wikidata|#d0ece7}} {{Topic|Example|#ec7063}}


'''Understand users and user behavior on a wiki'''
* [https://public.paws.wmcloud.org/27425175/Untitled.ipynb Find Wikidata Q ids for all pages in category]
* [https://public.paws.wmcloud.org/User:AntiCompositeBot/Old/Global_block_history.ipynb See the global block history for a user across wikis]
* [https://public.paws.wmcloud.org/27666631/Page%20curation%20backlog.ipynb Curation log]
* [https://public.paws.wmcloud.org/User:AntiCompositeBot/Old/Openbrack.ipynb Pages created from external links by non-autoconfirmed users]. This can be used to reduce spam on wikis.
* [https://public.paws.wmcloud.org/27666631/new%20pages.ipynb Get count of unreviewed pages per creation day, by autoconfirmed status]
* [https://public.paws.wmcloud.org/2888483/RC.ipynb Get the recent changes of the day]
* [https://public.paws.wmcloud.org/309423/Commons%20edits%20by%20WMF%20staff.ipynb Commons edits by WMF staff]
* [https://public.paws.wmcloud.org/309423/NPP_delete_vs_deleted.ipynb How many NPP pages marked for deletion are actually deleted?]
* [https://public.paws.wmcloud.org/309423/Renamed%20Teahouse%20answerers.ipynb Teahouse Answers]
* [https://public.paws.wmcloud.org/32615317/Language%20revision%20counts%20per%20day.ipynb Language revision counts per day]
* [https://public.paws.wmcloud.org/32772391/Test.ipynb SELECT page_title FROM page WHERE page_title LIKE '% %']
* [https://public.paws.wmcloud.org/3317/database03-Templatetiger.ipynb Wikidata database - Names similar to Karl]
* [https://public.paws.wmcloud.org/38131863/SQL.ipynb Number of pages with "Berlin" - Wikimedia DE]
* [https://public.paws.wmcloud.org/45853834/dbThing.ipynb Changes made to pages using PyMySQL and Pywikibot - HY Wikipedia]
* [https://public.paws.wmcloud.org/46197791/Grab%20Wiki%20edit%20counts.ipynb User Ids and their edit counts - Teahouse]
* [https://public.paws.wmcloud.org/51232205/GetCategoriesTopViewed.ipynb Get top viewed categories]
* [https://public.paws.wmcloud.org/54751007/tables_download.ipynb Tables Download]
* [https://public.paws.wmcloud.org/56371856/Querying%20Media%20Counts.ipynb Querying Media Counts - WikiLovesAfrica]
* [https://public.paws.wmcloud.org/56371856/Query%20WikiLovesAfrica%20Images%20and%20How%20Often%20They%20Were%20Viewed.ipynb Querying images and how often they were used - WikiLovesAfrica]
* [https://public.paws.wmcloud.org/57657198/section_alignment.ipynb This notebook contains functions for article comparison]
* [https://public.paws.wmcloud.org/59131898/editnotices.ipynb Edit notices - En Wikipedia]
* [https://public.paws.wmcloud.org/7702059/welcome_best_newcomer.py.ipynb A look at Barnstars]
* [https://public.paws.wmcloud.org/8135444/not%20mark%20fair%20use%20imagelinks.ipynb Images not marked for fair use]
* [https://public.paws.wmcloud.org/8135444/zhwiki%20abusefilter%20list.ipynb Wiki abuse filter list]
* [https://public.paws.wmcloud.org/9490343/Patrolling%20analysis.ipynb What is the annual volume of patrolling?]


'''Make it easier for editors to organize articles and information'''
==== Dumps ====
* [https://public.paws.wmcloud.org/User:AntiCompositeBot/Old/RussiaStubs.ipynb Extract information about stubs for editors who are considering merging them] - This example uses the category "Rural localities in Russia."
{{Topic|Research & Analysis|#fee7e6}} {{Topic|Datadumps|#eaf3ff}} {{Topic|Example|#ec7063}}
* [https://public.paws.wmcloud.org/User:AntiCompositeBot/Old/Nazarene.ipynb Search for pages with deprecated templates]


''' Contribute to Wikidata '''
* [https://public.paws.wmcloud.org//64172541//Binary%20Classification.ipynb Accessing page protections]
* [https://public.paws.wmcloud.org/63056561/missing_person_label.ipynb Add missing labels to Wikidata]
* [https://public.paws.wmcloud.org/54808064/wikimedia.ipynb Wikimedia - public dumps]
* [https://public.paws.wmcloud.org/64225498/Infering%20Countries%20from%20articles.ipynb Inferring countries from articles - public dumps]
* [https://public.paws.wmcloud.org/712/Pageviews.ipynb Pageviews - public dumps]
* [https://public.paws.wmcloud.org/54485784/Random%20Walk.ipynb A variety of tasks with dumps]
* [https://public.paws.wmcloud.org//64217908//DATA%20EXTRACTION.ipynb Public dumps]
* [https://public.paws.wmcloud.org//27666631//Generic%20notebook%20for%20dump%20processing.ipynb Generic notebook for dump processing]
* [https://public.paws.wmcloud.org//55703823//Simplified_Wikidata_Dumps.ipynb Simplified Wikidata dumps]
* [https://public.paws.wmcloud.org//55782206//English.ipynb Extract pages containing a keyword from a dump]


=== Notebook-based tutorials ===
==== SPARQL ====
* [https://public.paws.wmcloud.org/User:SRodlund_(WMF)/PAWS-Tutorial.ipynb Getting started with PAWS tutorial]
{{Topic|Research & Analysis|#fee7e6}} {{Topic|SPARQL|#ebdef0}} {{Topic|Wikidata|#d0ece7}} {{Topic|Example|#ec7063}}
* [https://public.paws.wmcloud.org/User:Jtmorgan/ds4ux/paws-cheatsheet.ipynb PAWS Cheatsheet] - This "cheatsheet" contains a number of useful tasks you can run right away.
 
* [https://public.paws.wmcloud.org//8952454//Wikidata%20SPARQL%20Query.ipynb Call SPARQL with Python]
* [https://public.paws.wmcloud.org//27843358//WikidataMapMakingWorkshop.ipynb Building layered maps using SPARQL]
* [https://public.paws.wmcloud.org//64056969//Add%20Reference.ipynb Add references to items already in Wikidata]
* [https://public.paws.wmcloud.org//12410844//Wikipedia_languages.ipynb Get Wikipedia languages SPARQL query]
* [https://public.paws.wmcloud.org/User:Fuzheado/smithsonian/SI%20SPARQL%20queries.ipynb Exploring Smithsonian content on Wikidata - queries and stats]
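
For orientation, calling SPARQL from Python usually comes down to sending the query text to a SPARQL endpoint and flattening the JSON result. The sketch below is not taken from any of the notebooks above. It assumes the public Wikidata Query Service endpoint at <code>https://query.wikidata.org/sparql</code>, uses only the Python standard library, and the User-Agent string and example query are placeholders chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the public Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: five items that are an instance of (P31) house cat (Q146).
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def build_request(query: str) -> urllib.request.Request:
    """Encode the query as URL parameters and ask for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        # Placeholder User-Agent; replace with one that identifies your notebook.
        headers={"User-Agent": "paws-sparql-example/0.1 (hypothetical)"},
    )


def bindings_to_rows(result: dict) -> list:
    """Flatten the SPARQL JSON result format into dicts of variable -> value."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def run_query(query: str) -> list:
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(query), timeout=30) as resp:
        return bindings_to_rows(json.load(resp))
```

Calling <code>run_query(EXAMPLE_QUERY)</code> in a notebook returns a list of dictionaries, which can be passed directly to <code>pandas.DataFrame</code> if Pandas is available.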
 
==== Wikidata Query ====
{{Topic|Research & Analysis|#fee7e6}} {{Topic|Wikidata|#d0ece7}} {{Topic|Example|#ec7063}}
 
* [https://public.paws.wmcloud.org//33144179//Uffizi%20Museum%20Collections.ipynb Runs Wikidata query in iframe and displays results]
* [https://public.paws.wmcloud.org//4348720//bookmarklet.ipynb Get Wikidata info from an arbitrary URL]
* [https://public.paws.wmcloud.org//37436385//Species.ipynb Species without English descriptions - Wikidata]
 
== APIs ==
 
PAWS notebook: [https://public.paws.wmcloud.org/User:SRodlund_(WMF)/*2021%20UPDATED*%20API%20Connections%20With%20PAWS.ipynb API Connections]
 
=== Tutorials and How-tos ===
{{Topic|API|#00af89}} {{Topic|Tutorial or How-to|#aed6f1}}
 
* [https://public.paws.wmcloud.org/User:SRodlund_(WMF)/*2021%20UPDATED*%20API%20Connections%20With%20PAWS.ipynb API Connections With PAWS] - An overview of how to use PAWS with APIs. '''Updated 2021'''
* [https://public.paws.wmcloud.org/User:APaskulin_(WMF)/en-wikipedia-page-history.ipynb MediaWiki page history] - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API page history endpoints to explore the history of articles on English Wikipedia.
* [https://public.paws.wmcloud.org//59320301/MediaWiki-REST-API-examples.ipynb MediaWiki REST API examples] - This notebook contains a variety of MediaWiki REST API examples: search pages, autocomplete page title, get page history, get page history counts, get revision, compare revision, get page, get page offline, get page source, get languages, get files, get files on a page, create page, update page.
* [https://public.paws.wmcloud.org/User:APaskulin_(WMF)/Wikimedia-Feeds-API-Intro.ipynb Wikimedia Feeds intro] - Many Wikipedias include daily featured articles and other curated content on their homepages. You can see an example of this content on the main page of English, German, and French Wikipedias. The Wikifeeds API lets you access this content programmatically and add high-quality, multilingual content to your apps.
* [https://public.paws.wmcloud.org//59320301/en-wikipedia-images-tile-app.ipynb Create an image grid using free images from Wikimedia Commons] - This guide uses the MediaWiki REST API to explore media files on Wikimedia Commons. Wikimedia Commons is a collection of over 60,000,000 freely usable media files, many of which are used in Wikipedia articles.
* [https://public.paws.wmcloud.org//59320301/en-wikipedia-images.ipynb Reuse free images from Wikimedia Commons] - This guide uses the MediaWiki REST API to explore media files on Wikimedia Commons. Wikimedia Commons is a collection of over 60,000,000 freely usable media files, many of which are used in Wikipedia articles.
* [https://public.paws.wmcloud.org//59320301/en-wikipedia-page-history.ipynb Exploring page history] - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API page history endpoints to explore the history of articles on English Wikipedia.
* [https://public.paws.wmcloud.org/User:APaskulin_(WMF)/en-wikipedia-search.ipynb Search Wikipedia articles] - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API search endpoints to search for articles about the Solar System on English Wikipedia.
* [https://public.paws.wmcloud.org//59320301/en-wikipedia-search-content.ipynb Retrieving free knowledge] - This guide uses the MediaWiki REST API to explore articles on English Wikipedia.
* [https://public.paws.wmcloud.org//59320301/en.wikipedia-page-counts.ipynb Wikipedia page stats comparison] - This guide uses the MediaWiki REST API to explore articles on English Wikipedia.
* [https://public.paws.wmcloud.org//59320301/restbase-wikifeeds.ipynb Get featured content from English Wikipedia] - The Wikifeeds API provides convenient access to content featured on the Main Page of English Wikipedia.
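
To give a feel for the search and page history endpoints these tutorials cover, here is a minimal Python sketch. It is not copied from the notebooks. It assumes the English Wikipedia REST base URL <code>https://en.wikipedia.org/w/rest.php/v1</code>, response keys named <code>pages</code> and <code>revisions</code>, and a placeholder User-Agent string. Only the standard library is used.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the MediaWiki REST API on English Wikipedia.
REST_BASE = "https://en.wikipedia.org/w/rest.php/v1"
# Placeholder User-Agent; replace with one that identifies your notebook.
HEADERS = {"User-Agent": "paws-rest-example/0.1 (hypothetical)"}


def search_url(query: str, limit: int = 5) -> str:
    """URL for the page search endpoint."""
    return f"{REST_BASE}/search/page?" + urllib.parse.urlencode(
        {"q": query, "limit": limit}
    )


def history_url(title: str) -> str:
    """URL for a page's history; the title sits in the path, so encode it fully."""
    return f"{REST_BASE}/page/{urllib.parse.quote(title, safe='')}/history"


def get_json(url: str) -> dict:
    """Fetch a URL over the network and decode the JSON body."""
    req = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def search_titles(query: str, limit: int = 5) -> list:
    """Titles of the pages matching a search query."""
    return [page["title"] for page in get_json(search_url(query, limit))["pages"]]


def recent_revision_ids(title: str) -> list:
    """Revision IDs from the first batch of a page's history."""
    return [rev["id"] for rev in get_json(history_url(title))["revisions"]]
```

In a notebook, <code>search_titles("solar system")</code> followed by <code>recent_revision_ids(...)</code> on one of the returned titles chains the two endpoints together.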
 
=== Example notebooks ===
 
==== Various notebooks using APIs ====
{{Topic|API|#00af89}} {{Topic|Example|#ec7063}} {{Topic|On-Wiki tasks|#fef6e7}}
 
* [https://public.paws.wmcloud.org//59320301//Action-API-tests.ipynb Action API tests]
* [https://public.paws.wmcloud.org//51232205//Article%20quality%20demo%20temp.ipynb Article quality demo]
* [https://public.paws.wmcloud.org//47694471//Blocked%20users.ipynb Blocked users Wikipedia DE]
* [https://public.paws.wmcloud.org//61717526//namespace_names.py.ipynb Get namespace names - MediaWiki API]
* [https://public.paws.wmcloud.org//51405838//RESPECT%202019.ipynb Find vandalism on a given set of pages]
* [https://public.paws.wmcloud.org//57510755//eng2hindi.ipynb Find pages translated from English to Hindi]
* [https://public.paws.wmcloud.org//57724042//English2Hindi_Translation_Analysis.ipynb Understand impact of the content translation tool]
* [https://public.paws.wmcloud.org//57724042//Important.ipynb Content translation exploration] - A complex notebook that explores content translation; it is interesting, though it may not be entirely useful as an example.
* [https://public.paws.wmcloud.org//58332189//Wikidata%20API%20Example.ipynb Wikidata API example - update descriptions]
* [https://public.paws.wmcloud.org//59320301/en-wikipedia-covid-data.ipynb Extracting Covid-19 data from English Wikipedia]
 
==== Pywikibot (Uses MediaWiki API) ====
{{Topic|Pywikibot|#d5fdf4}} {{Topic|API|#00af89}} {{Topic|Example|#ec7063}} {{Topic|On-Wiki tasks|#fef6e7}}
 
* [https://public.paws.wmcloud.org//64056969//Add%20Copyright%2C%20Inventory.ipynb Add copyright to items in Wikidata]
* [https://public.paws.wmcloud.org//64056969//Add%20Inv%2C%20copyright%2C%20creator.ipynb Add copyright, creator to items in Wikidata]
* [https://public.paws.wmcloud.org//905//Add%20prize%20to%20category.ipynb Add awards to Wikidata category Sports Hall of Fame]
* [https://public.paws.wmcloud.org//64056969//Add%20Reference.ipynb Add references to items already in Wikidata]
* [https://public.paws.wmcloud.org//54830972//auto%20wikiproject.ipynb Auto Wikiproject]
* [https://public.paws.wmcloud.org//60864856//Biography%20short%20descriptions.ipynb Add short descriptions to biographies on Wikipedia EN]
* [https://public.paws.wmcloud.org//64056969//Bot_Upload.ipynb Add items to Wikidata]
* [https://public.paws.wmcloud.org//19174//Bulk%20change%20qualifiers%20in%20a%20P39%20statement.ipynb Change qualifier in P39 statements - Wikidata]
* [https://public.paws.wmcloud.org/45853834/dbThing.ipynb Make changes to pages using PyMySQL and Pywikibot - HY Wikipedia] - On-wiki task using replicas and API
* [https://public.paws.wmcloud.org/8135444/remove-broken-files.ipynb Remove broken files]
* [https://public.paws.wmcloud.org/64308420/InvestigateBotIssues.ipynb Investigate bot issues]
* [https://public.paws.wmcloud.org/8135444/policy%20changes.ipynb Policy changes - ZH Wikipedia] - Uses databases, pywikibot, JSON files, etc
* [https://public.paws.wmcloud.org//17347449//TeaHouse.ipynb Teahouse archives answers] - Uses databases, pywikibot, JSON files, etc
* [https://public.paws.wmcloud.org//53919709//Counting%20new%20editors.ipynb Analyze number of new editors per month]
* [https://public.paws.wmcloud.org//12256150//Categorize%20images%20sent%20after%20end%20of%20WLL.ipynb Categorize images after the end of Wiki Loves Love]
* [https://public.paws.wmcloud.org//1854//Clean%20history%20merge%20list.ipynb Clean history merge list - WikiProject history]
* [https://public.paws.wmcloud.org//30925804//Commons.ipynb Categorize images from Wiki Loves Earth]
* [https://public.paws.wmcloud.org//30795180//commons%20patronymic%20category%20fix.ipynb Move and recategorize patronymic names on Commons]
* [https://public.paws.wmcloud.org//53355560//DeadInterlanguageInTemplates.ipynb Dead interlanguage links]
* [https://public.paws.wmcloud.org//484044//Fix%20BDA%20ids%20on%20wikidata.ipynb Fix BDA Ids on Wikidata]
* [https://public.paws.wmcloud.org//64056969//Fix%20Titles.ipynb Fix titles on Wikidata]
* [https://public.paws.wmcloud.org//38100986//get_articles_wthout_images.ipynb Get articles without images]
* [https://public.paws.wmcloud.org//484044//global%20replace%20in%20WP-de.ipynb Global replace in Wikipedia DE]
* [https://public.paws.wmcloud.org//484044//global%20replace%20in%20WP-de.ipynb Categorize graves in cemeteries - commons]
* [https://public.paws.wmcloud.org//12300809//Mass-Remove.ipynb Mass remove claims - Wikidata]
* [https://public.paws.wmcloud.org//47340797//movepages.ipynb A script to move pages]
* [https://public.paws.wmcloud.org//59131898//NASAImageLib.ipynb Get files with NASA image template - Commons]
* [https://public.paws.wmcloud.org//64308420//RemoveRedirectClass.ipynb Remove redirect class]
* [https://public.paws.wmcloud.org//30795180//ruwikimedia%20check%20userpage%20authorship.ipynb Check userpage authorship - RU Wikipedia]
* [https://public.paws.wmcloud.org//30795180//testwiki%20bad%20interwiki%20fix.ipynb Fix bad interwiki links]
* [https://public.paws.wmcloud.org//30795180//testwiki%20page%20move%20test.ipynb Recategorize and move pages]
* [https://public.paws.wmcloud.org//769030//upload%20text.ipynb Upload text]
* [https://public.paws.wmcloud.org//55703823//Vital_Articles_Examples.ipynb Parse data from talk pages]
* [https://public.paws.wmcloud.org//450979//Add%20a%20property%20to%20a%20category.ipynb Add a property to a category - Wikidata]
* [https://public.paws.wmcloud.org//56125718//Auto%20status%20update%20for%20wikiproject.ipynb Autostatus update for WikiProject]
* [https://public.paws.wmcloud.org//2888483//Batch%20delete%20and%20unlink%20image.ipynb Batch delete and unlink images]
* [https://public.paws.wmcloud.org//47389247//commons_file_names.ipynb Identify unhelpful file names on Commons]
* [https://public.paws.wmcloud.org//6971035//deprecate%20edition.ipynb Bulk deprecate a template]
* [https://public.paws.wmcloud.org//6971035//deprecate%20index%20parameter.ipynb Bulk deprecate an index parameter]
* [https://public.paws.wmcloud.org//5730869//Elections%20in%20Canada.ipynb Add statements to candidates in Canada elections - Wikidata]
* [https://public.paws.wmcloud.org//6971035//move%20all%20pages%20from%20a%20subcat%20to%20another.ipynb Move all pages from one subcategory to another]
* [https://public.paws.wmcloud.org//57094312//New%20user%20page.ipynb Create new user pages]
* [https://public.paws.wmcloud.org//57094312//Redirect%20talk%20page.ipynb Redirect a talk page]
* [https://public.paws.wmcloud.org//46054761//relicense.ipynb Relicense uploads to Wikimedia Commons]
* [https://public.paws.wmcloud.org//6971035//replace%20en%20with%20en-icon.ipynb Replace page text]
* [https://public.paws.wmcloud.org//6971035//update%20redirect.ipynb Update a redirect]


== Further resources ==
* PAWS is a Jupyter notebooks installation hosted by Wikimedia Cloud Services. The existing [https://jupyter-notebook.readthedocs.io/en/stable/ Jupyter Notebooks documentation] is an excellent resource for PAWS users.
* Check out the [https://github.com/toolforge/paws/blob/master/README.md PAWS Readme on GitHub] for information on useful libraries and storage space.

Latest revision as of 08:24, 18 March 2022



Overview

This is a list of existing PAWS notebooks created by users that can serve as examples for others. The list includes notebooks that employ database connections and API connections and are useful to individuals wishing to complete research and on-wiki tasks.

If you want to download and re-use these examples, see the instructions on how to quickstart with PAWS.

The notebooks are marked with topic tags (see key below) to help you understand what they cover and which tasks they are best suited for.

Key

Color key

A visual key to help keep track of what examples and tutorials are available

Example Tutorial or How-to API Wikireplicas Datadumps Research & Analysis On-Wiki tasks Pywikibot Wikidata SPARQL

Wiki replicas and datasets

Connecting to Wiki replicas

Note: As of April 2021, there is a new method for connecting to the Wiki replicas. Please see the following notebooks for examples of how to connect. If you plan to follow the other tutorials or how-tos, or to fork the examples below, make sure to use the most current method of connecting to the Wiki replicas.

Tutorials and How-tos

Tutorial or How-to Research & Analysis Wikireplicas Wikidata SPARQL Datadumps

Wikidata dumps tutorials

Tutorial or How-to Research & Analysis Wikidata Datadumps

Example notebooks

Note: This list only includes notebooks whose SQL queries do not use JOINs; queries with JOINs may not work correctly after planned changes to the Wiki replicas. For a list of notebooks that include JOINs by USER-ID, see this list: https://wikitech.wikimedia.org/wiki/User:SRodlund/PAWS_examples_lists/notebooks_with_joins

Wiki replicas

Wiki replicas Research & Analysis Wikidata Example

Dumps

Research & Analysis Datadumps Example

SPARQL

Research & Analysis SPARQL Wikidata Example

Wikidata Query

Research & Analysis Wikidata Example

APIs

PAWS notebook: API Connections

Tutorials and How-tos

API Tutorial or How-to

  • API Connections With PAWS - An overview of how to use PAWS with APIs. Updated 2021
  • MediaWiki page history - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API page history endpoints to explore the history of articles on English Wikipedia.
  • MediaWiki REST API examples - This notebook contains a variety of MediaWiki REST API examples: search pages, autocomplete page title, get page history, get page history counts, get revision, compare revision, get page, get page offline, get page source, get languages, get files, get files on a page, create page, update page.
  • Wikimedia Feeds intro - Many Wikipedias include daily featured articles and other curated content on their homepages. You can see an example of this content on the main page of English, German, and French Wikipedias. The Wikifeeds API lets you access this content programmatically and add high-quality, multilingual content to your apps.
  • Create an image grid using free images from Wikimedia Commons - This guide uses the MediaWiki REST API to explore media files on Wikimedia Commons. Wikimedia Commons is a collection of over 60,000,000 freely usable media files, many of which are used in Wikipedia articles.
  • Reuse free images from Wikimedia Commons - This guide uses the MediaWiki REST API to explore media files on Wikimedia Commons. Wikimedia Commons is a collection of over 60,000,000 freely usable media files, many of which are used in Wikipedia articles.
  • Exploring page history - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API page history endpoints to explore the history of articles on English Wikipedia.
  • Search Wikipedia articles - The MediaWiki REST API lets you build apps and scripts that interact with any MediaWiki-based wiki. In this tutorial, we'll use the REST API search endpoints to search for articles about the Solar System on English Wikipedia.
  • Retrieving free knowledge - This guide uses the MediaWiki REST API to explore articles on English Wikipedia.
  • Wikipedia page stats comparison - This guide uses the MediaWiki REST API to explore articles on English Wikipedia.
  • Get featured content from English Wikipedia - The Wikifeeds API provides convenient access to content featured on the Main Page of English Wikipedia.
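
As a small companion to the Wikifeeds entries above, the following Python sketch builds the URL for one day's featured content and pulls the featured article title out of a response. The endpoint pattern <code>https://api.wikimedia.org/feed/v1/wikipedia/{lang}/featured/{yyyy}/{mm}/{dd}</code> and the <code>tfa</code> / <code>titles</code> / <code>normalized</code> response keys are assumptions made for illustration and should be checked against the linked notebooks before relying on them.

```python
import datetime
import json
import urllib.request

# Assumed base URL of the featured-content feed.
FEED_BASE = "https://api.wikimedia.org/feed/v1/wikipedia"


def featured_url(day: datetime.date, lang: str = "en") -> str:
    """URL of the featured-content feed for one language and one day."""
    return f"{FEED_BASE}/{lang}/featured/{day:%Y/%m/%d}"


def featured_article_title(feed: dict):
    """Title of the day's featured article, or None if the feed has none."""
    tfa = feed.get("tfa") or {}
    return (tfa.get("titles") or {}).get("normalized")


def fetch_feed(day: datetime.date, lang: str = "en") -> dict:
    """Download and decode the feed (requires network access)."""
    req = urllib.request.Request(
        featured_url(day, lang),
        # Placeholder User-Agent; replace with one that identifies your notebook.
        headers={"User-Agent": "paws-feed-example/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

A typical call would be <code>featured_article_title(fetch_feed(datetime.date.today()))</code>. Returning <code>None</code> when a key is missing keeps the helper usable for languages or days without a featured article.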

Example notebooks

Various notebooks using APIs

API Example On-Wiki tasks

Pywikibot (Uses MediaWiki API)

Pywikibot API Example On-Wiki tasks

Further resources