News/Wikireplicas 2020 Redesign/Wiki Replicas Cross-DB Query Data

From Wikitech-static
Revision as of 17:07, 16 April 2021 by imported>Jhernandez (Mark as draft for now)

User input analysis

This section tracks the user input received via emails to the various cloud mailing lists, chat venues such as #wikimedia-cloud, and comments on Phabricator tasks:

Origin | User | Queries | Tool name | Tool link | Toolforge tool | Wiki | DBs joined | Phabricator
email | Huji | [1],[2],[3] | HujiBot | [1] | | fa | fawiki, commonswiki | phab:T268244
phabricator | Green_Cardamom | [1] | GreenC bot | [1] | botwikiawk | en | enwiki, commonswiki | phab:T267992
email | Fastily | [1][2][3][4] | FastilyBot | en:User:FastilyBot | fastilybot-reports | en | enwiki, commonswiki |
phabricator | Zache | [1],[2],[3] | | | | | fiwiki, wikidatawiki; fiwiki, commonswiki |
phabricator | Superyetkin | [1] | | | | tr | enwiki, wikidatawiki | phab:T268498
phabricator | ShakespeareFan00 | [1],[2] | | | | | enwiki, commonswiki |

Random query logger sample analysis

To better understand which queries may break under the new replicas architecture, a query logger was run for short periods at random intervals, saving queries for later analysis. Because of the size of the data and physical constraints, the logger could not run continuously for a long period, so random sampling was used instead.


Metric | Count | Percentage
Total queries | 60008 | 100%
Cross-DB queries | 1659 | 2.76%
Distinct queries | |
Total distinct queries | 18709 | 100%
Distinct cross-DB queries | 339 | 1.81%
Distinct normalized queries | |
Total distinct normalized queries | 3858 | 100%
Distinct normalized cross-DB queries | 107 | 2.77%
Users | |
Total users logged | 216 | 100%
Users with distinct cross-DB queries | 22 | ~10%

Pending DB analysis in cross-db queries

Pending join column analysis in cross-db queries

Raw data

Pending upload