User:Joal/WDQS Traffic Analysis
Analysis of WDQS traffic on the public and internal clusters. The charts and data were computed in Jupyter notebooks running Spark on the Analytics Hadoop cluster. The processed data consists of events sourced through the Modern Event Platform. Originally written in March 2020; rerun with June 2020 data for the current version of the charts.
Global traffic information
HTTP response codes
Most requests, for both internal and external traffic, result in HTTP response code 200 (success). The rest of this analysis considers only requests that ended with a 200 response code, unless explicitly stated otherwise.
Note: The scales of the two charts are different. See the next section for a comparison of the number of requests between clusters.
Public vs Internal
The number of 200 requests to the public cluster is about half the number of requests to the internal one.
Distinct queries
For June 2020, the number of daily distinct queries is about 61% of the total number of queries for the public cluster, and about 27% for the internal cluster. This means that each distinct query is repeated on average about 0.65 times per day on the public cluster (that is, issued about 1.65 times in total), and about 2.75 times per day on the internal one.
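The relation between the share of distinct queries and the average number of repeats can be sketched as follows. This is a minimal illustration, not the notebook code used for the analysis; the function name `average_repeats` is hypothetical, and the inputs are the rounded percentages quoted above.

```python
def average_repeats(distinct_ratio: float) -> float:
    """Average number of times a distinct query is re-issued after its first occurrence.

    distinct_ratio is (distinct queries / total queries) for one day, in (0, 1].
    Each distinct query is issued 1 / distinct_ratio times on average,
    so it is repeated (1 / distinct_ratio - 1) times.
    """
    if not 0 < distinct_ratio <= 1:
        raise ValueError("distinct_ratio must be in (0, 1]")
    return 1 / distinct_ratio - 1


# Rounded percentages quoted in the text (assumed inputs, not the raw counts).
public_repeats = average_repeats(0.61)
internal_repeats = average_repeats(0.27)
print(f"public: {public_repeats:.2f}, internal: {internal_repeats:.2f}")
# prints "public: 0.64, internal: 2.70"
```

With the rounded inputs this gives about 0.64 and 2.70, close to the 0.65 and 2.75 stated above; the small differences are presumably due to the percentages being rounded.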
Query-time
Query-time is interesting to analyze because it serves as a proxy for resource usage in the backend system: a long query presumably uses more computation resources than a fast one.
Public vs Internal
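Despite serving about twice as many requests as the public cluster, the internal cluster has a daily sum of query-time about 10 times smaller than the public cluster's. Taken together, these two figures imply that the average query-time per request is roughly 20 times lower on the internal cluster.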
Query-time classes
For the public cluster, requests taking more than 10s represent a very small share of the requests, yet they account for most of the processing time.
Note: These charts were generated using Google Docs, as the chart system used in the notebooks doesn't support dual-axis charts.
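The kind of breakdown behind the query-time classes can be sketched as below: each request is assigned to a class, and the share of requests and the share of total query-time are computed per class. This is an assumption-laden illustration, not the original notebook code: only the 10s threshold comes from the text, while the other class boundaries, the function names, and the sample values are made up.

```python
from collections import defaultdict

# Hypothetical class boundaries in seconds (upper bound, label).
# Only the 10s threshold is taken from the analysis; the rest are assumptions.
CLASSES = [(0.01, "<10ms"), (0.1, "10ms-100ms"), (1.0, "100ms-1s"), (10.0, "1s-10s")]
SLOWEST = ">=10s"


def query_time_class(seconds):
    """Return the label of the first class whose upper bound exceeds the query-time."""
    for upper, label in CLASSES:
        if seconds < upper:
            return label
    return SLOWEST


def class_shares(query_times):
    """Map each class label to (share of requests, share of total query-time)."""
    total_time = sum(query_times)
    if not query_times or total_time <= 0:
        raise ValueError("need at least one request with positive query-time")
    counts = defaultdict(int)
    times = defaultdict(float)
    for t in query_times:
        label = query_time_class(t)
        counts[label] += 1
        times[label] += t
    n = len(query_times)
    return {label: (counts[label] / n, times[label] / total_time) for label in counts}


# Made-up sample: 100 requests, only one of which takes more than 10s.
sample = [0.005] * 50 + [0.05] * 30 + [0.5] * 15 + [5.0] * 4 + [40.0]
shares = class_shares(sample)
for label, (request_share, time_share) in shares.items():
    print(f"{label}: {request_share:.0%} of requests, {time_share:.0%} of query-time")
```

In this made-up sample the single request above 10s is 1% of the requests but more than half of the summed query-time, which is the shape of distribution described for the public cluster.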
Correlations (or not)
For the internal cluster, the sum of query-time is visually strongly correlated with the number of requests (a known query class performing at a regular speed). For the public cluster there is no such correlation, due to the variety of query classes (and implementations). Similarly, there is no visually noticeable correlation between query-time and request length (number of characters in the request), meaning that query length is not a good enough predictor of query complexity.
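The correlations above were assessed visually. One possible way to put a number on them is a Pearson correlation coefficient between two daily series, sketched below. The function name `pearson` and all sample values are hypothetical; no coefficient was computed in the original analysis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily request counts.
daily_requests = [100, 120, 90, 150, 130]

# Made-up series where every request costs the same time (a single query class):
# the summed query-time is proportional to the request count.
uniform_query_time = [count * 0.02 for count in daily_requests]

# Made-up series where the summed query-time is unrelated to the request count.
mixed_query_time = [50, 10, 20, 20, 50]

r_uniform = pearson(daily_requests, uniform_query_time)
r_mixed = pearson(daily_requests, mixed_query_time)
print(f"uniform query class: r = {r_uniform:.2f}")
print(f"mixed query classes: r = {r_mixed:.2f}")
```

A coefficient close to 1 would correspond to the pattern described for the internal cluster, and a coefficient close to 0 to the one described for the public cluster. The same function could be applied to query-time against request length.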