
Performance/Runbook/Measure backend performance

Revision as of 17:02, 1 June 2022 by Krinkle

This page provides an entry point for measuring the performance of existing code.

If you are starting a new project or otherwise have not yet set performance objectives, first read our guidance on Backend performance.

Production impact

Once code is in production, either during an incident or in the hours/days after deploying a new feature to a large wiki, the following are good starting points for measuring potential impact:

  • Grafana: Application Servers RED, this provides an overview for the MW server cluster as a whole, in particular look for response duration ("latency").
  • Grafana: MediaWiki Exceptions, counts of production errors as reported to Logstash.
  • Grafana: WANObjectCache. Memcached should be accessed via the WANObjectCache interface. Among many hidden operational benefits, this also provides rich telemetry on how groups of cache keys are behaving. Use the "by keygroup" breakdown in Grafana for your feature, and look for "cache hit rate" and "regeneration time". This measures the decisions and time taken by getWithSetCallback.
  • Logstash: mediawiki-errors (restricted), details of those production errors (learn more: OpenSearch Dashboards).
  • Logstash: slow queries (restricted), details of database queries from MW that exceed general performance thresholds.
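The WANObjectCache metrics described above correspond to code using getWithSetCallback. A minimal sketch of that pattern follows; the key group name 'widget-data' and the Widget::loadFromDatabase() helper are illustrative placeholders, not real MediaWiki names:

```php
use MediaWiki\MediaWikiServices;

// Sketch only: 'widget-data' and Widget::loadFromDatabase() are
// illustrative placeholders, not real names.
$cache = MediaWikiServices::getInstance()->getMainWANObjectCache();
$value = $cache->getWithSetCallback(
	$cache->makeKey( 'widget-data', $id ),
	$cache::TTL_HOUR,
	static function () use ( $id ) {
		// Time spent here shows up as "regeneration time" in Grafana,
		// and hits/misses feed the "cache hit rate" panel.
		return Widget::loadFromDatabase( $id );
	}
);
```

The "by keygroup" breakdown in Grafana groups these metrics by the first component passed to makeKey().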

New code generally rolls out over the course of a week, each week, starting with smaller wikis, and moving to higher traffic sites like Wikipedia. This naturally ramps up and load-tests changes to all code as part of our Train deployment process. See wikitech:Deployments/One week for more information.

The flame graphs linked above include the deployment branch (train week), which helps orient which version of the code is affected, and allows for week-over-week comparison by loading each in a separate tab. When comparing week over week, make sure to pick a day where the majority of the flame graph is from a single deployment branch. If the graph is clearly split between two major versions, pick an earlier day instead.

In addition to the above continuous monitoring, you can also use WikimediaDebug to capture a performance profile of a relevant user action or web request from before and after a deployment, and compare it with more details that way.

Local development

You can capture detailed trace logs, timing measures, and flame graphs from your local MediaWiki install.

If you use MediaWiki-Docker, the packages needed are already installed and you can follow the MediaWiki-Docker/Configuration recipes/Profiling page instead. Otherwise, refer to Manual:Profiling for how to install the relevant packages in your local development environment.

It is recommended that you include the DevelopmentSettings.php preset in your LocalSettings.php file. This is done for you by default in MediaWiki-Docker. Among other things, this enables a medium baseline of various debug modes. There is an additional section of commented-out ad-hoc debugging settings that you can copy to LocalSettings.php as well, and enable as you need them (such as $wgDebugToolbar and $wgDebugDumpSql, referred to below).
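For example, the relevant LocalSettings.php additions might look like this (a sketch; the path follows the standard MediaWiki layout):

```php
# LocalSettings.php
# Medium baseline of debug modes (included by default in MediaWiki-Docker):
require_once "$IP/includes/DevelopmentSettings.php";

# Ad-hoc debugging, enable only while you need it:
$wgDebugToolbar = true;  # per-page overview of SQL queries, timing, and log entries
$wgDebugDumpSql = true;  # log all SQL queries to the debug log file
```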

Database queries

As you develop your database queries, use EXPLAIN and MySQL's DESCRIBE statements to find which indexes are involved in a particular query.
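For example, against MediaWiki core's page table (column and index names as in the current schema; verify against your own version):

```sql
-- Check the query plan before shipping a query.
EXPLAIN SELECT page_id, page_title
FROM page
WHERE page_namespace = 0
  AND page_title = 'Main_Page';
-- In the output, "key" should name an index (here the name_title
-- unique index); "type: ALL" would indicate a full table scan.
```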

You can find out exactly which queries are coming from your code by enabling $wgDebugToolbar in LocalSettings (see also Manual:How to debug). This provides an overview of all queries from a given page. For API or other miscellaneous web requests, you can consult the debug log file, which logs all SQL queries when $wgDebugDumpSql is enabled.

When adding a new query to your code (e.g. via the Database::select() helper from our Rdbms library), try to run a version of those queries at least once with the EXPLAIN statement, and make sure that it is effectively using indexes. While a select query without an index may run fast for you locally, it will perform very differently when there are several billion objects in the database.
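A sketch of such a query via the Rdbms library, in the SelectQueryBuilder style available since MediaWiki 1.35 (the conditions and limit here are illustrative):

```php
use MediaWiki\MediaWikiServices;

// Read from a replica via the load balancer.
$dbr = MediaWikiServices::getInstance()
	->getDBLoadBalancer()
	->getConnection( DB_REPLICA );
$res = $dbr->newSelectQueryBuilder()
	->select( [ 'page_id', 'page_title' ] )
	->from( 'page' )
	->where( [ 'page_namespace' => NS_MAIN ] )
	->orderBy( 'page_title' )
	->limit( 50 )
	->caller( __METHOD__ )
	->fetchResultSet();
// Run the equivalent SQL with EXPLAIN at least once to confirm
// it uses the expected index.
```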

With the Debug toolbar enabled, look out for:

  • repeat queries, data should be queried authoritatively once by a service class and then re-used or passed around as needed. If two unrelated callers to a service class regularly need the same data, consider an in-class cache, and limit the size of this cache to avoid uncontrolled growth (e.g. in API batch requests, jobs, or CLI scripts). Even if you don't have a UI for batch operations, a higher-level feature may still cause your code to be called in a loop. We provide MapCacheLRU and HashBagOStuff in MediaWiki core to make it easy to ad-hoc keep a limited number of key-value pairs in a class instance.
  • generated queries, if you see many similar queries that differ by a single variable, they may be coming from a loop that should instead query the data in a batch upfront.
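An in-class cache along those lines might look like this; MapCacheLRU is real, while WidgetLookup and loadFromDatabase() are hypothetical names for illustration:

```php
// Sketch: bound an in-class cache with MapCacheLRU so repeat calls in
// one request (e.g. an API batch) don't repeat the same database query.
// WidgetLookup and loadFromDatabase() are hypothetical.
class WidgetLookup {
	/** @var MapCacheLRU */
	private $cache;

	public function __construct() {
		// Keep at most 100 entries to avoid uncontrolled growth.
		$this->cache = new MapCacheLRU( 100 );
	}

	public function getWidget( int $id ) {
		$key = (string)$id;
		if ( !$this->cache->has( $key ) ) {
			$this->cache->set( $key, $this->loadFromDatabase( $id ) );
		}
		return $this->cache->get( $key );
	}
}
```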

For more details, see also: Roan Kattouw's 2010 talk on security, scalability and performance for extension developers, Roan's MySQL optimization tutorial from 2012 (slides), and Tim Starling's 2013 performance talk.

  • You must consider the cache characteristics of your underlying systems and modify your testing methodology accordingly. For example, if your database has a 4 GB cache, you'll need to make sure that cache is cold, as otherwise your data is likely still in the cache from previous queries.
  • Particularly with databases, but in general, performance is heavily dependent on the size of the data you are storing (as well as caching); make sure you do your testing with realistic data sizes.
  • Spinning disks are really slow; use caching or solid state storage whenever you can. However, as the data size grows, the advantage of solid state (avoiding seek times) is reduced.

Beta Cluster

The Beta Cluster is hosted in Wikimedia Cloud. This is a good place to detect functional problems, but may not be a representative environment for performance measures as it runs in a virtualised multi-tenant environment. In practice, the machines are less powerful than production and often under heavy load. See also T67394.

See also


Portions of this page were copied from "Performance profiling for Wikimedia", as written by Sharihareswara (WMF) in 2014.