

Revision as of 15:01, 9 August 2022 by imported>Krinkle (Document AWS console login)

WebPageTest is a web performance tool that uses real browsers to access web pages and collect timing metrics. The killer feature of WebPageTest is the metric called SpeedIndex, a measure of how fast the above-the-fold content is displayed. The Wikimedia Performance Team runs a private instance of WebPageTest on AWS, and the metrics we collect are graphed in Grafana. If you want to read more, see the official documentation for WebPageTest.


There are two standard ways of collecting timing metrics for web pages today: RUM (Real User Measurement) and synthetic testing. Collecting data using RUM means adding some JavaScript to pages. The script runs in users' browsers, collects some metrics (e.g. Navigation Timing and User Timing), and beacons the data back to our servers. The upside of RUM metrics is that they come from real users. The downside is that they vary a lot, which makes it hard to isolate changes and correlate them with causes. Furthermore, the metrics depend on latency and connection type. We are also missing good ways of measuring things that really matter, like first paint (only available in Chrome & IE11) and when content has finished loading within the current viewport.

Synthetic testing, on the other hand, tries to minimize the factors that can impact the metrics, to let us pinpoint the correlation between code changes and their impact on metrics. Synthetic tests try to run from the same location, with the same latency and the same browser, and to measure in the same way every time. Using WebPageTest as our synthetic tool also has another advantage: SpeedIndex, the best way today to measure when the above-the-fold content is ready for the user. The downside of synthetic testing is that the metrics aren't from real users.

We use the NavigationTiming extension to add our own script that collects RUM metrics, we run a private instance of WebPageTest to collect synthetic metrics, and we run Browsertime/WebPageReplay to collect synthetic metrics under isolated conditions.


Our setup as of January 2020 looks like this:

New WebPageTest setup.png


We use our WebPageTest instance to continuously monitor a couple of URLs. The URLs we test are listed in our Gerrit repo; if you want to add more URLs to monitor, submit a change there. If you want to spontaneously test a URL, you can do that on the public WebPageTest instance.

Find results using Grafana

You can find each run by looking in Grafana.

First make sure that you have turned on the annotations for "Show each test".

Click on each test to show them in Grafana.jpg

When that is turned on, you will see vertical green lines that represent each time WebPageTest collected metrics. Hover over one of the annotations with your mouse and you will see links and screenshots from that run. Click on the result link to get to the result. Click on WebPageTest to get to the WebPageTest instance with the result.

Hover on annotation in Grafana for WebPageTest.png

Find results using search

We automatically run WebPageTest continuously to collect metrics and send them to Graphite. If you want to look at a specific test, go to the WebPageTest instance and choose Show tests from all users. You will then see all the test runs for the last 30 days. You can change the time span by changing the View and choosing Update list.

Caution: Choose what to see on the result page

By default WebPageTest picks the median run based on pageLoadTime. That's not optimal because we want to focus on SpeedIndex or start render time. By adding parameters to the URL of the result page, you can choose which run and metric are used to pick the median run. Choose which metric to use and whether you want the median or the fastest run.
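To illustrate what "the median run of a metric" means, here is a minimal sketch in Python. It is not WebPageTest's own code: the function name `pick_run`, the shape of the run records and the sample values are all hypothetical, and taking the lower middle run for an even number of runs is an assumption of the sketch.

```python
def pick_run(runs, metric="SpeedIndex", mode="median"):
    """Return the run chosen for `metric`.

    mode="median" picks the middle run after sorting by the metric
    (the lower middle one for an even number of runs, an assumption
    of this sketch); mode="fastest" picks the run with the lowest value.
    """
    ranked = sorted(runs, key=lambda run: run[metric])
    if not ranked:
        raise ValueError("no runs to choose from")
    if mode == "fastest":
        return ranked[0]
    if mode == "median":
        return ranked[(len(ranked) - 1) // 2]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical results from three runs of the same URL.
runs = [
    {"run": 1, "loadTime": 2600, "SpeedIndex": 1500},
    {"run": 2, "loadTime": 1900, "SpeedIndex": 1800},
    {"run": 3, "loadTime": 2500, "SpeedIndex": 1200},
]

print(pick_run(runs)["run"])                      # median by SpeedIndex
print(pick_run(runs, metric="loadTime")["run"])   # median by loadTime
print(pick_run(runs, mode="fastest")["run"])      # fastest by SpeedIndex
```

With this sample data the median run differs depending on the metric, which is why the choice of metric matters when you compare results.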





WebPageTest and AWS

WebPageTest consists of two separate entities: a server and one or more agents. On AWS there are ready-made AMIs (prepared images) for both, so it is easy to click and deploy.

WebPageTest can run headless or not. Headless in this context means that there is no GUI available to start a test, so you need to use the API to submit tests.

AWS Console

We mainly use dedicated IAM-style logins (separate from general Amazon or AWS accounts).

  • Account ID: "wikimedia" (or use
  • IAM user name, typically matching the WMF shell name or name portion of address.
  • MFA is required. It can be enabled from the profile under "Security credentials > Manage MFA devices".

Login to the servers

Log in to the WebPageTest server:

ssh -i webpagetest.pem

Log in to the WebPageTest job runner:


Log in to the WebPageTest agent:

ssh -i WebPageTestAgent.pem ubuntu@

Setup the server

It can be hard to find the right AMI. For a server in us-west we use AMI ID ami-d7bde6e7.

  1. Find the right AMI (ami-d7bde6e7) under Images/AMI and pick it (make sure to choose Public images).
  2. Choose Launch and use type t2.micro. Make sure to choose Next: Configure Instance Details.
  3. Go to the Advanced Details section and add the configuration for the server. Make sure you change all the secret placeholders to the real values and choose Next: Choose storage.
  4. There is nothing you need to do here; choose Next: Tag Instance.
  5. Add the tag Name with the value WebPageTest Server and then choose Next: Configure Security Group.
  6. Change the SSH access to only allow our IP range.
  7. Add access for HTTP by choosing Add Rule: use the dropdown, choose HTTP and keep the default for the rest of the values.
  8. Choose Review and launch and then Launch. You will be asked to choose an existing key pair or create a new one. Create a new pair (name it webpagetest) and download it. You need the keys to be able to SSH to the server, so make sure to download them.
  9. Attach the Elastic IP to the new server: go to NETWORK & SECURITY/Elastic IPs, choose the IP and Associate Address (we use ip
  10. Use the tag WebPageTest Server in the instance field and choose Associate.
  11. You should now be able to access the server and see the "headless" start page.

Login to the server

ssh -i webpagetest.pem


These are the configuration details that you use in the Advanced Details section.

; the key used when starting tests

; no GUI for submitting tests, but we can check the results

; Define maximum runs per URL

; Quality of images, let's set this to something good

; save png full-resolution screen shots

; automatically update the agent when a new version is available

; needed for autoscaling

; keep an instance up and running

; how long to keep tests locally before sending them to S3

; archiving to s3 (using the s3 protocol, not necessarily just s3)

Setup S3

WebPageTest can automatically store the test results on S3 and that is perfect for us so we can drop the server instance whenever we want.

To setup S3 (these are the instructions to do it the first time):

  1. Log into the AWS console and choose S3
  2. Choose Create a bucket
  3. Add a Bucket name and name it wpt-wikimedia (the bucket name needs to correspond to the property archive_s3_bucket when you configure the server).
  4. Add the Region. We use Oregon and that matches the configuration property archive_s3_server key.
  5. Choose Create and we have created the bucket.
  6. The next step is to set up the properties on the bucket, meaning giving access for HTTP traffic and for the server to upload the test results.
  7. Choose your bucket (webpagetest) and choose Properties/Permissions.
  8. Choose Add more permissions and add Authenticated Users as Grantee and give it Upload/Delete permissions.
  9. Then we need to configure how long we want to store the data.
  10. Choose Lifecycle and Add rule.
  11. Apply the rule for the whole bucket
  12. Choose Permanently Delete and 370 days.
  13. Choose Review and add a Rule Name: Permanently remove tests after 370 days
  14. Choose Create and Activate Rule

We are now finished setting up S3.

Setup EBS

WebPageTest is stateless and stores everything on file. To be able to find old tests, WebPageTest uses a log file. The log file is not backed up to S3, so to be able to find old tests if the server is dropped, we need to store the logs on a separate disk.

  1. Choose Elastic Block Store / Volumes
  2. Choose Create Volume
  3. Choose a Size in GiB (the lowest, 30 GB, will do fine)
  4. Choose Availability Zone. Use the same as the server
  5. Leave everything else as the default and choose Create.
  6. Choose the radio button for the newly created volume and the Tag label.
  7. Add a new Tag with the key Name and the value WebPageTest logs and choose Save.
  8. Make sure the volume is selected with the radio button and choose Action/Attach Volume.
  9. Choose WebPageTest server as the instance and use the default Device and choose Attach.
  10. The volume is now attached to our server, the next step is to login to the server and make sure that the logs are stored on the device.
  11. Use the pem file for the server and log in: ssh -i NAME.pem ubuntu@SERVER_IP (replace NAME.pem with the name of your pem file and SERVER_IP with the real IP).
  12. Follow the instructions for mounting a volume and mount the device on /data.
  13. Now that your volume is mounted, the next step is to replace the WebPageTest log dir with a symbolic link to a directory that exists on the volume. If you haven't run any tests, the directory should be empty except for a .htaccess file.
  14. Make your new directory on the mounted device: sudo mkdir /data/logs
  15. Move the access file: sudo mv /var/www/webpagetest/www/logs/.htaccess /data/logs
  16. Remove the old directory: sudo rm -fR /var/www/webpagetest/www/logs/
  17. Make the symbolic link: sudo ln -s /data/logs /var/www/webpagetest/www/logs
  18. Make sure we have the right owner for the directory: sudo chown -h www-data:www-data /data/logs
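The directory swap in the last steps (create the new directory, move the contents, remove the old directory, create the symbolic link) can be sketched in Python to make the order of operations explicit. This is only an illustration under stated assumptions: `relocate_logs` is a hypothetical helper, it moves every file rather than only .htaccess, it does not change ownership, and the demo runs in a temporary directory instead of the real /var/www and /data paths.

```python
import os
import shutil
import tempfile


def relocate_logs(log_dir, volume_dir):
    """Move the contents of log_dir to volume_dir and leave a symlink behind."""
    if os.path.islink(log_dir):
        return  # already relocated, nothing to do
    os.makedirs(volume_dir, exist_ok=True)
    for name in os.listdir(log_dir):
        shutil.move(os.path.join(log_dir, name), os.path.join(volume_dir, name))
    os.rmdir(log_dir)                # the directory is empty at this point
    os.symlink(volume_dir, log_dir)  # old path now points at the volume


# Demo in a throwaway directory tree (hypothetical layout).
with tempfile.TemporaryDirectory() as root:
    logs = os.path.join(root, "www", "logs")
    volume = os.path.join(root, "data", "logs")
    os.makedirs(logs)
    with open(os.path.join(logs, ".htaccess"), "w") as handle:
        handle.write("# placeholder\n")
    relocate_logs(logs, volume)
    is_link = os.path.islink(logs)
    moved = os.listdir(volume)
    via_link = os.listdir(logs)

print(is_link, moved, via_link)
```

The point of the sketch is that the old path keeps working after the swap: anything that reads or writes the original log directory ends up on the mounted volume.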

Connectivity profiles

Depending on the AMI image, it could be that we are missing connectivity profiles: 3GFast, 3GSlow and 2G. If they are missing, you should add them in /var/www/webpagetest/www/settings/connectivity.ini

label="Mobile 3G - Fast (1.6 Mbps/768 Kbps 150ms RTT)"

label="Mobile 3G - Slow (780 Kbps/330 Kbps 200ms RTT)"

label="Mobile 2G (280 Kbps/256 Kbps 800ms RTT)"
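A quick way to see which profiles are missing is to parse the file and compare its sections with the three names above. The following is a minimal sketch using Python's standard `configparser`; it assumes the profiles are INI sections named 3GFast, 3GSlow and 2G, and `missing_profiles` and the sample text are hypothetical.

```python
import configparser

REQUIRED_PROFILES = ("3GFast", "3GSlow", "2G")


def missing_profiles(ini_text, required=REQUIRED_PROFILES):
    """Return the required section names that ini_text does not define."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    return [name for name in required if not parser.has_section(name)]


# Hypothetical file content that only defines the fast 3G profile.
sample = """
[3GFast]
label="Mobile 3G - Fast (1.6 Mbps/768 Kbps 150ms RTT)"
"""

print(missing_profiles(sample))
```

On the server you would pass in the content of /var/www/webpagetest/www/settings/connectivity.ini instead of the sample string.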


The username and password for the master AWS account are recorded in iron:/srv/passwords/aws-webpagetest. Please avoid using this account directly. Ask an existing maintainer to create an IAM user for you instead. The IAM users sign-in link for this account is .


If something isn't working for you on the WebPageTest server instance, you can find the logs here (yep, they are in different locations)


Restart the server

If for some reason you want to restart the server (normally if you manually changed settings in /var/www/webpagetest/www/settings/settings.ini) restart nginx:

sudo service nginx restart

Archive old tests

Old test data is automatically sent to S3, but we also need to remove old tests from the server. The easiest way to do that is to request the archive page, which we do from the crontab. Edit the crontab (as the ubuntu user): crontab -e

0 * * * * curl -sS >> /tmp/cron.txt

We then run archiving every hour.

CloudWatch

We use Amazon CloudWatch to keep track of the disk space on the WebPageTest server. You need to install a couple of libraries to get it up and running; follow the instructions. Then add one line to your crontab to start sending the metrics to Amazon.

*/5 * * * * ~/aws-scripts-mon/ --disk-space-used --disk-space-avail --disk-path=/ --from-cron

Then set up an alarm for the disk space. The current alarm warns (by sending an email to the web perf list) when only 2 GB of disk space is free.
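For a quick manual check on the server itself, the same threshold can be evaluated locally. This is a minimal sketch using only the Python standard library. It does not talk to CloudWatch, the function names are hypothetical, and treating "2 GB" as 2 GiB is an assumption.

```python
import shutil

# 2 GiB, approximating the "2 GB free" threshold of the alarm (assumption).
LOW_DISK_BYTES = 2 * 1024**3


def is_low_on_disk(free_bytes, threshold=LOW_DISK_BYTES):
    """Return True when the free space is below the threshold."""
    return free_bytes < threshold


def path_is_low_on_disk(path="/"):
    """Check the file system that holds `path`."""
    return is_low_on_disk(shutil.disk_usage(path).free)


print(path_is_low_on_disk("/"))
```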

Setup the agents

We have an agent up and running in us-east-1 on EC2, called us-east. That's the one we use for now. If you need to set up your own instance, do it like this:

Illustration of Step 3.
  1. In the AWS console, choose Launch instance and then Community AMIs. There you can find the prepared AMI. You can find all prepared AMIs here (you need a specific one per location). Make sure to choose the Linux AMI.
  2. Choose the instance size c5.xlarge.
  3. Then you need to make sure that your instance can talk to the WebPageTest server. You do that by adding the configuration in the Advanced Details text field. The SECRET key is available on your WebPageTest server. wpt_loc=us-east wpt_url= wpt_key=SECRET

Then you can start your agent.

The next step is to configure the WebPageTest server so that it knows about the agent. You do that in /var/www/webpagetest/www/settings/locations.ini. That file is parsed with the ec2_locations.ini file and the result is the configured agents.


label="Linux US east 1"

browser=Chrome,Chrome Beta,Chrome Canary,Firefox,Firefox Nightly,Opera,Opera Beta,Opera Developer
label="Linux US east"

You can read more about how to configure the locations.ini file at

Once you have started your agent and changed the locations.ini file, restart nginx on the WebPageTest server:

sudo service nginx restart

And then verify that you can see your instance at

Connect to an agent

You can ssh to the agent with the WebPageTestAgent.pem file. You can find the IP of the agent on AWS.

Timeout and agents not responding

A couple of times we have seen one of the agents just stop working. You can tell because all tests time out, and if you check the latest finished results on the WebPageTest instance, the report will say that the agent couldn't be contacted by the server. To fix that, log in to the AWS console, go to EC2 management, make sure you are in the "US East" region, choose the agent (it is named WebPagetest Agent), and choose Instance state -> Restart.

Add a new URL to test

All configuration files exist in our synthetic monitoring tests repo. Clone the repo and go into the tests folder:

git clone ssh://
cd synthetic-monitoring-tests/tests

All test files live in that directory. WebPageTest tests exist in two directories:

  • desktop - the tests that test desktop URLs. All these tests run on one machine.
  • emulatedMobile - the URLs that get tested for emulated mobile. All these tests run on one machine.

The directory structure looks like this. Each file contains the URLs that are tested for that wiki (emulated mobile only tests on enwiki):

└── webpagetest
        └── webpagetest
            ├── desktop
            │   └── urls
            │       ├── webpagetest.beta.txt
            │       ├── webpagetest.enwiki.txt
            │       ├── webpagetest.ruwiki.txt
            │       └── webpagetest.wikidata.txt
            └── emulatedMobile
                └── urls
                    └── webpagetestEmulatedMobile.txt

Let's have a look at webpagetest.enwiki.txt. It contains three URLs: Banksy

If you want to add a new URL to be tested on the English Wikipedia, open webpagetest.enwiki.txt, add the URL on a new line and commit the result. Once the commit has passed Gerrit review, the URL will automatically be picked up by the test agent on the next iteration.
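Editing the file by hand is fine for a single URL. If you script it, it helps to avoid adding a URL that is already listed. Here is a minimal sketch in Python, assuming the file holds one URL per line; `add_url` is a hypothetical helper and the URLs in the demo are examples only, not the real content of the file.

```python
import os
import tempfile


def add_url(path, url):
    """Append url to the file at path unless it is already listed.

    Returns True when the file was changed.
    """
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    if url in (line.strip() for line in lines):
        return False
    lines.append(url)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return True


# Demo on a temporary copy with example URLs.
with tempfile.TemporaryDirectory() as tmp:
    url_file = os.path.join(tmp, "webpagetest.enwiki.txt")
    with open(url_file, "w", encoding="utf-8") as handle:
        handle.write("https://en.wikipedia.org/wiki/Banksy\n")
    first = add_url(url_file, "https://en.wikipedia.org/wiki/Sweden")
    second = add_url(url_file, "https://en.wikipedia.org/wiki/Sweden")
    with open(url_file, encoding="utf-8") as handle:
        listed = handle.read().splitlines()

print(first, second, listed)
```

The second call leaves the file untouched because the URL is already present. The change still has to be committed and go through Gerrit review as described above.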

Bulk test

If you want to compare performance before and after a change, it's super important to run the test many times to get reliable values. Use WPTBulkTest for that. Make sure to set up a new agent for your bulk test!


At the moment our test instance is busy running our continuous performance tests, which we graph in Grafana. We run one test agent to minimize costs. If you want to use the instance to run your own one-shot tests, the Performance Team can help you with that. You will need the key for the instance and must choose which location you want to use. We can then help you verify that the location is set up with the correct instance type.

Alert setup

We run automatic tests every hour (you can find the tests here). We mainly test the English Wikipedia: 3 desktop URLs using Chrome, the same URLs using Firefox, and the mobile version on emulated mobile. You can see how we graph the metrics (and alert on regression):

Our main focus is testing with an empty browser cache, but we also run tests with multiple page views (first hitting one page and then another) and as authenticated users. The metrics are too unstable at the moment to add alerts, but we hope we can do that in the future.

Keeping WebPageTest up to date and running

There are a couple of gotchas with WebPageTest:

  • WebPageTest stores everything as files on disk (there's no database). All the tests are recorded in log files in /var/www/webpagetest/www/logs. That directory needs to be backed up and/or kept on a shared disk in case something happens to the WebPageTest server. The search functionality in WebPageTest searches the log files to find tests, which is why there is a 7-day limit by default (see T263838).
  • WebPageTest is updated automatically by a git pull from WebPageTest. Or at least that is the official way. It has happened that tests break because of a bad update, or an update that needed changes in both the environment and the code. We do not do automatic git pulls at the moment and only update when needed. The main reason is that we used to have "patched" code (such as removing third-party requests, fixing the hard-coded 7-day search limit, etc.).
  • The WebPageTest agent is updated automatically and new browser versions are rolled out automatically. That is good because we don't need to do anything manually, but it's bad because it's uncontrolled (we don't know when it happens) and new browser versions can break testing. We have tried to get the Docker version to work (T192050), but we never got Firefox to work.

Debug missing metrics

The most common problem is that metrics stop arriving in Graphite/Grafana and an alert fires. If you check Grafana it will look something like this:

What's wrong? It could be one of these problems:

  1. Graphite isn't working correctly.
  2. The runner that adds the tests to WebPageTest has stopped.
  3. The WebPageTest server is down or has problems.
  4. The WebPageTest agent that runs the actual tests has problems.

Verify that Graphite works

WebPageTest sends metrics to Performance/Graphite/Synthetic Instance. If you see old data from WebPageTest and other tools (tests running WebPageReplay or our performance device lab) get data into Graphite, Graphite isn't the problem.

Verify that the runner works

Verify that the job that adds the work to WebPageTest really works. Log in to wpt-runner.webperf.eqiad.wmflabs and check the log file /tmp/ for errors. Can you see jobs being added to WebPageTest? It will look something like this:

[2020-10-07 17:17:38] INFO: Sending url to test on

If you see tests being added in the last few hours, we know that the runner works.
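If the log is long, the check can be scripted. Here is a minimal sketch in Python that finds the newest "Sending url" entry and compares it with the current time. It assumes the log line format shown above; the function names and the three-hour window are assumptions, and the third sample line is made up for the demo.

```python
import re
from datetime import datetime, timedelta

LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+): (.*)$")


def latest_submission(lines):
    """Return the timestamp of the newest "Sending url" entry, or None."""
    newest = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or not match.group(3).startswith("Sending url"):
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if newest is None or stamp > newest:
            newest = stamp
    return newest


def runner_is_active(lines, now, max_age=timedelta(hours=3)):
    """Return True when a test was submitted within max_age of `now`."""
    newest = latest_submission(lines)
    return newest is not None and now - newest <= max_age


sample_log = [
    "[2020-10-07 16:17:40] INFO: Sending url to test on",  # made-up entry
    "[2020-10-07 17:17:38] INFO: Sending url to test on",
    "[2020-10-07 17:21:44] INFO: Got analysed with id 201007_3Y_2A3 from",
]

print(latest_submission(sample_log))
print(runner_is_active(sample_log, now=datetime(2020, 10, 7, 18, 0, 0)))
```

In real use you would read the lines from the runner's log file and pass `datetime.now()` as `now`.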

Verify that the WebPageTest server works

Log in to wpt-runner.webperf.eqiad.wmflabs. Look for recent log entries that look like:

  • [2020-10-07 17:17:38] INFO: Sending url to test on
  • [2020-10-07 17:21:44] INFO: Got analysed with id 201007_3Y_2A3 from

If you have log entries like that, everything is fine on the WebPageTest side and the problem must be with sending the metrics to Graphite.

If you see log entries that look like:

[2020-10-02 00:00:16] ERROR: Could not run test for WebPageTest {"name":"WPTAPIError","code":500,"message":"Internal Server Error"}

Then something is broken on the WebPageTest server or the agent.

If you see log entries like:

[2020-10-07 07:05:16] ERROR: The test for WebPageTest timed out. Is your WebPageTest agent overloaded with work? You can try to increase how long time to wait for tests to finish by configuring --webpagetest.timeout to a higher value (default is 600 and is in seconds).  {"error" {"code":"TIMEOUT","testId":"201007_J7_24F","message":"timeout"}}

That usually happens when there's a major queue of jobs for WebPageTest. For normal usage that would never happen. It can happen if the server or agent has been down and the runner has been adding jobs. If that's the case you can safely remove all items in the queue. Log in to the WebPageTest server and remove all jobs (files) in /var/www/webpagetest/www/work/jobs/us-east/.
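The three kinds of log entries described in this section can be told apart mechanically. Here is a minimal sketch in Python that classifies runner log lines, assuming the formats shown above; the category names and the functions `classify` and `summarize` are hypothetical, and the sample timeout line is shortened.

```python
import re
from collections import Counter

LOG_RE = re.compile(r"^\[[^\]]+\] (INFO|ERROR): (.*)$")


def classify(line):
    """Map one runner log line to a rough category."""
    match = LOG_RE.match(line.strip())
    if not match:
        return "unknown"
    level, message = match.groups()
    if level == "INFO":
        return "ok"
    if "timed out" in message:
        return "timeout"  # usually a long job queue
    if "Could not run test" in message:
        return "server-or-agent-error"
    return "other-error"


def summarize(lines):
    """Count how many lines fall into each category."""
    return Counter(classify(line) for line in lines)


sample_lines = [
    "[2020-10-07 17:17:38] INFO: Sending url to test on",
    '[2020-10-02 00:00:16] ERROR: Could not run test for WebPageTest '
    '{"name":"WPTAPIError","code":500,"message":"Internal Server Error"}',
    "[2020-10-07 07:05:16] ERROR: The test for WebPageTest timed out. "
    "Is your WebPageTest agent overloaded with work?",
]

print(summarize(sample_lines))
```

Many "timeout" entries in a row point at the job queue described above, while "server-or-agent-error" entries point at the WebPageTest server or the agent.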

Verify that the agent works

TODO: it has been a long time since the agent last stopped working. The next time it happens, let's document how to detect it.

Verify the fix

The best way is to wait for a couple of hours and then check the graphs again in Grafana. Do we get the metrics? Great, it works. If we don't get the metrics, start to check the logs again.