
Performance/Synthetic testing/Mobile phones: Difference between revisions

== Summary ==
To get more realistic performance metrics, we run tests on real mobile devices. That makes it easier to find performance regressions. We use BitBar as the provider of our devices. All tasks are reported under the [https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-lvz2m2dug73iurgns2qt&statuses=open()&group=none&order=newest#R Performance Device Lab tag] in Phabricator.


== Performance testing on mobile phones ==
When running tests on mobile phones, we want a stable environment that does not change, so we can see performance regressions. To make that happen we use:


* A stable network: We throttle the connection to look like a 3G or 4G connection. By limiting the upload/download speed and adding delay, we make the phone get requests in a more realistic scenario. By making sure the network is the same all the time, the network will not affect our metrics.
* Low phone temperature: We measure the battery temperature as a proxy for CPU temperature. Android phones change behavior when the CPU gets warm, and we want to avoid that. Some of our phones are rooted to make sure the phone has the same performance characteristics. We use the [https://github.com/sitespeedio/browsertime/blob/main/lib/android/root.js same settings] as the Mozilla performance team to set up the rooted phones. We measure the temperature before and after we start a test. If the temperature is too high we wait X minutes and try again.
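The temperature check above can be sketched as a small shell function on top of ADB. This is a minimal illustration, not our production script: the serial number, the 35-degree threshold, and the retry timing are example values.<syntaxhighlight lang="bash">
#!/bin/bash
# Sketch: wait until a phone's battery temperature is low enough to run tests.
# The serial, threshold and wait time are illustrative, not our exact setup.
check_battery() {
  local phone="$1"
  local wait_seconds="${2:-60}"
  local max_temp=350   # dumpsys reports tenths of a degree Celsius: 350 = 35.0

  if ! command -v adb >/dev/null 2>&1; then
    echo "adb not installed, skipping temperature check"
    return 0
  fi

  for attempt in 1 2 3; do
    # Same dumpsys query as in the troubleshooting section of this page
    local temp
    temp=$(adb -s "$phone" shell dumpsys battery | grep temperature | grep -Eo '[0-9]+' | head -1)
    if [ -n "$temp" ] && [ "$temp" -le "$max_temp" ]; then
      echo "Battery temperature OK ($temp)"
      return 0
    fi
    echo "Too warm ($temp), attempt $attempt, waiting ${wait_seconds}s"
    sleep "$wait_seconds"
  done
  echo "Temperature still too high, consider rebooting the phone"
}

check_battery "ZY322MMFZ2"
</syntaxhighlight>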


== Setup ==
We use five mobile phones, a server, and two wifi networks set up with throttled connections to simulate 3G and 4G traffic.
[[File:Performance Device Lab setup.png|center|frame|Setup showing the mobile performance device lab.]]The workflow: the jobs are started on the server, which runs sitespeed.io and drives the phones using WebDriver. The configuration and URLs to test exist in a public Git repo. The tests run on the phones and access Wikipedia; we record a video of the screen and analyze it to get visual metrics. The metrics are sent to our Graphite instance, and the test results (screenshots, HTML result pages) are stored on S3. We also run one test using WebPageReplay, where we record and replay Wikipedia locally on the server, to get metrics that are as stable as possible between runs.
 
===How to setup the server===
We use a Mac Mini to drive the phones. To get that to work we need the following software:
*NodeJS latest LTS (to be able to run sitespeed.io)
*ADB (to drive the Android phone)
*FFMPEG, ImageMagick 6, Python and pyssim to be able to analyze the video and get Visual Metrics
At Kobiton I've installed the software on two different servers, and they had different software preinstalled, so if you need to install it you may need to adjust the list. You need to have [https://brew.sh Homebrew] installed.
 
====Install NodeJS====
<code>brew install node@12</code>
 
<code>echo 'export PATH="/usr/local/opt/node@12/bin:$PATH"' >> ~/.zshrc</code>
 
====Install ImageMagick====
 
<code>brew install imagemagick@6</code>
 
<code>echo 'export PATH="/usr/local/opt/imagemagick@6/bin:$PATH"' >> ~/.zshrc</code>
 
====Install Pip and pyssim====
 
<code>curl https://bootstrap.pypa.io/get-pip.py --output get-pip.py</code>
 
<code>python get-pip.py</code>
 
<code>python -m pip install --upgrade --user setuptools</code>
 
<code>python -m pip install --user pyssim</code>
 
====Install ADB====
<code>brew cask install android-platform-tools</code>
 
====Install FFMPEG====
Download a [https://evermeet.cx/ffmpeg/ static build] of FFMPEG 4.X and add it to the path.
 
==== Install scrcpy ====
<code>brew install scrcpy</code>
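After installing everything, a quick way to confirm that the tools are on the PATH is a small check loop. This checks the commands mentioned above (<code>convert</code> is ImageMagick's CLI) and only reports status; it does not fail if something is missing:<syntaxhighlight lang="bash">
#!/bin/bash
# Print one status line per required tool, without failing if something is missing.
check_tools() {
  for tool in node adb ffmpeg convert python scrcpy; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "OK      $tool -> $(command -v "$tool")"
    else
      echo "MISSING $tool"
    fi
  done
}
check_tools
</syntaxhighlight>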
 
=== Access the server ===
To access the server you need to have [https://www.teamviewer.com TeamViewer] installed.
 
=== Start/stop tests on the server ===
Use the start script to start tests on each phone. If you run the start script without any parameters, tests on all phones are started:
 
<code>./start.sh</code>
 
If you want to start tests for a specific phone, add the name of the phone:
 
<code>./start.sh ZY2242HHQ8-Moto-G4</code>
 
Stop all the phones by running the stop script:
 
<code>./stop.sh</code>
 
The script will wait until all test processes have ended and then exit.
 
You can also stop tests on one phone, using the name:
 
<code>./stop.sh ZY2242HHQ8-Moto-G4</code>
 
The script will then wait for that phone to stop running its tests.
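The actual scripts live in the repo, but conceptually the waiting part of ''stop.sh'' is just a loop that polls for running test processes. A minimal sketch of that logic (the process pattern is an assumption for illustration, not the repo's exact implementation):<syntaxhighlight lang="bash">
#!/bin/bash
# Sketch of "wait until all test processes have ended".
# The pattern matched with pgrep is an example, not the real script's logic.
wait_for_tests() {
  local pattern="${1:-sitespeed}"
  while pgrep -f "$pattern" >/dev/null 2>&1; do
    echo "Tests still running, waiting..."
    sleep 30
  done
  echo "All test processes have ended"
}
wait_for_tests "$@"
</syntaxhighlight>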
 
=== Our setup ===
All our code lives in <code>~/wikimedia</code>
 
There are three main directories:
 
*<code>config</code> - here lives the main configuration file that holds secrets for AWS and Graphite and the generic phone setup (settings we want to be the same for all tests). Each phone can also override these settings.
*<code>chromedriver</code> - here we keep different versions of Chromedriver (the driver version needs to match the browser version). Each phone configuration then points to the driver version, so that we can update phones one by one and verify that the new Chromedriver works. For Firefox the driver works independently of the Firefox version, so there we can run the same driver everywhere.
*<code>performance-mobile-synthetic-monitoring-tests</code> - a clone of https://github.com/wikimedia/performance-mobile-synthetic-monitoring-tests where each phone has its own main directory, to make it easier to start/stop tests and keep track of logs. The directories are named by the unique id and the type of the phone, for example <code>ZY2242HHQ8-Moto-G4</code>. Each phone then has its own configuration file that sets the Chromedriver version and phone-specific configuration, which makes it possible to have generic start/stop scripts.
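To make the layering concrete, a per-phone configuration file could look roughly like the sketch below. All values are invented for illustration (the paths, serial number, and option names may differ from our real files; sitespeed.io config files can extend a shared base config):<syntaxhighlight lang="json">
{
  "extends": "/Users/tester/wikimedia/config/base.json",
  "browsertime": {
    "chrome": {
      "chromedriverPath": "/Users/tester/wikimedia/chromedriver/86/chromedriver",
      "android": {
        "deviceSerial": "ZY2242HHQ8"
      }
    }
  }
}
</syntaxhighlight>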
 
 
Let's look at the <code>performance-mobile-synthetic-monitoring-tests</code> folder. '''config''' holds the configuration files, '''tests''' the URLs and scripts used for testing, '''start.sh''' is the script used to start tests, and '''wpr''' has extra files needed to run tests using WebPageReplay.<syntaxhighlight lang="bash">
├── README.md
├── checkPhones.sh
├── clearApplications.sh
├── config
│   ├── ZY223LNPVM-Moto-Z.json
│   ├── ZY2242HHQ8-Moto-G4.json
│   ├── ZY2242HHQ8-Moto-G4.loginMobile.json
│   ├── ZY2242HHQ8-Moto-G4.searchMobile.json
│   ├── ZY2242HHQ8-Moto-G4.secondViewMobile.json
│   ├── ZY2242HJDX-Moto-G4.crawl.json
│   ├── ZY2242HJDX-Moto-G4.random.json
│   ├── ZY3222N2CZ-Moto-G5-Root-WebPageReplay.json
│   ├── ZY322HX8RC-Moto-G5.json
│   └── budget
│      └── cpu.json
├── loop.sh
├── npm-shrinkwrap.json
├── package.json
├── runTheTests.sh
├── start.sh
├── stop.sh
├── tests
│   ├── ZY2242HHQ8-Moto-G4
│   │   ├── enwiki.txt
│   │   ├── loginMobile.js
│   │   ├── searchMobile.googleObama.js
│   │   ├── searchMobile.pageObama.js
│   │   ├── secondViewMobile.elizabeth.js
│   │   └── secondViewMobile.facebook.js
│   ├── ZY2242HJDX-Moto-G4
│   │   ├── crawl.txt
│   │   └── random.js
│   ├── ZY3222N2CZ-Moto-G5-Root-WebPageReplay
│   │   └── enwiki.wpr
│   └── ZY322HX8RC-Moto-G5
│      └── enwiki.txt
└── wpr
    ├── README.md
    ├── deterministic.js
    ├── wprAndroid.sh
    ├── wpr_cert.pem
    └── wpr_key.pem
</syntaxhighlight>


=== Setup the phones ===
BitBar is handling the setup of the phones.


We have five phones running at the Performance Device Lab with the following setup.
{| class="wikitable"
!Id
!Type
!Internet connection
!Extras
!OS version
!Usage
|-
|ZY2242HJDX
|Moto G4
|Simulated 4g
|
|
|Running Chrome looking for CPU bottlenecks https://phabricator.wikimedia.org/T250686
|-
|ZY2242HHQ8
|Moto G4
|Simulated 3g
|Root
|7.1.1
|Running mobile and user journey tests
|-
|ZY3222N2CZ
|Moto G5
|Traffic back to the Mac Mini
|Root
|7.0
|Running against WebPageReplay
|-
|ZY322HX8RC
|Moto G5
|Simulated 3g
|
|8.1.0
|Running tests using Firefox
|-
|ZY223LNPVM
|Moto Z
|Simulated 3g
|
|8.0.0
|Used for testing out new things
|}
Using rooted phones makes it possible to stabilize CPU and GPU performance by configuring governors.
=== Logs ===
All log files from the tests are located in '''~/wikimedia/performance-mobile-synthetic-monitoring-tests/logs'''. If you need to debug things, that's the place to look. Each mobile phone logs in its own log file named after the phone. The phone '''ZY2242HHQ8-Moto-G4''' has a log file named '''ZY2242HHQ8-Moto-G4.log'''.
==== Log rotation ====
Logs are rotated nightly using '''newsyslog''' and you find the configuration in '''/etc/newsyslog.d/sitespeed.io.conf'''. The files are then moved to '''~/wikimedia/performance-mobile-synthetic-monitoring-tests/logs/archive''' by the script '''~/wikimedia/moveLogs.sh''' that runs in the crontab.
==== Log surveillance ====
All errors in the log files from the tests are sent to the '''#performance-team-synthetic-test-logs''' room on the Matrix server foundation.wikimedia.org. Ask the performance team for an invite to the room.
On the server there's a script in the '''wikimedia''' folder that sends the errors. Start it like this:
<code>nohup nice ./sendmatrix.sh > /tmp/s.log 2> /tmp/s.err < /dev/null &</code>
The script is started on reboot on the server using @reboot in the crontab.


== Performance tests ==
All configuration and setups for our tests live in Gerrit in [[gerrit:admin/repos/performance/mobile-synthetic-monitoring-tests|performance/mobile-synthetic-monitoring-tests]]. To add or change tests, clone the repo and send in your change for review.
At the moment we run two different kinds of tests: one phone tests against WebPageReplay (to make it easier to see front-end performance regressions) and one phone runs tests directly against Wikipedia.


=== Add a test ===
All configuration files exist in our synthetic monitoring tests repo. Clone the repo and go into the tests folder:

<code>git clone ssh://USERNAME@gerrit.wikimedia.org:29418/performance/mobile-synthetic-monitoring-tests.git</code>

<code>cd mobile-synthetic-monitoring-tests/tests</code>


Each phone has its own directory for the test files. That is important because each test needs to run on the same phone to get metrics that are as stable as possible. You can run three different kinds of tests:
* Access one URL with a cold browser cache - add a URL in a file with the file extension ''.txt''
* Use [https://www.sitespeed.io/documentation/sitespeed.io/scripting/ scripting] to measure a user journey -  add the script in a file  with the file extension ''.js''
* Test against WebPageReplay - add a URL in a file with the file extension ''.wpr''
 
If you want to add a new URL to test, add it to one of the existing ''.txt'' files. If you want to add a new user journey, create a new ''.js'' file with the script. If you want to add a new URL tested with WebPageReplay, add it to the existing ''.wpr'' file.
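As an illustration, a minimal ''.js'' user-journey script in sitespeed.io's scripting format could look like this (the URLs are examples, not necessarily what we test):<syntaxhighlight lang="javascript">
// Minimal sitespeed.io scripting example (illustrative).
// sitespeed.io calls the exported function with a context and a commands object.
module.exports = async function (context, commands) {
  // Navigate without measuring, e.g. to warm up connections
  await commands.navigate('https://en.m.wikipedia.org/');
  // Start a named measurement: navigates to the URL and collects metrics
  return commands.measure.start('https://en.m.wikipedia.org/wiki/Barack_Obama');
};
</syntaxhighlight>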
 
=== Change configuration ===
By default each phone has its own configuration file in the config directory. If you make changes to that file, you will change the configuration for all the tests that run on that phone. If you want specific configuration for one test, you can do that by adding a specific configuration file for that test.
 
=== Add configuration for a specific test ===
Configuration can be inherited from other configuration files, and the new file overrides the inherited configuration. Check out the configuration for the phone ZY2242HHQ8-Moto-G4: in the config folder, the base configuration for that phone exists in ''ZY2242HHQ8-Moto-G4.json''. Then look in the test folder for that phone. We have a test where we test as a logged-in user, and that test file is named '''loginMobile.js'''. We then have a specific configuration file for that test named '''ZY2242HHQ8-Moto-G4.loginMobile.json'''. The first part of the name tells us which phone runs the test, and the second part (after the dot), if present, is the name of the test.
 
=== Performance budget tests ===
On one of the phones we run some simple performance budget tests, where we signal if a page has more long tasks than X or a total blocking time higher than Y. The [https://www.sitespeed.io/documentation/sitespeed.io/performance-budget/#all-possible-metrics-you-can-configure configuration] for that performance budget is in '''config/budget/cpu.json'''.
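As a hedged sketch of what such a budget file can look like: the metric names and numbers below are illustrative assumptions, not the contents of our real '''config/budget/cpu.json'''; check the linked list of configurable metrics for the exact keys.<syntaxhighlight lang="json">
{
  "budget": {
    "timings": {
      "totalBlockingTime": 200
    },
    "cpu": {
      "longTasks": 5
    }
  }
}
</syntaxhighlight>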
 
=== Send result/errors to Matrix ===
The CPU Long Task tests send the budget result (URLs that do not pass the budget) to our Matrix instance (the room '''performance-team-mobile-budget'''). You can get an invite if you ask the performance team. The configuration for Matrix exists on the Mac Mini in <code>~/wikimedia/config/matrix-s3.json</code>.
 
== Update browser version ==
You can check which browser version is installed on the phone using ADB.
 
<code>adb -s ZY223LNPVM shell dumpsys package org.mozilla.firefox | grep versionName</code>
 
<code>adb -s ZY223LNPVM shell dumpsys package com.android.chrome | grep versionName</code>
 
=== Using ADB ===
Download the correct Chrome version from the [https://play.google.com/store/ Chrome Play Store].
 
Then install the APK on the phone using ADB:
 
<code>adb -s PHONE_ID  install -r chrome-86.apk</code>
 
=== Using scrcpy ===
If you want to use the scrcpy application, you need to stop all the tests that run (or the tests for the phone you want to inspect):
 
<code>./stop.sh</code>
 
so that the application does not compete with our tests. Then wait for all the tests to finish.
 
Start scrcpy and use the -s switch to connect to the right phone.
 
<code>scrcpy -s PHONE_ID</code>
 
When the application has connected, you will see a mirrored screen on the Mac Mini. Go to the Google Play store and update the browser (get the login details from Peter or Gilles). When all phones are updated, make sure you exit scrcpy before you start the tests again.
 
=== Update Chromedriver ===
Add a new Chromedriver version in <code>~/wikimedia/chromedriver</code>
 
Say that you want to install Chromedriver 86; then create a new directory: <code>mkdir ~/wikimedia/chromedriver/86</code>
 
and then download the 86 version from https://chromedriver.chromium.org (mac64 version). Unzip the file, run it, and set the security so it's trusted on the Mac Mini:
[[File:Open the Chromedriver anyway.png|center|thumb]]
Then put it in the newly created directory.
 
The next step is to update the configuration JSON file for each phone. First do it on one phone directly on the Mac Mini so you can verify that it works. Then do it in the git repo and push the change.


== Troubleshooting ==
== Troubleshooting ==
If something is wrong with the phones, first try running the ''checkPhones.sh'' script (it's in the root folder). It will give you useful info about the status of the phones:<syntaxhighlight lang="bash">
❯ ./checkPhones.sh
ZY223LNPVM XT1650 has a connection to the internet and the battery temperature is 24 degrees
ZY2242HHQ8 Moto G Play has a connection to the internet and the battery temperature is 26 degrees
ZY2242HJDX Moto G Play has a connection to the internet and the battery temperature is 24 degrees
ZY3222N2CZ Moto G (5) has a connection to the internet and the battery temperature is 30 degrees
ZY322HX8RC Moto G (5) has a connection to the internet and the battery temperature is 26 degrees
</syntaxhighlight>If one of the phones is not online or does not have an internet connection, ADB is your friend. Here's a [https://gist.github.com/Pulimet/5013acf2cd5b28e55036c82c91bd56d8 long list of useful ADB commands]. The first step is to log in to the server and run <code>adb devices -l</code>
=== The device is offline ===
If the device is offline you can see that in the log; look for ''The phone state is offline''. You can also verify it by running <code>adb devices</code>. Talk with the Wikimedia Performance team and we will contact Kobiton.
=== The device lost its wifi connection ===
How do you know if the device does not have a wifi connection? You will see it in the log file: look for messages with ''No internet connection. Could not ping XXX''; no tests will work. You can verify it by sending a ping from the device. When you use ADB, make sure you communicate with the right device. In this example the device is ''ZY322MMFZ2''.
<code>adb -s ZY322MMFZ2 shell ping www.wikipedia.org</code>
You can also look if the wifi is active, check for ''Current networks'' in the output from:
<code>adb -s ZY322MMFZ2 shell dumpsys connectivity</code>
If you don't have a network, the first step is to reboot the device using ADB.
<code>adb -s ZY322MMFZ2 reboot</code>
If that doesn't help, use '''scrcpy''' to connect to the phone and check what's going on. If you cannot connect to the wifi, contact Kobiton.
=== Check battery temperature ===
We avoid running tests if the battery temperature is above 35 degrees. That works as a proxy for the CPU temperature. Sometimes you want to check the temperature manually, and you can do that with ADB:
<code>adb -s ZY322MMFZ2 shell dumpsys battery | grep temperature | grep -Eo '[0-9]{1,3}'</code>


=== Missing data for a phone ===
If you see in the graphs in Grafana that data is missing, you need to check the logs on the server. Each phone has its own log. Say that the phone ZY322HX8RC-Moto-G5 has the problem: go to the directory '''performance-mobile-synthetic-monitoring-tests/logs''' and check the log file named '''ZY322HX8RC-Moto-G5.log'''. Look for error messages in the log file.
== Dashboards ==
We have three dashboards:
* A dashboard for WebPageReplay tests: https://grafana.wikimedia.org/d/Tw-s6iKMz/mobile-webpagereplay?orgId=1
* A drilldown where you can find all tests on real mobile devices: https://grafana.wikimedia.org/d/000000082/mobile-drilldown?orgId=1
* A summary page for the tests we run: https://grafana.wikimedia.org/d/bB670cFGk/mobile-performance-synthetic-testing


== Outstanding issues ==
There are a couple of issues that we need to fix in the current setup:
 
* Phones go offline sometimes. It can either be a problem with the USB hub/cables, or the phone battery shutting the phone off.
* <s>The Kobiton application (the GUI to access the phones) accesses all the phones all the time. The purpose of the GUI application is to run manual tests, so it reverts all changes made on the phone; we cannot use it to set things up (they will be reverted), except for updating Chrome and Firefox. However, it can be used for debugging when there's an error. At the moment, starting the Kobiton application can affect testing on all phones, so if we use it now, we need to stop the tests on all phones and then start the application. Remember to stop the application when you are finished and then start the tests again.</s> Avoid using the Kobiton application, please use scrcpy.
 

Revision as of 12:17, 27 September 2022
