Analytics/Systems/Matomo: Difference between revisions
imported>Neil P. Quinn-WMF (Copyedit. Consolidate info under the name Matomo. Explain when to use in the intro.) |
imported>Neil P. Quinn-WMF (Clarify which username to use) |
Revision as of 14:59, 8 January 2020
Matomo (formerly known as Piwik) is a web analytics platform which we use for microsites (roughly 10,000 requests per day or less). Our production instance can be reached at https://piwik.wikimedia.org. It has two-layer authentication: first LDAP credentials, then a Matomo-specific username and password.
Access
To access Matomo, you need wmf or nda LDAP access. For more details, see Analytics/Data access#LDAP access.
If you have that access, you can log in at piwik.wikimedia.org with your Wikitech username and password.
After the LDAP login, there is a second login form that we don't need but cannot easily remove. To log in, use the username design and password design.
How to instrument
Matomo does some tracking out of the box, such as counting pageviews and unique devices. You can instrument further using the Matomo (formerly Piwik) JavaScript tracking API:
https://developer.matomo.org/api-reference/tracking-javascript
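As an illustration of instrumenting beyond pageviews, here is a minimal sketch that records a custom event through the _paq queue, the same queue used by the generated tracking code shown under Administration. The helper name trackDownload, the category 'Downloads', the action 'click' and the file name are hypothetical examples and are not taken from any production site. trackEvent is part of the Matomo JavaScript tracker API.

```javascript
// Shared command queue read by piwik.js; same declaration as in the tracking snippet.
var _paq = _paq || [];

// Hypothetical helper: record a download click as a Matomo event.
// 'Downloads' and 'click' are example category/action names (assumptions).
function trackDownload(fileName) {
  // trackEvent(category, action, [name], [value])
  _paq.push(['trackEvent', 'Downloads', 'click', fileName]);
}

// Example call, e.g. from a click handler on a download link.
trackDownload('annual-report.pdf');
```

Commands pushed before piwik.js has loaded simply wait in the array and are processed once the tracker is ready, so the helper can be called at any point after the queue is declared.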
Administration
When a team requests a Piwik beacon:
- Go to Piwik and log in with the admin user
- Click Settings
- Websites -> Manage -> Add Site
Adding a site will create some tracking code like:
<script type="text/javascript">
  var _paq = _paq || [];
  /* tracker methods like "setCustomDimension" should be called before "trackPageView" */
  _paq.push(['trackPageView']);
  _paq.push(['enableLinkTracking']);
  (function() {
    var u="//piwik.wikimedia.org/";
    _paq.push(['setTrackerUrl', u+'piwik.php']);
    _paq.push(['setSiteId', '19']);
    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
    g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s);
  })();
</script>
- Enable the piwik user to see the site (its password is on stat1007 under /home/nuria; this is a regular user, not an admin one)
- Done
Once the snippet is in place, visits will start coming in (reports are run once a day).
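The comment in the generated tracking code says that tracker methods such as setCustomDimension should be called before trackPageView. A minimal sketch of that ordering follows. The dimension ID 1 and the value 'logged-out' are assumptions for illustration only; a custom dimension with that ID would first have to be defined for the site in Matomo.

```javascript
// Same queue declaration as in the generated tracking code.
var _paq = _paq || [];

// Assumption: custom dimension 1 has been defined for this site in Matomo.
_paq.push(['setCustomDimension', 1, 'logged-out']);

// Queue the pageview only after the dimension is set, so the pageview request carries it.
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
```

Commands are processed in the order they were pushed, which is why the dimension has to come first.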
Rerun reports for all websites
php /usr/share/matomo/console core:archive --force-all-websites --force-all-periods=86400 (for websites that had visits in the last day)
Invalidate old reports
In the past, the daily archiver cron (responsible for generating daily/monthly/yearly stats for every Piwik domain) has skipped days of data, resulting in reports like https://phabricator.wikimedia.org/T188559 (data collected but not archived, so the graphs were flat). There is a quick way to force Piwik to rerun its archival process over past data, namely invalidating it:
elukey@bohrium:/var/log/matomo$ for el in {20..28}; do sudo -u www-data /usr/share/matomo/console core:invalidate-report-data --dates=2018-02-$el --sites=3; done
Invalidating day periods in 2018-02-20 [segment = ]...
Invalidating week periods in 2018-02-20 [segment = ]...
Invalidating month periods in 2018-02-20 [segment = ]...
Invalidating year periods in 2018-02-20 [segment = ]...
Invalidating day periods in 2018-02-21 [segment = ]...
Invalidating week periods in 2018-02-21 [segment = ]...
Invalidating month periods in 2018-02-21 [segment = ]...
Invalidating year periods in 2018-02-21 [segment = ]...
Invalidating day periods in 2018-02-22 [segment = ]...
[..cut..]
In this example, data from 20/02/2018 to 28/02/2018 was invalidated via Piwik's console for website ID 3 (currently iOS).
Tuning
We ran into an expected performance problem while tracking a larger website, which we fixed by following Piwik's advice on auto-archiving: http://piwik.org/docs/setup-auto-archiving/
The cron we set up with this technique is:
root@matomo1001:/var/log/apache2# crontab -u www-data -l
# HEADER: This file was autogenerated at 2017-05-05 12:01:17 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: piwik_archiver
MAILTO=analytics-alerts@wikimedia.org
0 8 * * * [ -e /usr/share/matomo/console ] && [ -x /usr/bin/php ] && nice /usr/bin/php /usr/share/matomo/console core:archive --url="piwik.wikimedia.org" >> /var/log/matomo/matomo-archive.log
Known outages
- Nov 23rd 2017: due to a Ganeti failure (more details in https://phabricator.wikimedia.org/T181121), the bohrium virtual machine (running Piwik and its MySQL database) was stopped ungracefully, which corrupted an InnoDB table. We had to restore the last MySQL backup, taken on Nov 22, so almost all the data for Nov 23 was not recorded.
- June 27th 2018: 8 hours of downtime to upgrade the Piwik database to the new version - T192298
- June 28th 2018: 1 hour of downtime to upgrade Piwik to Matomo
- October 4th 2018: 1 hour of downtime to move Matomo/Piwik from bohrium to matomo1001 (new host).
- October 5th 2018: 1 hour of downtime to fix a database issue.
- December 5th 2018: 30 mins of downtime to upgrade to 3.7.0