You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Difference between revisions of "Analytics/Systems/EventLogging/TestingOnBetaCluster"

From Wikitech-static
imported>FDans
(Corrected scap command)
imported>Quiddity
m (fixes)
 
(15 intermediate revisions by 8 users not shown)
Line 1: Line 1:
{{Notice|This documentation is outdated.  See [[Event_Platform/Instrumentation_How_To]].}}
The consumer side of event logging can be easily tested on Beta Cluster.
The consumer side of event logging can be easily tested on Beta Cluster.


Line 6: Line 9:
Note that this might change at any time but other than the instance the rest of the info on this document should apply regardless of the instance
Note that this might change at any time but other than the instance the rest of the info on this document should apply regardless of the instance


Note that you need <code>sudo</code> on this instance to see logs, any user trying to test stuff on Beta Cluster should ask for sudo on <code>deployment-eventlogging03.eqiad.wmflabs</code>.
Note that you need <code>sudo</code> on this instance to see logs, any user trying to test stuff on Beta Cluster should ask for sudo on <code>deployment-eventlog05</code>.
It is unfortunate that sudo is required but that is the state of affairs right now.
It is unfortunate that sudo is required but that is the state of affairs right now.


Line 12: Line 15:


=== How to log  a client-side event to Beta Cluster directly ===
=== How to log  a client-side event to Beta Cluster directly ===
Just hit the varnish endpoint on labs, for example:  
Just hit the varnish endpoint on labs for example:  
 
curl -A "WikipediaApp/22.0.22-alpha-2017-11-01 (Android 8.0.0; Phone) Alpha Channel" https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22pageTitleSource%22%3A%22Main%20Page%22%2C%22namespaceIdSource%22%3A0%2C%22pageIdSource%22%3A1%2C%22isAnon%22%3Atrue%2C%22popupEnabled%22%3Atrue%2C%22pageToken%22%3A%223ec574813fadb97b%22%2C%22sessionToken%22%3A%228460c2e4d547b250%22%2C%22previewCountBucket%22%3A%220%20previews%22%2C%22hovercardsSuppressedByGadget%22%3Afalse%2C%22action%22%3A%22pageLoaded%22%7D%2C%22revision%22%3A16364296%2C%22schema%22%3A%22Popups%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D
 
 


http://bits.beta.wmflabs.org/event.gif?%7B%22event%22%3A%7B%22mobileMode%22%3A%22stable%22%2C%22name%22%3A%22hamburger%22%7D%2C%22revision%22%3A10742159%2C%22schema%22%3A%22MobileWebUIClickTracking%22%2C%22webHost%22%3A%22en.m.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D;
https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22country%22%3A%22US%22%2C%22region%22%3A%22WA%22%2C%22anonymous%22%3Atrue%2C%22project%22%3A%22wikipedia%22%2C%22db%22%3A%22enwiki%22%2C%22uselang%22%3A%22en%22%2C%22device%22%3A%22desktop%22%2C%22debug%22%3Afalse%2C%22randomcampaign%22%3A0.8838892205730462%2C%22randombanner%22%3A0.7340400211496478%2C%22recordImpressionSampleRate%22%3A0.01%2C%22impressionEventSampleRate%22%3A1%2C%22status%22%3A%22banner_shown%22%2C%22statusCode%22%3A%226%22%2C%22campaign%22%3A%22CN%20browser%20tests%22%2C%22campaignCategory%22%3A%22CNbrowsertests%22%2C%22campaignCategoryUsesLegacy%22%3Afalse%2C%22bucket%22%3A0%2C%22banner%22%3A%22browser_test_b3%22%2C%22bannerCategory%22%3A%22CNbrowsertests%22%2C%22result%22%3A%22show%22%2C%22testIdentifiers%22%3A%22popupsUnknown%22%7D%2C%22revision%22%3A17995347%2C%22schema%22%3A%22CentralNoticeImpression%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D;


=== How to log via the website ===
=== How to log via the website ===
Line 25: Line 32:


== How to verify events ==
== How to verify events ==
You can tail the files in the <code>/srv/log/eventlogging</code> on <code>deployment-eventlogging03.eqiad.wmflabs</code> to verify if your event is coming through.
You can tail the files in the <code>/srv/log/eventlogging</code> on <code>deployment-eventlog05.deployment-prep.eqiad1.wikimedia.cloud</code> to verify if your event is coming through.
 
Unless noted otherwise, the files mentioned in this section and the subsections are in this directory.
Unless noted otherwise, the files mentioned in this section and the subsections are in this directory.


  ssh deployment-eventlogging03.eqiad.wmflabs
  ssh deployment-eventlog05.deployment-prep.eqiad1.wikimedia.cloud
  cd /srv/log/eventlogging
  cd /srv/log/eventlogging


=== Validated events ===
=== Validated events ===
* <code>all-events.log</code>: validated events appear in this file  
 
Tail this file while you use the website and emit server or client side events. If your events are valid your events should be there after a short while (seconds).
==== In MySQL ====
If they don't appear then check the next section.
All events in beta should be written to the MySQL <tt>log</tt> database hosted on the beta eventlogging server.
 
  ssh deployment-eventlog05.deployment-prep.eqiad1.wikimedia.cloud
  sudo mysql --skip-ssl log
  show tables;
  ...
 
==== In files ====
* <code>all-events.log</code>: schema-validated events that are inserted into MYSQL appear in this file (the all* in name is missleading).
 
Tail this file while you use the website and emit server or client side events. If your events are valid your events should be there after a short while (seconds). The contents of this file are of the eventlogging_valid_mixed topic in Kafka.
 
==== In Kafka ====
You can consume valid events directly from kafka:
 
  kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'
 
will list topics .
 
After, consume from your topic.  It should be named something like eventlogging_<schema>. You should be able to see your events if they are valid.
 
  kafkacat -C -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 -t eventlogging_NavigationTiming


=== Raw stream of events (including unvalidated events) ===
=== Raw stream of events (including unvalidated events) ===
* <code>client-side-events.log</code>: client side events appear in this file (valid and not)
* <code>client-side-events.log</code>: client side events appear in this file (valid and not)


If events do not appear they might not be valid, check <code>/var/log/upstart/</code> for either the client-side processor logs.
If events do not appear they might not be valid, check <code>/srv/log/eventlogging/systemd</code> and <code>tail -f</code> + <code>grep</code> the following:
  eventlogging_processor-client-side-events.log  
  eventlogging-processor@client-side-XX.log
 


Validation errors will appear on those logs and they are very descriptive.
Validation errors will appear on those logs and they are very descriptive. '''Note''': you may see a <code>-00</code> log and a <code>-01</code> log, which exist for parallelization and you should monitor both.


=== Where is eventlogging code? ===
=== Where is eventlogging code? ===


   /srv/deployment/eventlogging/analytics/eventlogging
   /srv/deployment/eventlogging/analytics/eventlogging
=== See all EventLogging schema Kafka topics ===
  kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'
All event logging topics for which valid events are being sent should be present here


== Database ==
== Database ==
The mysql server is storing events just like it is in production, in order to see events you can use the eventlogging user whose user and password  
In order to see events you can use the eventlogging user whose user and password are listed at:
are listed at:


  /etc/eventlogging.d/consumers/mysql-m4-master
  /etc/eventlogging.d/consumers/mysql-m4-master
Line 57: Line 90:
If you have sudo on the machine the mysql password for the root user is 'secret', otherwise:
If you have sudo on the machine the mysql password for the root user is 'secret', otherwise:


  mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1
  mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1 --skip-ssl
  (it's labs, the password is not really a secret.)
  (it's labs, the password is not really a secret.)


If mysql needs a re-start:
If mysql needs a re-start:


  /etc/init.d/mysql start
  systemctl restart mysql


The mysql setup on beta leaves much to be desired, if mysql does start check /var/log/mysql.err


TokuDB shoudl be enabled by default, otherwise try:
This might be of help: [http://www.chriscalender.com/disabling-transparent-hugepages-for-tokudb/]
  /etc/init.d/mysql start--default-storage-engine=tokudb


The mysql setup on beta leaves much to be desired, if mysql does start check /var/log/mysql.err
Please also keep in mind that the eventlogging_cleaner.py script runs periodically to purge/sanitize records according to its whitelist as it happens in production. Some useful commands:<syntaxhighlight lang="bash">
elukey@deployment-eventlog05:~$ systemctl list-timers | grep sanitization
Wed 2018-10-24 11:00:00 UTC  20h left      Tue 2018-10-23 11:00:14 UTC  3h 19min ago eventlogging_db_sanitization.timer eventlogging_db_sanitization.service
elukey@deployment-eventlog05:~$ systemctl cat eventlogging_db_sanitization.service
# /lib/systemd/system/eventlogging_db_sanitization.service
[Unit]
Description=Apply Analytics data retetion policies to the Eventlogging database


This might be of help: [http://www.chriscalender.com/disabling-transparent-hugepages-for-tokudb/]
[Service]
User=eventlogcleaner
ExecStart=/usr/local/bin/eventlogging_cleaner ...[cut]...
</syntaxhighlight>Two notable things:
* --no-whitelist-sanity-check is not used in production but only in beta.
* The systemd timer updates /srv/eventlogging/eventlogging_cleaner with the timestamp of the next day to start sanitizing from during the next run (that happens once a day).


== Admin ==
== Admin ==
Line 77: Line 120:
=== Give people access ===
=== Give people access ===
Add them to the lists on these wikis (you need to be an admin to do that)
Add them to the lists on these wikis (you need to be an admin to do that)
Asking in #wikimedia-labs might be a way to get help.
Asking in {{irc|wikimedia-cloud}} might be a way to get help.


[[Nova_Resource:Deployment-prep]]
[[Nova_Resource:Deployment-prep]]
Line 85: Line 128:
=== How to deploy code ===
=== How to deploy code ===


<source lang="bash">
<syntaxhighlight lang="bash">
# Log into the beta deploy server
# Log into the beta deploy server
ssh deployment-tin.deployment-prep.eqiad.wmflabs
ssh deployment-tin.deployment-prep.eqiad1.wikimedia.cloud


# cd to the EventLogging analytics deploy source
# cd to the EventLogging analytics deploy source
Line 94: Line 137:
# Deploy using scap3 in the beta environment
# Deploy using scap3 in the beta environment
scap deploy -e beta
scap deploy -e beta
</source>
</syntaxhighlight>


You can run puppet with
You can run puppet with
Line 103: Line 146:
Check:
Check:


   /etc/init/eventlogging/init.conf
   sudo eventloggingctl status
 


Run:
Run:


   sudo eventloggingctl restart
   sudo eventloggingctl restart
 


Stop completely:
Stop completely:
   sudo  eventloggingctl stop
   sudo  eventloggingctl stop
The config applied to create logs and such by upstart is at:
/etc/eventlogging.d/consumers/
== Kafka ==
If you're testing Kafka stuff on the beta cluster, you'll need a zookeeper.  You can pass <code>--zookeeper deployment-zookeeper01:2181/kafka/deployment-kafka</code>.  Or you can just do <code>export ZOOKEEPER_URL=deployment-zookeeper01:2181/kafka/deployment-kafka</code>

Latest revision as of 18:39, 4 September 2021


The consumer side of event logging can be easily tested on Beta Cluster.

Instance

The instance name is configured here: https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings-labs.php This might change at any time, but apart from the instance name, the rest of this document should apply regardless of which instance is in use.

You need sudo on this instance to see the logs, so anyone who wants to test on Beta Cluster should ask for sudo on deployment-eventlog05. It is unfortunate that sudo is required, but that is the state of affairs right now.

How to create test events

How to log a client-side event to Beta Cluster directly

Just hit the Varnish endpoint on labs, for example:

curl -A "WikipediaApp/22.0.22-alpha-2017-11-01 (Android 8.0.0; Phone) Alpha Channel" 'https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22pageTitleSource%22%3A%22Main%20Page%22%2C%22namespaceIdSource%22%3A0%2C%22pageIdSource%22%3A1%2C%22isAnon%22%3Atrue%2C%22popupEnabled%22%3Atrue%2C%22pageToken%22%3A%223ec574813fadb97b%22%2C%22sessionToken%22%3A%228460c2e4d547b250%22%2C%22previewCountBucket%22%3A%220%20previews%22%2C%22hovercardsSuppressedByGadget%22%3Afalse%2C%22action%22%3A%22pageLoaded%22%7D%2C%22revision%22%3A16364296%2C%22schema%22%3A%22Popups%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D'


https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22country%22%3A%22US%22%2C%22region%22%3A%22WA%22%2C%22anonymous%22%3Atrue%2C%22project%22%3A%22wikipedia%22%2C%22db%22%3A%22enwiki%22%2C%22uselang%22%3A%22en%22%2C%22device%22%3A%22desktop%22%2C%22debug%22%3Afalse%2C%22randomcampaign%22%3A0.8838892205730462%2C%22randombanner%22%3A0.7340400211496478%2C%22recordImpressionSampleRate%22%3A0.01%2C%22impressionEventSampleRate%22%3A1%2C%22status%22%3A%22banner_shown%22%2C%22statusCode%22%3A%226%22%2C%22campaign%22%3A%22CN%20browser%20tests%22%2C%22campaignCategory%22%3A%22CNbrowsertests%22%2C%22campaignCategoryUsesLegacy%22%3Afalse%2C%22bucket%22%3A0%2C%22banner%22%3A%22browser_test_b3%22%2C%22bannerCategory%22%3A%22CNbrowsertests%22%2C%22result%22%3A%22show%22%2C%22testIdentifiers%22%3A%22popupsUnknown%22%7D%2C%22revision%22%3A17995347%2C%22schema%22%3A%22CentralNoticeImpression%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D;
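The query string in the examples above is the percent-encoded JSON of the event capsule. The following is a minimal bash sketch of how such a URL could be built by hand; the helper names urlencode and beacon_url are illustrative and not part of EventLogging, and the encoder assumes ASCII-only input.

```shell
#!/usr/bin/env bash
# Percent-encode an ASCII string, leaving only unreserved characters as-is.
# (Hypothetical helper, not part of the EventLogging codebase.)
urlencode() {
  local input="$1" out="" c hex i
  for (( i = 0; i < ${#input}; i++ )); do
    c="${input:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *) printf -v hex '%%%02X' "'$c"   # "'c" makes printf use the character code
         out+="$hex" ;;
    esac
  done
  printf '%s\n' "$out"
}

# Build the Beta Cluster beacon URL for a JSON event capsule given as $1.
beacon_url() {
  printf 'https://en.wikipedia.beta.wmflabs.org/beacon/event?%s\n' "$(urlencode "$1")"
}

# Usage (commented out so that sourcing this file sends nothing):
#   payload='{"event":{...},"revision":16364296,"schema":"Popups","webHost":"en.wikipedia.beta.wmflabs.org","wiki":"enwiki"}'
#   curl "$(beacon_url "$payload")"
```

The event object still has to contain every field the schema requires, otherwise the event will only show up in the raw stream and not among the validated events.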

How to log via the website

For example, use http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page to create events on the mobile site.

How to load test with a bunch of events

There's a script that may be handy. It's in the same eventlogging codebase:

https://github.com/wikimedia/eventlogging/blob/master/bin/eventlogging-load-tester

How to verify events

You can tail the files in /srv/log/eventlogging on deployment-eventlog05.deployment-prep.eqiad1.wikimedia.cloud to verify that your events are coming through.

Unless noted otherwise, the files mentioned in this section and the subsections are in this directory.

ssh deployment-eventlog05.deployment-prep.eqiad1.wikimedia.cloud
cd /srv/log/eventlogging

Validated events

In MySQL

All events in beta should be written to the MySQL log database hosted on the beta eventlogging server.

 ssh deployment-eventlog05.deployment-prep.eqiad1.wikimedia.cloud
 sudo mysql --skip-ssl log
 show tables;
 ...

In files

  • all-events.log: schema-validated events that are inserted into MySQL appear in this file (the all* in the name is misleading).

Tail this file while you use the website and emit server-side or client-side events. If your events are valid, they should appear there after a short while (seconds). This file contains the contents of the eventlogging_valid_mixed topic in Kafka.
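When several schemas are emitting at once, it can help to narrow the tail down to one schema. A minimal sketch, assuming each line of the log is a JSON object carrying a "schema" key as in the event capsule shown earlier; filter_schema is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print only the stdin lines whose JSON mentions the given schema name ($1).
# Assumes one JSON event per line with a "schema": "<name>" pair.
filter_schema() {
  grep -E --line-buffered "\"schema\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage on the eventlogging host (commented out, it blocks until interrupted):
#   sudo tail -f /srv/log/eventlogging/all-events.log | filter_schema Popups
```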

In Kafka

You can consume valid events directly from Kafka. First, list the EventLogging topics:

 kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'

Then consume from your topic, which should be named something like eventlogging_<schema>. If your events are valid, you should see them there.

 kafkacat -C -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 -t eventlogging_NavigationTiming
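The plain consumer above keeps running until interrupted. As a sketch, the two hypothetical helpers below derive the topic name from a schema name and print only the most recent messages before exiting; they rely on kafkacat's -o (start offset, negative values are relative to the end, per partition) and -e (exit at end of partition) options.

```shell
#!/usr/bin/env bash
# Broker taken from the commands above.
BROKER="deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092"

# Map a schema name to its topic, following the eventlogging_<schema> convention.
topic_for_schema() {
  printf 'eventlogging_%s\n' "$1"
}

# Print the last N messages per partition (default 5) for schema $1, then exit.
last_events() {
  kafkacat -C -b "$BROKER" -t "$(topic_for_schema "$1")" -o "-${2:-5}" -e
}

# Usage (commented out, needs network access to the broker):
#   last_events NavigationTiming 10
```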

Raw stream of events (including unvalidated events)

  • client-side-events.log: client-side events appear in this file (both valid and invalid)

If events do not appear, they might not be valid. Go to /srv/log/eventlogging/systemd and use tail -f and grep on the following:

eventlogging-processor@client-side-XX.log

Validation errors appear in those logs and are very descriptive. Note: you may see a -00 log and a -01 log, which exist for parallelization; you should monitor both.
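Since there can be more than one processor log, a small helper can collect them all so that none is missed. This is a sketch only: processor_logs is a hypothetical name, the directory default is the one given above, and the grep pattern in the usage line (a schema name) is an assumption about what the validation messages contain.

```shell
#!/usr/bin/env bash
# Default directory for the processor logs, as described above.
LOG_DIR="${LOG_DIR:-/srv/log/eventlogging/systemd}"

# List every client-side processor log (-00, -01, ...) in directory $1 (default LOG_DIR).
processor_logs() {
  local f
  for f in "${1:-$LOG_DIR}"/eventlogging-processor@client-side-*.log; do
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# Usage from a root shell on the eventlogging host (commented out, it blocks):
#   tail -F $(processor_logs) | grep --line-buffered Popups
```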

Where is eventlogging code?

 /srv/deployment/eventlogging/analytics/eventlogging


See all EventLogging schema Kafka topics

 kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'

All EventLogging topics to which valid events are being sent should be listed here.

Database

To see events, you can use the eventlogging user, whose username and password are listed at:

/etc/eventlogging.d/consumers/mysql-m4-master

If you have sudo on the machine, the mysql password for the root user is 'secret'; otherwise:

mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1 --skip-ssl
(it's labs, the password is not really a secret.)
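Once connected, a quick way to confirm that events are arriving is to count the rows of the table for your schema. The sketch below assumes that tables in the log database are named <schema>_<revision> (for example Popups_16364296); verify the actual name with show tables first. count_query is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Build a row-count query for schema $1 at revision $2.
# Assumption: the table is named <schema>_<revision>; check with "show tables;".
count_query() {
  printf 'SELECT COUNT(*) FROM `%s_%s`;\n' "$1" "$2"
}

# Usage on the eventlogging host (commented out, needs the database):
#   sudo mysql --skip-ssl log -e "$(count_query Popups 16364296)"
```

Running the query before and after sending a test event should show the count increase if the event was valid.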

If mysql needs a restart:

systemctl restart mysql

The mysql setup on beta leaves much to be desired. If mysql does not start, check /var/log/mysql.err

This might be of help: http://www.chriscalender.com/disabling-transparent-hugepages-for-tokudb/

Please also keep in mind that the eventlogging_cleaner.py script runs periodically to purge/sanitize records according to its whitelist, as happens in production. Some useful commands:

elukey@deployment-eventlog05:~$ systemctl list-timers | grep sanitization
Wed 2018-10-24 11:00:00 UTC  20h left      Tue 2018-10-23 11:00:14 UTC  3h 19min ago eventlogging_db_sanitization.timer eventlogging_db_sanitization.service
elukey@deployment-eventlog05:~$ systemctl cat eventlogging_db_sanitization.service
# /lib/systemd/system/eventlogging_db_sanitization.service
[Unit]
Description=Apply Analytics data retetion policies to the Eventlogging database

[Service]
User=eventlogcleaner
ExecStart=/usr/local/bin/eventlogging_cleaner ...[cut]...

Two notable things:

  • --no-whitelist-sanity-check is not used in production but only in beta.
  • The systemd timer updates /srv/eventlogging/eventlogging_cleaner with the timestamp of the next day to start sanitizing from during the next run (that happens once a day).

Admin

Give people access

Add them to the lists on these wikis (you need to be an admin to do that). Asking in #wikimedia-cloud on IRC might be a way to get help.

Nova_Resource:Deployment-prep

Special:NovaProject -> add users to deployment-prep

How to deploy code

# Log into the beta deploy server
ssh deployment-tin.deployment-prep.eqiad1.wikimedia.cloud

# cd to the EventLogging analytics deploy source
cd /srv/deployment/eventlogging/analytics

# Deploy using scap3 in the beta environment
scap deploy -e beta

You can run puppet with:

puppet agent -tv

Restart EventLogging

Check:

 sudo eventloggingctl status

Run:

 sudo eventloggingctl restart

Stop completely:

 sudo  eventloggingctl stop