You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org
Analytics/Systems/EventLogging/TestingOnBetaCluster: Difference between revisions
imported>Nuria |
imported>Elukey |
Revision as of 14:20, 23 October 2018
The consumer side of event logging can be easily tested on Beta Cluster.
Instance
The instance name is configured here: https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings-labs.php Note that this might change at any time, but apart from the instance name, the rest of the information in this document should apply regardless of the instance.
Note that you need sudo on this instance to see logs. Anyone who wants to test on Beta Cluster should ask for sudo on deployment-eventlog05.
It is unfortunate that sudo is required, but that is the state of affairs right now.
How to create test events
How to log a client-side event to Beta Cluster directly
Just hit the Varnish endpoint on labs, for example:
curl -A "WikipediaApp/22.0.22-alpha-2017-11-01 (Android 8.0.0; Phone) Alpha Channel" 'https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22pageTitleSource%22%3A%22Main%20Page%22%2C%22namespaceIdSource%22%3A0%2C%22pageIdSource%22%3A1%2C%22isAnon%22%3Atrue%2C%22popupEnabled%22%3Atrue%2C%22pageToken%22%3A%223ec574813fadb97b%22%2C%22sessionToken%22%3A%228460c2e4d547b250%22%2C%22previewCountBucket%22%3A%220%20previews%22%2C%22hovercardsSuppressedByGadget%22%3Afalse%2C%22action%22%3A%22pageLoaded%22%7D%2C%22revision%22%3A16364296%2C%22schema%22%3A%22Popups%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D'
https://deployment.wikimedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22country%22%3A%22US%22%2C%22region%22%3A%22WA%22%2C%22anonymous%22%3Atrue%2C%22project%22%3A%22wikipedia%22%2C%22db%22%3A%22deploymentwiki%22%2C%22uselang%22%3A%22en%22%2C%22device%22%3A%22desktop%22%2C%22debug%22%3Afalse%2C%22randomcampaign%22%3A0.8838892205730462%2C%22randombanner%22%3A0.7340400211496478%2C%22recordImpressionSampleRate%22%3A0.01%2C%22impressionEventSampleRate%22%3A1%2C%22status%22%3A%22banner_shown%22%2C%22statusCode%22%3A%226%22%2C%22campaign%22%3A%22CN%20browser%20tests%22%2C%22campaignCategory%22%3A%22CNbrowsertests%22%2C%22campaignCategoryUsesLegacy%22%3Afalse%2C%22bucket%22%3A0%2C%22banner%22%3A%22browser_test_b3%22%2C%22bannerCategory%22%3A%22CNbrowsertests%22%2C%22result%22%3A%22show%22%2C%22testIdentifiers%22%3A%22popupsUnknown%22%7D%2C%22revision%22%3A17995347%2C%22schema%22%3A%22CentralNoticeImpression%22%2C%22webHost%22%3A%22deployment.wikimedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22deploymentwiki%22%7D;
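The query string in the URLs above is the percent-encoded JSON of the event capsule. As a minimal sketch, a test URL could be assembled as follows. The urlencode and beacon_url helpers are hypothetical names introduced here for illustration (they are not part of EventLogging), and the encoder only handles ASCII input.

```shell
# Percent-encode an ASCII string, one character at a time (plain POSIX sh).
# Hypothetical helper for illustration; not part of EventLogging.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;                # unreserved characters are kept
      *) out="$out$(printf '%%%02X' "'$c")" ;;        # everything else becomes %XX
    esac
    s=$rest
  done
  printf '%s' "$out"
}

# Build a beacon URL from a host name and a JSON event capsule.
beacon_url() {
  printf 'https://%s/beacon/event?%s' "$1" "$(urlencode "$2")"
}

# Usage (not run here), with $json holding the full event capsule:
#   curl "$(beacon_url en.wikipedia.beta.wmflabs.org "$json")"
```

Quoting the resulting URL matters because the query string starts with a question mark, which some shells would otherwise treat as a glob character.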
How to log via the website
For example, use http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page to create events on mobile.
How to load test with a bunch of events
There's a script that may be handy. It's in the same eventlogging codebase:
https://github.com/wikimedia/eventlogging/blob/master/bin/eventlogging-load-tester
How to verify events
You can tail the files in /srv/log/eventlogging on deployment-eventlog05.deployment-prep.eqiad.wmflabs to verify that your event is coming through.
Unless noted otherwise, the files mentioned in this section and the subsections are in this directory.
ssh deployment-eventlog05.eqiad.wmflabs
cd /srv/log/eventlogging
Validated events
all-events.log
: schema-validated events that are inserted into MySQL appear in this file (the all* in the name is misleading). As of 2018/09 most events are sent to Hadoop by default and thus will not appear here; schemas that were sent to MySQL prior to 2018/09 continue to be.
Tail this file while you use the website and emit server-side or client-side events. If your events are valid, they should appear there after a short while (seconds). This file holds the contents of the eventlogging_valid_mix topic in Kafka.
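When tailing, it helps to filter for the schema you are testing. The following is a sketch, assuming each line of all-events.log is a JSON object carrying a "schema" field like the capsule sent to the beacon. events_for_schema is a hypothetical helper name.

```shell
# Print only the events for one schema from a stream of JSON lines on stdin.
# Assumes each line contains a "schema": "<name>" field (with or without a space).
events_for_schema() {
  grep -E "\"schema\": *\"$1\""
}

# Live usage on the instance (not run here):
#   sudo tail -f /srv/log/eventlogging/all-events.log | events_for_schema Popups
```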
If events are not being stored in MySQL, you need to consume them directly from Kafka:
kafka-tools -b deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092 print_topics
This lists the available topics.
After that, consume from eventlogging_<schema> (for example with kafkacat, or with kafka-tools as in the command that follows). You should be able to see your events if they are valid. Note that eventlogging_<schema> topics in Kafka are only used by the Hadoop pipeline, not the MySQL pipeline.
kafka-tools -b deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092 consume_topic eventlogging_VirtualPageView
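The text mentions kafkacat, while the command above uses kafka-tools. A kafkacat invocation might look like the following sketch. The broker is the one used above; topic_for_schema and consume_schema are hypothetical helper names, and it is an assumption that kafkacat is installed on the instance.

```shell
BROKER=deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092

# Map a schema name to its Kafka topic, following the eventlogging_<schema> convention.
topic_for_schema() {
  printf 'eventlogging_%s' "$1"
}

# Consume messages for a schema with kafkacat:
#   -C      consumer mode
#   -b      broker to connect to
#   -t      topic to read
#   -o end  start at the end of the topic, so only new events are printed
consume_schema() {
  kafkacat -C -b "$BROKER" -t "$(topic_for_schema "$1")" -o end
}

# Usage (not run here):
#   consume_schema VirtualPageView
```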
Raw stream of events (including unvalidated events)
client-side-events.log
: client side events appear in this file (valid and not)
If events do not appear they might not be valid. Check /srv/log/eventlogging/systemd and tail -f + grep the following:
eventlogging-processor@client-side-XX.log
Validation errors will appear in those logs and they are very descriptive. Note: you may see a -00 log and a -01 log, which exist for parallelization; you should monitor both.
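Since there can be several processor logs, it is convenient to search them together. This is a sketch only: processor_log_matches is a hypothetical helper name, and the pattern to search for (here a schema name) is up to you, as the exact format of the validation errors is not shown in this document.

```shell
# Search every processor log in a directory for a pattern (e.g. a schema name).
# $1 is the log directory, $2 the pattern.
# -h drops the file-name prefix so the output reads like a single log.
processor_log_matches() {
  grep -h -- "$2" "$1"/eventlogging-processor@client-side-*.log
}

# Live usage on the instance (not run here). The glob is expanded inside the
# sudo shell in case the directory is not readable by your user; -q suppresses
# the per-file headers (GNU tail):
#   sudo sh -c 'tail -q -f /srv/log/eventlogging/systemd/eventlogging-processor@client-side-*.log' | grep Popups
```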
Where is eventlogging code?
/srv/deployment/eventlogging/analytics/eventlogging
See all kafka topics
kafka-tools -b deployment-kafka-jumbo-2.deployment-prep.eqiad.wmflabs:9092 print_topics
All EventLogging topics for which valid events are being sent should be listed here.
Database
The MySQL server stores events just as it does in production. To see events, you can use the eventlogging user, whose username and password are listed at:
/etc/eventlogging.d/consumers/mysql-m4-master
If you have sudo on the machine, the MySQL password for the root user is 'secret'. Otherwise (it's labs, so the password is not really a secret):
mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1 --skip-ssl
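Once connected, a quick row count tells you whether your events reached the database. The sketch below rests on two assumptions that are not confirmed by this page: that events live in a database called log, and that each schema revision gets a table named <Schema>_<revision>. table_for and count_query are hypothetical helper names.

```shell
# Table name for a schema revision, ASSUMING the <Schema>_<revision> convention.
table_for() {
  printf '%s_%s' "$1" "$2"
}

# SQL statement that counts the rows stored for that schema revision.
count_query() {
  printf 'SELECT COUNT(*) FROM `%s`;' "$(table_for "$1" "$2")"
}

# Usage (not run here); -p prompts for the password, and "log" is the
# assumed database name:
#   mysql -h 127.0.0.1 --user=eventlogging -p --skip-ssl log -e "$(count_query Popups 16364296)"
```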
If MySQL needs a restart:
systemctl restart mysql
The MySQL setup on beta leaves much to be desired. If MySQL does not start, check /var/log/mysql.err
This might be of help: [1]
Please also keep in mind that the eventlogging_cleaner.py script runs periodically to purge/sanitize records according to its whitelist, as happens in production. Some useful commands:
elukey@deployment-eventlog05:~$ systemctl list-timers | grep sanitization
Wed 2018-10-24 11:00:00 UTC 20h left Tue 2018-10-23 11:00:14 UTC 3h 19min ago eventlogging_db_sanitization.timer eventlogging_db_sanitization.service
elukey@deployment-eventlog05:~$ systemctl cat eventlogging_db_sanitization.service
# /lib/systemd/system/eventlogging_db_sanitization.service
[Unit]
Description=Apply Analytics data retetion policies to the Eventlogging database
[Service]
User=eventlogcleaner
ExecStart=/usr/local/bin/eventlogging_cleaner ...[cut]...
Two notable things:
- --no-whitelist-sanity-check is not used in production but only in beta.
- The systemd timer updates /srv/eventlogging/eventlogging_cleaner with the timestamp of the next day to start sanitizing from during the next run (that happens once a day).
Admin
Give people access
Add them to the lists on these wikis (you need to be an admin to do that). Asking in #wikimedia-cloud might be a way to get help.
Special:NovaProject -> add users to deployment-prep
How to deploy code
# Log into the beta deploy server
ssh deployment-tin.deployment-prep.eqiad.wmflabs
# cd to the EventLogging analytics deploy source
cd /srv/deployment/eventlogging/analytics
# Deploy using scap3 in the beta environment
scap deploy -e beta
You can run puppet with
puppet agent -tv
Restart EventLogging
Check:
sudo eventloggingctl status
Run:
sudo eventloggingctl restart
Stop completely:
sudo eventloggingctl stop
Kafka
If you're testing Kafka stuff on the beta cluster, you'll need a ZooKeeper. You can pass --zookeeper deployment-zookeeper02:2181/kafka/deployment-kafka. Or you can just do export ZOOKEEPER_URL=deployment-zookeeper02:2181/kafka/deployment-kafka