Analytics/Systems/Cluster/Camus: Difference between revisions
imported>Neil P. Quinn-WMF (Link to ticket for evaluation of replacement component) |
imported>Elukey |
Revision as of 07:21, 27 November 2020
This information is for members of the Analytics team.
Analytics Production Camus jobs are launched via hdfs user cron on analytics1003 (check site.pp in Puppet first for the role camus).
As of September 2020, possible replacements for Camus are being evaluated (T238400).
How to stop Camus
The quickest way is to ssh to an-launcher1002 and check/stop systemd timers:
ssh an-launcher1002.eqiad.wmnet
sudo systemctl list-timers | grep camus
Then disable puppet (requires SRE/root permissions) and:
sudo systemctl stop camus-webrequest
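The steps above can be sketched as a small helper that pulls the Camus timer names out of the `systemctl list-timers` output, so that each one can be stopped in turn. This is only a sketch: the function name `camus_timer_units` is hypothetical, and it assumes the timer units are named `camus-<run-type>.timer`, which should be checked against the real output on the host.

```shell
#!/bin/sh
# Print the unique Camus timer unit names found on stdin.
# Assumption: timer units follow the pattern camus-<run-type>.timer.
camus_timer_units() {
    grep -o 'camus-[A-Za-z0-9_-]*\.timer' | sort -u
}

# Example usage on the launcher host (after Puppet has been disabled):
#   sudo systemctl list-timers --all | camus_timer_units
#   for unit in $(sudo systemctl list-timers --all | camus_timer_units); do
#       sudo systemctl stop "$unit"
#   done
```

Stopping the timer prevents new runs from being scheduled; it does not by itself interrupt a run that is already in progress.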
Check Camus Production logs
- ssh to an-launcher1002
- logs are stored in /var/log/camus, one (rotated) file per camus run-type (as of today: webrequest, eventlogging, mediawiki and eventbus)
- Both camus output and camus-partition-checker output are logged in those files.
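To find the current log for one run-type among the rotated files, something like the following could be used. It is a sketch under assumptions: the helper name `latest_camus_log` is hypothetical, and it assumes the run-type name appears somewhere in the log file name, which this page does not confirm.

```shell
#!/bin/sh
# Print the most recently modified log file for one Camus run-type.
# $1: run-type (for example webrequest)
# $2: log directory (defaults to /var/log/camus)
# Assumption: the run-type name is part of the log file name.
latest_camus_log() {
    run_type=$1
    log_dir=${2:-/var/log/camus}
    # ls -t sorts newest first; errors are hidden when nothing matches.
    ls -t "$log_dir"/*"$run_type"* 2>/dev/null | head -n 1
}

# Example usage:
#   tail -n 100 "$(latest_camus_log webrequest)"
```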
How to produce to kafka
cat test_message.txt | kafkacat -b kafka-jumbo1001.eqiad.wmnet:9092 -t test
Test message is a file like:
{"id":123456,"name":"pepito perez", "muchoStuff":{"a": "1"}}
{"id":123456,"name":"pepito perez", "muchoStuff":{"a": "2"}}
{"id":123456,"name":"pepito perez", "muchoStuff":{"a": "3"}}
{"id":123456,"name":"pepito perez", "muchoStuff":{"a": "4"}}
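A test file of this shape can be generated rather than typed by hand. The sketch below writes one JSON message per line, matching the example above; the function name `make_test_messages` is hypothetical and the field values are simply copied from the sample.

```shell
#!/bin/sh
# Print $1 test messages, one JSON object per line,
# varying only the "a" field as in the sample file.
make_test_messages() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf '{"id":123456,"name":"pepito perez", "muchoStuff":{"a": "%s"}}\n' "$i"
        i=$((i + 1))
    done
}

# Example usage:
#   make_test_messages 4 > test_message.txt
#   cat test_message.txt | kafkacat -b kafka-jumbo1001.eqiad.wmnet:9092 -t test
```

Keeping one message per line matters because kafkacat, when producing from standard input, treats each line as a separate message by default.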
How to validate your data against your avro schema
We have found that the PHP Avro bindings behave differently from the Java ones, so please validate messages using this Java jar:
java -jar avro-tools-1.7.6.jar jsontofrag --schema-file CirrusSearchRequestSet.avsc searchmessage.json
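When several sample messages need checking, the same command can be wrapped in a loop. This is a sketch only: the names `run_avro_tools` and `validate_json_files` are hypothetical, the jar is assumed to be in the current directory, and it assumes avro-tools exits with a non-zero status when a message does not match the schema.

```shell
#!/bin/sh
# Thin wrapper so the jar location is defined in one place.
# Assumption: avro-tools-1.7.6.jar is in the current directory.
run_avro_tools() {
    java -jar avro-tools-1.7.6.jar "$@"
}

# Validate each JSON file against the schema given as $1.
# Prints OK or FAIL per file and returns non-zero if any file failed.
# Assumption: avro-tools exits non-zero on invalid input.
validate_json_files() {
    schema=$1
    shift
    failed=0
    for json_file in "$@"; do
        if run_avro_tools jsontofrag --schema-file "$schema" "$json_file" >/dev/null 2>&1; then
            echo "OK   $json_file"
        else
            echo "FAIL $json_file"
            failed=1
        fi
    done
    return "$failed"
}

# Example usage:
#   validate_json_files CirrusSearchRequestSet.avsc searchmessage.json
```

The tool's own output is discarded here to keep the summary short; rerun the single-file command above on any failing file to see the actual error.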
How to run camus job to decode avro from kafka topic
The Camus jar contains the MapReduce job itself and also some of the code the job depends on, so it appears twice in the command below: once as the jar to run and once in the library lists.
Note that you need a local properties file to pass to Camus; see the -P /home/user/avro-kafka/camus.avro.json.properties argument below.
"Real" properties files live on puppet: [1]
#!/bin/sh
export LIBJARS=/home/user/avro-kafka/camus-wmf-0.1.0-wmf6.jar,/home/user/avro-kafka/camus-etl-kafka-0.1.0-wmf6.jar,/home/user/avro-kafka/camus-api-0.1.0-wmf6.jar,/home/user/avro-kafka/camus-kafka-coders-0.1.0-wmf6.jar,/home/user/avro-kafka/camus-schema-registry-0.1.0-wmf6.jar,/home/user/avro-kafka/camus-parent-0.1.0-wmf6-tests.jar,/home/user/avro-kafka/refinery-camus-0.0.20-SNAPSHOT.jar
export HADOOP_CLASSPATH=/home/user/avro-kafka/camus-wmf-0.1.0-wmf6.jar:/home/user/avro-kafka/camus-etl-kafka-0.1.0-wmf6.jar:/home/user/avro-kafka/camus-api-0.1.0-wmf6.jar:/home/user/avro-kafka/camus-kafka-coders-0.1.0-wmf6.jar:/home/user/avro-kafka/camus-schema-registry-0.1.0-wmf6.jar:/home/user/avro-kafka/camus-parent-0.1.0-wmf6-tests.jar:/home/user/avro-kafka/refinery-camus-0.0.20-SNAPSHOT.jar
/usr/bin/hadoop jar /home/user/avro-kafka/camus-wmf-0.1.0-wmf6.jar com.linkedin.camus.etl.kafka.CamusJob -libjars ${LIBJARS} -Dcamus.job.name="some_avro_test" -P /home/user/avro-kafka/camus.avro.json.properties >> ./log_camus_avro_test.txt 2>&1
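The script lists the same jars twice, comma-separated for -libjars and colon-separated for HADOOP_CLASSPATH. One way to keep the two lists in sync is to build both from a single list, as sketched below. The helper name `join_with` is hypothetical and is not part of Camus or Hadoop.

```shell
#!/bin/sh
# Join the remaining arguments using the separator given as $1.
join_with() {
    sep=$1
    shift
    first=1
    result=
    for item in "$@"; do
        if [ "$first" -eq 1 ]; then
            result=$item
            first=0
        else
            result=$result$sep$item
        fi
    done
    printf '%s\n' "$result"
}

# Example usage (only two of the jars shown):
#   set -- /home/user/avro-kafka/camus-wmf-0.1.0-wmf6.jar \
#          /home/user/avro-kafka/camus-etl-kafka-0.1.0-wmf6.jar
#   export LIBJARS=$(join_with , "$@")
#   export HADOOP_CLASSPATH=$(join_with : "$@")
```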