Analytics/EventLogging/Performance
Put any performance-related EventLogging information on this page.
 
== Event Logging load test ==
 
Setup:
* using a beefy if not trusty Cisco box (analytics1004.eqiad.wmnet as of this writing)
* 31492 events, 4 schemas, split 3/3/3/1, generated by the wonderful test_load.py
* format: <code>seqId[space]?[qson event]</code> (see the sketch after this list)
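The real <code>test_load.py</code> isn't reproduced here, but a minimal sketch of producing data in that shape, assuming made-up schema names and event fields, could look like this:

<syntaxhighlight lang="python">
# Minimal sketch, NOT the actual test_load.py: write lines of the form
# "<seqId> ?<url-encoded JSON event>", split 3/3/3/1 across four schemas.
import json
import random
import urllib.parse

SCHEMAS = ['SchemaA', 'SchemaB', 'SchemaC', 'SchemaD']  # hypothetical names
WEIGHTS = [3, 3, 3, 1]

def make_event(schema):
    # Illustrative capsule shape only; real events carry more fields.
    return {'schema': schema, 'revision': 1,
            'event': {'value': random.randint(0, 100)}}

with open('load.test.seq.30k', 'w') as f:
    for seq_id in range(1, 31493):
        schema = random.choices(SCHEMAS, weights=WEIGHTS)[0]
        payload = urllib.parse.quote(json.dumps(make_event(schema)))
        f.write('%d ?%s\n' % (seq_id, payload))
</syntaxhighlight>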
 
Processor
* average of 4 runs: 17.6478 seconds, so ~'''1784''' events per second
* command: <code>time cat /home/milimetric/load.test.seq.30k | python ./eventlogging-processor "%{seqId}d %q" stdin:// file:///home/milimetric/out.load.30k</code>
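The per-second figures here and below follow from simple arithmetic (e.g. 31492 / 17.6478 ≈ 1784). A throwaway harness along these lines, assuming the same paths as the command above, reproduces the averaging:

<syntaxhighlight lang="python">
# Hedged sketch: run the pipeline four times and turn the mean wall-clock
# time into an events-per-second figure.
import subprocess
import time

CMD = ('cat /home/milimetric/load.test.seq.30k | '
       'python ./eventlogging-processor "%{seqId}d %q" '
       'stdin:// file:///home/milimetric/out.load.30k')
N_EVENTS = 31492

times = []
for _ in range(4):
    start = time.time()
    subprocess.run(CMD, shell=True, check=True)
    times.append(time.time() - start)

mean = sum(times) / len(times)
print('mean %.4fs, ~%d events/s' % (mean, N_EVENTS / mean))
</syntaxhighlight>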
 
Processor (take 2)
* using a patch from Andrew (https://gerrit.wikimedia.org/r/#/c/221664/), which shouldn't affect performance for file reading
* consuming from a file generated using kafkacat now, with the standard format expected in prod
* 30000 events exactly, as sampled from the prod stream, so some of them are invalid (truncated, bad data, etc.)
* average of 4 runs: 21.3743 seconds, so ~'''1403''' events per second
* command: <code>time cat /home/milimetric/load.test.sample | python bin/eventlogging-processor "%q %{recvFrom}s %{seqId}d %t %h %{userAgent}i" stdin:// file:///home/milimetric/out.load.sample</code>
* Even small variations in the format and in the number of invalid events seem to affect performance quite a bit, so we should leave ourselves a fairly large margin in case we get a spike of invalid events (a sketch of how the format string drives parsing follows this list)
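One reason format variations matter: the processor has to turn the raw-line format string into a regular expression and match it against every incoming line. A simplified sketch of that idea, with field patterns that are assumptions rather than the actual eventlogging code:

<syntaxhighlight lang="python">
import re

# Assumed per-field patterns; eventlogging-processor defines its own.
FIELD_PATTERNS = {
    '%q': r'(?P<query>\?\S+)',              # url-encoded event payload
    '%{recvFrom}s': r'(?P<recvFrom>\S+)',   # receiving host
    '%{seqId}d': r'(?P<seqId>\d+)',         # sequence id
    '%t': r'(?P<timestamp>\S+)',            # timestamp
    '%h': r'(?P<clientIp>\S+)',             # client IP
    '%{userAgent}i': r'(?P<userAgent>.*)',  # raw user agent
}

def compile_format(fmt):
    parts = [FIELD_PATTERNS.get(tok, re.escape(tok)) for tok in fmt.split()]
    return re.compile(' '.join(parts))

raw_re = compile_format('%q %{recvFrom}s %{seqId}d %t %h %{userAgent}i')
m = raw_re.match('?%7B%22schema%22%3A%22X%22%7D cp1066.eqiad.wmnet 42 '
                 '2015-07-01T00:00:00 10.0.0.1 Mozilla/5.0')
print(m.groupdict() if m else 'invalid line')
</syntaxhighlight>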
 
Processor (take 3 - direct from kafka)
* reading from the beginning of a kafka topic so we're not limited by the number of events
* command: <code>time python bin/eventlogging-processor "%q %{recvFrom}s %{seqId}d %t %h %{userAgent}i" "kafka:///analytics1012.eqiad.wmnet:9092?topic=eventlogging-client-side&auto_offset_reset=smallest" file:///home/milimetric/out.load.from.kafka</code>
* 40.229 seconds: 40467 events
* 78.812 seconds: 78185 events
* 54.090 seconds: 65253 events
* Average: ~'''1062''' events per second, with fairly high variance, again suggesting we should give ourselves a large margin
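For the Kafka runs, a stripped-down consumer loop makes the measurement method concrete. This sketch assumes the kafka-python client; the actual processor consumed through its own <code>kafka://</code> URI handler:

<syntaxhighlight lang="python">
import time

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    'eventlogging-client-side',
    bootstrap_servers='analytics1012.eqiad.wmnet:9092',
    auto_offset_reset='earliest',  # same idea as auto_offset_reset=smallest
    consumer_timeout_ms=5000,      # stop once the topic is drained
)

count = 0
start = time.time()
for message in consumer:
    count += 1                     # parsing/validation would happen here
elapsed = time.time() - start
print('%d events in %.3fs, ~%d events/s' % (count, elapsed, count / elapsed))
</syntaxhighlight>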
 
Processor (take 4 - direct from kafka but with custom code to make all events invalid)
* Idea: does the processing rate suffer if all events are invalid?
* 18.938 seconds, 13348 events, ~'''704''' events per second
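The likely cause of the drop is that every invalid event takes the exception path: the exception has to be built and logged on top of the validation work. A toy illustration of that path's shape (the schema and event here are made up; the real processor validates with jsonschema against schemas fetched from Meta):

<syntaxhighlight lang="python">
import logging

import jsonschema

# Toy schema standing in for a real EventLogging schema.
SCHEMA = {
    'type': 'object',
    'properties': {'value': {'type': 'number'}},
    'required': ['value'],
}

def process(event):
    try:
        jsonschema.validate(event, SCHEMA)
        return event
    except jsonschema.ValidationError:
        # Building the exception and logging it costs far more than the
        # happy path, so 100% invalid input drags the rate down.
        logging.exception('invalid event: %r', event)
        return None
</syntaxhighlight>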
 
Processor (take 5 - all events invalid, no logging to slow down the console output)
* Idea: does the logger output, which goes to the console in these tests, slow down the processing?
* 28.326 seconds, 76995 events, '''2719''' per second!!!
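A micro-benchmark of just the logging cost, a sketch using only the standard library, shows the same effect in isolation:

<syntaxhighlight lang="python">
import logging
import sys
import time

log = logging.getLogger('el')
log.addHandler(logging.StreamHandler(sys.stderr))

def bench(n):
    start = time.time()
    for i in range(n):
        log.error('invalid event %d', i)  # simulates per-event error output
    return n / (time.time() - start)

print('with console handler: %.0f msgs/s' % bench(50000))
logging.disable(logging.CRITICAL)         # what take 5 effectively did
print('logging disabled:     %.0f msgs/s' % bench(50000))
</syntaxhighlight>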
 
Processor (take 6)
* using Andrew's fancy parallel processor (times below appear to be two runs per configuration; a sketch of the fan-out idea follows this list)
* 1 process, 100k events: 71.439 seconds, 68.734
* 2 processes, 100k events: 35.335 seconds, 35.794
* 5 processes, 100k events: 15.269 seconds, 15.124
* 7 processes, 100k events: 12.959 seconds, 12.772
* 10 processes, 100k events: 11.291 seconds, 11.xxx
* 12 processes, 100k events: 11.000 seconds, 10.831
* 24 processes, 100k events: 9.713 seconds, 9.843
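A sketch of the fan-out idea with the standard library (function names here are illustrative, not the actual patch). Note the diminishing returns in the numbers above: 24 processes buy only about a 7x speedup over 1, which is consistent with per-event serialization and shared I/O capping the scaling.

<syntaxhighlight lang="python">
import multiprocessing

def process_line(line):
    # Stand-in for the parse + validate work on one raw event.
    return line.strip().split(' ', 1)

def run(lines, processes):
    # chunksize amortizes inter-process communication over many events.
    with multiprocessing.Pool(processes) as pool:
        return pool.map(process_line, lines, chunksize=1000)

if __name__ == '__main__':
    lines = ['%d ?%%7B%%7D' % i for i in range(100000)]
    for n in (1, 2, 5, 7, 10, 12, 24):
        run(lines, n)  # wrap in time.time() calls to build the table above
</syntaxhighlight>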
 
Consumer
* NOTE: mysql server was installed on the same host as the eventlogging-consumer
* NOTE: tried this with four different batch sizes (400, 1000, 4000, 8000) and it didn't seem to make a substantial difference, so I just used the default configured batch size of 400. This suggests that the bottleneck is most likely in preparing and organizing the events for insertion, not in the insertion itself.
* 4 runs average: 11.8945 seconds, so ~'''2647''' events per second
* command: <code>time cat ~/out.load.30k | python eventlogging-consumer stdin:// mysql://root:root@127.0.0.1/log?charset=utf8</code>
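A sketch of the batching idea on the consumer side: sort and group events by schema (each schema gets its own table) and insert each group in multi-row batches. The driver, table layout, and column names are assumptions; the real consumer is configured by the <code>mysql://</code> URI above:

<syntaxhighlight lang="python">
import itertools
import json

import pymysql  # assumed driver for this sketch

def insert_events(conn, events, batch_size=400):
    # Organizing events by schema is the prep work that the batch-size
    # experiment suggests dominates the cost, not the inserts themselves.
    events.sort(key=lambda e: e['schema'])
    with conn.cursor() as cur:
        for schema, group in itertools.groupby(events,
                                               key=lambda e: e['schema']):
            rows = [(e['uuid'], e['timestamp'], json.dumps(e['event']))
                    for e in group]
            sql = ('INSERT INTO `{}` (uuid, timestamp, event) '
                   'VALUES (%s, %s, %s)').format(schema)
            for i in range(0, len(rows), batch_size):
                cur.executemany(sql, rows[i:i + batch_size])
    conn.commit()
</syntaxhighlight>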
 
== Benchmarking DB inserts ==
 
When batching is turned on (i.e. events for distinct schemas are grouped together), these are some numbers we got on vanadium:
 
* Inserting 300 events takes about 1 sec
* Inserting 400 events takes less than 2 secs (~1.6)
* Inserting 500 events takes about 2 secs
* Inserting 600 events takes over 2 secs
* Inserting 100/1500 events takes 3.5 secs
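Taken together these numbers work out to roughly 250 to 300 rows per second. A sketch of how they could be reproduced, with an assumed table name and column layout:

<syntaxhighlight lang="python">
import json
import time
import uuid

import pymysql  # assumed driver

conn = pymysql.connect(host='127.0.0.1', user='root',
                       password='root', db='log', charset='utf8')

def time_insert(n):
    rows = [(uuid.uuid4().hex, '20170407141300', json.dumps({}))
            for _ in range(n)]
    start = time.time()
    with conn.cursor() as cur:
        cur.executemany(
            'INSERT INTO SomeSchema_1 (uuid, timestamp, event) '
            'VALUES (%s, %s, %s)', rows)
    conn.commit()
    return time.time() - start

for n in (300, 400, 500, 600):
    print('%d rows: %.2f secs' % (n, time_insert(n)))
</syntaxhighlight>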
