
Difference between revisions of "Analytics/Systems/Varnishkafka"

imported>Milimetric
 
imported>Krinkle

Revision as of 20:28, 6 November 2017

Varnishkafka is a daemon that runs on all the Wikimedia frontend caching hosts. Varnish logs HTTP requests in its own format in shared memory, so Varnishkafka uses the Varnish Log API to read that data, formats it according to a user-supplied configuration, and sends the result to a specific Kafka topic (using librdkafka).
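The read, format and send flow described above can be sketched roughly as follows. This is only an illustration in Python, not Varnishkafka's actual code: the field names, the FORMAT mapping, the topic name and the in-memory FakeProducer are all hypothetical stand-ins for the Varnish Log API, the user-supplied format and librdkafka.

```python
import json

# Hypothetical format spec: output field name -> key in the raw log record.
# The real format is user-configured; these names are assumptions.
FORMAT = {
    "uri_path": "url",
    "http_status": "status",
    "ip": "client_ip",
}


class FakeProducer:
    """In-memory stand-in for a Kafka producer (assumption, not librdkafka)."""

    def __init__(self):
        self.messages = []

    def produce(self, topic, value):
        # Record the message instead of sending it over the network.
        self.messages.append((topic, value))


def format_record(raw, fmt):
    """Pick and rename the fields listed in fmt; missing fields become '-'."""
    return {out_name: raw.get(src, "-") for out_name, src in fmt.items()}


def forward(records, producer, topic, fmt=FORMAT):
    """Format each raw record as JSON and hand it to the producer."""
    for raw in records:
        payload = json.dumps(format_record(raw, fmt), sort_keys=True)
        producer.produce(topic, payload)


if __name__ == "__main__":
    producer = FakeProducer()
    raw_log = [{"url": "/wiki/Main_Page", "status": "200", "client_ip": "192.0.2.1"}]
    forward(raw_log, producer, "example_topic")
    print(producer.messages)
```

The sketch keeps the three stages separate (reading records, formatting them, producing them) so that the formatting step can be checked without a running Varnish or Kafka.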

Where is the code?

Gerrit: https://gerrit.wikimedia.org/r/#/admin/projects/operations/software/varnish/varnishkafka

Github: https://github.com/wikimedia/varnishkafka

Testing a code change

Use the Docker image as outlined in https://gerrit.wikimedia.org/r/#/admin/projects/operations/software/varnish/varnishkafka/testing. The wise reader might ask: "What about unit tests? These are integration tests!" In T147432 the Analytics team explored the possibility of adding unit tests to Varnishkafka, but the estimated time needed for code refactoring, testing, releasing, etc. (this software is critical for the team) outweighed the benefit of having unit tests. The team preferred instead to spend more time on creating a flexible and simple integration testing suite.

Varnishkafka instances

You might see references in puppet to Varnishkafka instances. This is because the same Varnish request data can be sliced and formatted in different ways for different purposes:

- Webrequest

- Statsv

- Eventlogging

The Webrequest instance runs on all the cache segments (text, upload, misc and maps), while the Statsv and Eventlogging instances run only on text.
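The placement of instances across cache segments can be written down as a small lookup. This is a sketch for illustration only: the function name and the lowercase instance identifiers are assumptions, and the real assignment lives in puppet.

```python
# Cache segments as listed in the text above.
SEGMENTS = ("text", "upload", "misc", "maps")


def instances_for(segment):
    """Return the Varnishkafka instances expected on a cache segment."""
    if segment not in SEGMENTS:
        raise ValueError(f"unknown cache segment: {segment}")
    # Webrequest runs everywhere.
    instances = ["webrequest"]
    # Statsv and Eventlogging run only on the text segment.
    if segment == "text":
        instances += ["statsv", "eventlogging"]
    return instances


if __name__ == "__main__":
    for segment in SEGMENTS:
        print(segment, instances_for(segment))
```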

Monitoring

https://grafana.wikimedia.org/dashboard/db/varnishkafka