Kafka HTTP purging
The current (2020) mechanism for purging objects from the CDN is based on Purged, a daemon running on all cache nodes. Purged can be configured to read purge messages either from the legacy multicast HTCP purging mechanism or from Kafka. Regardless of the source from which purge messages are read, Purged converts them into HTTP PURGE requests sent locally to both the ATS cache backend and the Varnish cache frontend.
Typical purge flow
- MediaWiki detects that a purge is needed. It produces a Kafka message on a given topic for each individual URI that needs to be purged.
- Purged, the daemon running on every relevant cache machine, consumes the appropriate Kafka topics and receives a copy of the purge message. Purged forwards the request to the Varnish and ATS instances on localhost over a persistent HTTP/1.1 connection, using the PURGE request method.
- PURGE requests are handled by ATS and Varnish and cause the cache object in question to be invalidated.
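The second step above can be sketched as follows. This is only an illustration of turning a purged URL into an HTTP/1.1 PURGE request, not Purged's actual implementation: the function name purge_request is hypothetical, and the assumption that the request carries just the path and a Host header is a simplification.

```shell
# Hypothetical helper (not part of Purged): print the raw HTTP/1.1 PURGE
# request that would be sent to a local cache instance for a given URL.
purge_request() {
    url=$1
    rest=${url#*://}              # drop the scheme, e.g. "https://"
    host=${rest%%/*}              # everything before the first slash
    case $rest in
        */*) path=/${rest#*/} ;;  # path, including any query string
        *)   path=/ ;;            # bare hostname: purge the root path
    esac
    # Assumption: request line plus Host header only.
    printf 'PURGE %s HTTP/1.1\r\nHost: %s\r\n\r\n' "$path" "$host"
}
```

For example, purge_request 'https://example.org/foo?x=y' prints a request line of PURGE /foo?x=y HTTP/1.1 followed by Host: example.org.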
MediaWiki
All CDN purges are generated in MediaWiki via the CdnCacheUpdate::purge method. Currently MediaWiki is configured to send the generated purges to the EventRelayer under the cdn-url-purges key. The EventBus extension provides an implementation of the EventRelayer, CdnPurgeEventRelayer, that creates purge events and sends them to Kafka using the normal EventBus flow, via the eventgate service.
Relevant configuration:
// Configuration for the EventRelayer to send purges to the resource-purge Kafka topic
'wgEventRelayerConfig' => [
    'cdn-url-purges' => [
        'class' => \MediaWiki\Extension\EventBus\Adapters\EventRelayer\CdnPurgeEventRelayer::class,
        'stream' => 'resource-purge',
    ],
    'default' => [
        'class' => EventRelayerNull::class,
    ],
],
// EventBus stream configuration
'wgEventServiceDefault' => 'eventgate-main',
One-off purge
On a deployment server, run:
$ echo 'https://example.org/foo?x=y' | mwscript-k8s --attach -- purgeList.php
If the URL is under /static/, it must always be purged via the hostname en.wikipedia.org. This is the shared virtual hostname under which Varnish caches content for /static/, regardless of the requesting wiki hostname.
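The /static/ rule can be applied mechanically before purging. The helper below is a minimal sketch; the function name static_purge_url is hypothetical. It rewrites the hostname of any URL whose path starts with /static/ to en.wikipedia.org and leaves all other URLs untouched.

```shell
# Hypothetical helper: rewrite the hostname of /static/ URLs to the shared
# virtual hostname en.wikipedia.org; other URLs pass through unchanged.
static_purge_url() {
    printf '%s\n' "$1" |
        sed -E 's#^(https?://)[^/]+(/static/)#\1en.wikipedia.org\2#'
}
```

The output can then be piped into the purge command shown above, for example: static_purge_url 'https://de.wikipedia.org/static/images/foo.png' | mwscript-k8s --attach -- purgeList.php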