From Wikitech-static
Revision as of 07:28, 29 March 2022 by imported>JMeybohm (→‎List existing actions)

What is requestctl

Requestctl is the tool we use to control request patterns at various layers of the infrastructure, currently mainly at the edge caches.

In this page, we'll show how the tool is used at the WMF. For more details about its schema and its general command line options, see requestctl's README file.

In production, we keep all requestctl objects under the requestctl directory of the private puppet repository.

Quick start: adding a new rule

Let's say we want to throttle, per IP, requests coming from Azure that have no Accept-Encoding header, carry a Connection: keep-alive header, and go to a special page.

We already have the Azure ipblocks: they originate from a cronjob running on the puppetmasters and live in the file requestctl/request-ipblocks/cloud/azure.yaml. We can list the ipblocks known to the datastore with:

:~$ requestctl get ipblock -o json | jq -r 'keys[]'

Now let's check whether we have a request pattern that corresponds to not having an Accept-Encoding header:

:~$ requestctl get pattern
name                pattern
------------------  --------------------------------
req/cache_buster_q  ?q=\w{12}
ua/urllib3          User-Agent: ^python-urllib3/.*$
ua/requests         User-Agent: ^python-requests/.*$
ua/curl             User-Agent: ^curl/.*$
ua/MediaWiki        User-Agent: ^MediaWiki/.*$
sites/commonswiki   Host:
sites/wikidata      Host:
sites/enwiki        Host:
url/api             url:^/w/(api|rest).php
url/docroot         url:^/[?$]
url/page            url:^/wiki/
url/semicolon_page  url:^/wiki/.+:+

It doesn't look like we do! So let's add a file named /srv/private/requestctl/request-patterns/req/no_accept_encoding.yaml, with the following content:

header: 'Accept-Encoding'

Because we omit header_value, this pattern translates to "header not present" (see the README, again).
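Before adding a new pattern file, the same check can also be made on the repository side. The following is a minimal sketch, not part of requestctl: the function name patterns_for_header is hypothetical, and it assumes that pattern files live under <repo>/request-patterns/ and declare the header in a top-level "header:" key, as in the examples on this page.

```shell
# Hypothetical helper: list pattern files that already match on a given header.
# Assumes the layout <repo>/request-patterns/<scope>/<name>.yaml and a
# top-level "header:" key, as shown in the examples on this page.
patterns_for_header() {
    local repo="$1"     # e.g. /srv/private/requestctl
    local header="$2"   # e.g. Accept-Encoding
    # -i: header names are case-insensitive; the value may be quoted or not.
    grep -rliE -- "^header:[[:space:]]*['\"]?${header}['\"]?[[:space:]]*\$" \
        "$repo/request-patterns" | sort
}
```

For example, patterns_for_header /srv/private/requestctl Accept-Encoding prints nothing if no pattern file refers to that header yet.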

Now let's sync our objects to etcd:

puppetmaster1001:~$ sudo requestctl sync -g /srv/private/requestctl pattern
2022-03-28 14:56:23,995 - reqctl (cli:_write:359) - INFO - Updating pattern ua/MediaWiki
2022-03-28 14:56:24,005 - reqctl (cli:_write:359) - INFO - Updating pattern ua/curl
2022-03-28 14:56:24,014 - reqctl (cli:_write:359) - INFO - Updating pattern ua/urllib3
2022-03-28 14:56:24,024 - reqctl (cli:_write:359) - INFO - Updating pattern ua/requests
2022-03-28 14:56:24,034 - reqctl (cli:_write:359) - INFO - Updating pattern sites/wikidata
2022-03-28 14:56:24,044 - reqctl (cli:_write:359) - INFO - Updating pattern sites/commonswiki
2022-03-28 14:56:24,054 - reqctl (cli:_write:359) - INFO - Updating pattern sites/enwiki
2022-03-28 14:56:24,064 - reqctl (cli:_write:359) - INFO - Updating pattern req/cache_buster
2022-03-28 14:56:24,073 - reqctl (cli:_write:359) - INFO - Updating pattern req/specific_page
2022-03-28 14:56:24,085 - reqctl (cli:_write:359) - INFO - Updating pattern req/cache_buster_q
2022-03-28 14:56:24,094 - reqctl (cli:_write:362) - INFO - Creating pattern req/no_accept_encoding
2022-03-28 14:56:24,103 - reqctl (cli:_write:359) - INFO - Updating pattern url/semicolon_page
2022-03-28 14:56:24,113 - reqctl (cli:_write:359) - INFO - Updating pattern url/api
2022-03-28 14:56:24,122 - reqctl (cli:_write:359) - INFO - Updating pattern url/docroot
2022-03-28 14:56:24,133 - reqctl (cli:_write:359) - INFO - Updating pattern url/page

(Note that our object has been created.)

Now we can do the same for Connection: keep-alive. We'll create the file /srv/private/requestctl/request-patterns/req/keepalive.yaml containing:

header: Connection
header_value: keep-alive

and sync again with the same command.

Now we have all the ingredients, and we can move on to writing the action at /srv/private/requestctl/request-actions/cache-text/bot_from_azure.yaml:

# This should tell anyone what this rule does
comment: "Throttle requests with keepalive but no accept-encoding, coming from azure."
# This is the default. For now add it.
enabled: false
# each pattern and ipblock is referred to using {pattern,ipblock}@<scope>/<name>
expression: pattern@req/keepalive AND pattern@req/no_accept_encoding AND ipblock@cloud/azure
# Only bother with cache misses
cache_miss_only: true
# We want to throttle individual ips
do_throttle: true
throttle_per_ip: true
# Allow 100 requests per 10 seconds, and if exceeded, ban for 1 minute
throttle_requests: 100
throttle_interval: 10
throttle_duration: 60
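Since the expression refers to other objects by name, it can help to confirm that each of them exists in the repository before syncing. The sketch below is a hypothetical helper, not a requestctl feature: check_action_refs is an invented name, and it assumes the directory layout used on this page (request-patterns/<scope>/<name>.yaml and request-ipblocks/<scope>/<name>.yaml) and a single-line "expression:" key.

```shell
# Hypothetical helper: verify that every pattern@/ipblock@ reference in an
# action file has a matching YAML file in the repository.
# Assumed layout (taken from the paths used on this page):
#   <repo>/request-patterns/<scope>/<name>.yaml
#   <repo>/request-ipblocks/<scope>/<name>.yaml
check_action_refs() {
    local repo="$1"     # e.g. /srv/private/requestctl
    local action="$2"   # path to the action YAML file
    local missing=0 ref kind name file
    # Extract tokens such as pattern@req/keepalive or ipblock@cloud/azure.
    for ref in $(grep -E '^expression:' "$action" \
                 | grep -oE '(pattern|ipblock)@[A-Za-z0-9_/-]+'); do
        kind="${ref%%@*}"    # "pattern" or "ipblock"
        name="${ref#*@}"     # "<scope>/<name>"
        file="$repo/request-${kind}s/$name.yaml"
        if [ ! -f "$file" ]; then
            echo "missing $kind: $name (expected $file)"
            missing=1
        fi
    done
    # Non-zero exit status if at least one reference has no file.
    return "$missing"
}
```

Running check_action_refs /srv/private/requestctl /srv/private/requestctl/request-actions/cache-text/bot_from_azure.yaml prints one line per missing object and returns non-zero if any is missing. It only checks files on disk, not what has already been synced to the datastore.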

Now we can just run sudo requestctl sync -g /srv/private/requestctl action, and the object will be in the datastore:

:~$ sudo requestctl get action cache-text/bot_from_azure -o yaml
  cache_miss_only: true
  comment: Throttle requests with keepalive but no accept-encoding, coming from azure
  do_throttle: true
  enabled: false
  expression: pattern@req/keepalive AND pattern@req/no_accept_encoding AND ipblock@cloud/azure
  resp_reason: ''
  resp_status: 429
  sites: []
  throttle_duration: 60
  throttle_interval: 10
  throttle_per_ip: true
  throttle_requests: 100

The rule won't show up in Varnish yet, because it is still disabled. To make that happen, we need to run:

puppetmaster1001:~$ sudo requestctl enable cache-text/bot_from_azure

At this point, the rule will appear on all cache-text nodes. That is because we didn't define the sites property for our new rule.

Commands recap

List existing actions

# All actions.
requestctl get action -o yaml
# A specific action
requestctl get action cache-text/generic_ua_clouds
# All enabled actions
requestctl get action -o json | jq 'to_entries[] | select(.value.enabled == true)'

Enable / Disable an action

# Writes to the datastore, needs sudo
sudo requestctl enable cache-text/generic_ua_clouds
sudo requestctl disable cache-text/generic_ua_clouds

List existing patterns

# All patterns
requestctl get pattern
# A specific pattern
requestctl get pattern ua/requests

Modifying any object

  • SSH to a puppetmaster frontend
  • Modify the YAML file under /srv/private/requestctl and commit the change
  • Run sudo requestctl sync -g /srv/private/requestctl {pattern,ipblock,action}

Removing an object

  • If you're removing a pattern or an ipblock, ensure it's not referenced by any action object
  • Remove the object file from the git repository and commit the change
  • Run sudo requestctl sync --purge -g /srv/private/requestctl {pattern,ipblock,action}
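The first step above, checking that no action still references the object, can be sketched with grep over the action files. This is a hypothetical helper rather than a requestctl command: find_action_refs is an invented name, and it assumes actions live under <repo>/request-actions/ with a single-line "expression:" key, as in the quick start.

```shell
# Hypothetical helper: list action files whose expression mentions an object.
# Assumes actions are stored under <repo>/request-actions/ and keep their
# expression on a single "expression:" line, as in the quick start example.
find_action_refs() {
    local repo="$1"   # e.g. /srv/private/requestctl
    local ref="$2"    # e.g. pattern@req/keepalive or ipblock@cloud/azure
    # The trailing group stops req/cache_buster from matching req/cache_buster_q.
    grep -rlE -- "^expression:.*${ref}([^A-Za-z0-9_/-]|\$)" \
        "$repo/request-actions" | sort
}
```

If find_action_refs /srv/private/requestctl pattern@req/keepalive prints any file, those actions should be updated or removed first. The search covers only the files in the repository, so it is worth syncing actions before purging patterns or ipblocks.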