Apache Traffic Server
Revision as of 18:30, 4 March 2018
Apache Traffic Server is a caching proxy server.
Architecture
There are three distinct processes in Traffic Server:
- traffic_server
- traffic_manager
- traffic_cop
traffic_server is the process responsible for dealing with user traffic: accepting connections, processing requests, and serving documents from cache or the origin server. traffic_server is an event-driven, multi-threaded process. Threads are used to take advantage of multiple CPUs, not to handle multiple connections concurrently (e.g. by spawning a thread per connection, or by using a thread pool). Instead, an event system schedules work on threads. ATS uses a state machine to handle each transaction (a single HTTP request from a client and the response Traffic Server sends to that client) and provides a system of hooks through which plugins (e.g. Lua) can inspect or alter a transaction at specific points. Specific timers are used in the various states.
traffic_manager is responsible for launching, monitoring and configuring traffic_server, handling the statistics interface, cluster administration and virtual IP failover.
traffic_cop is a watchdog program monitoring the health of both traffic_manager and traffic_server. This has traditionally been the command to use in order to start ATS. In a systemd world, it can probably be avoided, and traffic_manager can be used as the program to be executed in order to start the unit.
Configuration
The changes to the default configuration required to get a caching proxy are:
# /etc/trafficserver/remap.config
map client_url origin_server_url
The following rules map grafana and phabricator to their respective backends and define a catchall for requests that don't match either of the first two rules:
# /etc/trafficserver/remap.config
map http://grafana.wikimedia.org/ http://krypton.eqiad.wmnet/
map http://phabricator.wikimedia.org/ http://iridium.eqiad.wmnet/
map / http://deployment-mediawiki05.deployment-prep.eqiad.wmflabs/
# /etc/trafficserver/records.config
CONFIG proxy.config.http.server_ports STRING 3128 3128:ipv6
CONFIG proxy.config.admin.synthetic_port INT 8083
CONFIG proxy.config.process_manager.mgmt_port INT 8084
CONFIG proxy.config.admin.user_id STRING trafficserver
CONFIG proxy.config.http.cache.required_headers INT 1
CONFIG proxy.config.url_remap.pristine_host_hdr INT 1
CONFIG proxy.config.disable_configuration_modification INT 1
If proxy.config.http.cache.required_headers is set to 2, which is the default, the origin server is required to set an explicit lifetime, from either Expires or Cache-Control: max-age. By setting required_headers to 1, objects with Last-Modified are considered for caching too. Setting the value to 0 means that no headers are required to make documents cacheable.
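To see which of these modes a host is configured for, the value can be read straight from the configuration file. The following is a minimal sketch, not part of ATS: records_get and describe_required_headers are hypothetical helper names, and the parser only understands plain "CONFIG <record> <TYPE> <value>" lines.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with ATS.

# records_get FILE RECORD
# Print the value of the first "CONFIG RECORD TYPE VALUE..." line in FILE.
records_get() {
    awk -v rec="$2" '
        $1 == "CONFIG" && $2 == rec {
            # Fields 4..NF form the value (it may contain spaces).
            val = $4
            for (i = 5; i <= NF; i++) val = val " " $i
            print val
            exit
        }
    ' "$1"
}

# describe_required_headers VALUE
# Explain a required_headers value using the semantics described above.
describe_required_headers() {
    case "$1" in
        0) echo "no headers required" ;;
        1) echo "Last-Modified or an explicit lifetime required" ;;
        2) echo "explicit lifetime required (Expires or Cache-Control: max-age)" ;;
        *) echo "unknown value: $1" ;;
    esac
}
```

For example, describe_required_headers "$(records_get /etc/trafficserver/records.config proxy.config.http.cache.required_headers)" would print the second description for the configuration shown above.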
Logging
Diagnostic output can be sent to standard output and error instead of the default logfiles, which is a good idea in order to take advantage of systemd's journal.
# /etc/trafficserver/records.config
CONFIG proxy.config.diags.output.status STRING O
CONFIG proxy.config.diags.output.note STRING O
CONFIG proxy.config.diags.output.warning STRING O
CONFIG proxy.config.diags.output.error STRING E
CONFIG proxy.config.diags.output.fatal STRING E
CONFIG proxy.config.diags.output.alert STRING E
CONFIG proxy.config.diags.output.emergency STRING E
Health checks
Load the `healthchecks` plugin:
# /etc/trafficserver/plugin.config
healthchecks.so /etc/trafficserver/healthchecks.conf
Define the health check:
# /etc/trafficserver/healthchecks.conf
/check /etc/trafficserver/ts-alive text/plain 200 403
Response body:
# /etc/trafficserver/ts-alive
All good
With the above configuration, GET requests to `/check` will result in 200 responses from ATS with the response body defined in `/etc/trafficserver/ts-alive`.
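A monitoring probe for this endpoint can be sketched as follows. This is an illustration under assumptions: ats_healthcheck is a hypothetical function name, the URL in the usage note assumes the check is queried on the proxy port 3128 configured above, and the expected body is the content of ts-alive shown above.

```shell
#!/bin/sh
# ats_healthcheck URL
# Hypothetical probe: succeeds only if URL returns a successful status
# and the body matches the content of /etc/trafficserver/ts-alive.
ats_healthcheck() {
    expected="All good"
    # --fail makes curl exit non-zero on HTTP errors (status >= 400).
    body=$(curl --silent --fail --max-time 5 "$1") || return 1
    # Command substitution strips the trailing newline of the body.
    [ "$body" = "$expected" ]
}
```

Usage: ats_healthcheck http://127.0.0.1:3128/check && echo healthy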
Cache inspector
To enable the cache inspector functionality, add the following remap rules:
map /cache-internal/ http://{cache-internal}
map /cache/ http://{cache}
map /stat/ http://{stat}
map /test/ http://{test}
map /hostdb/ http://{hostdb}
map /net/ http://{net}
map /http/ http://{http}
systemd unit
# /etc/systemd/system/trafficserver.service
[Unit]
Description=Apache Traffic Server
After=network.service systemd-networkd.service network-online.target
[Service]
ExecStart=/usr/bin/traffic_manager --nosyslog
ExecReload=/usr/bin/traffic_ctl config reload
Restart=always
RestartSec=1
LimitNOFILE=500000
LimitMEMLOCK=90000
# PrivateTmp causes the following error:
# FATAL: unable to load remap.config
# traffic_server: using root directory '/usr'
#PrivateTmp=yes
CapabilityBoundingSet=CAP_CHOWN CAP_DAC_OVERRIDE CAP_IPC_LOCK CAP_KILL CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_SETGID CAP_SETUID
# Setting SystemCallFilter as follows seems fine at first, but then objects do not get cached. Needs further investigation.
# SystemCallFilter=~acct modify_ldt add_key adjtimex clock_adjtime delete_module fanotify_init finit_module get_mempolicy init_module io_destroy io_getevents iopl ioperm io_setup io_submit io_cancel kcmp kexec_load keyctl lookup_dcookie mbind migrate_pages mount move_pages open_by_handle_at perf_event_open pivot_root process_vm_readv process_vm_writev ptrace remap_file_pages request_key set_mempolicy swapoff swapon umount2 uselib vmsplice
# MemoryDenyWriteExecute=true
ReadOnlyDirectories=/usr
ReadOnlyDirectories=/var/lib
#
#ReadOnlyDirectories=/etc
#ReadWriteDirectories=/etc/trafficserver/internal
#ReadWriteDirectories=/etc/trafficserver/snapshots
Cheatsheet
Show non-default configuration values:
sudo traffic_ctl config diff
Configuration reload:
sudo traffic_ctl config reload
Check if a reload/restart is needed:
sudo traffic_ctl config status
Start in debugging mode, dumping headers:
sudo traffic_server -T http_hdrs
Access metrics from the CLI:
traffic_ctl metric get proxy.process.http.cache_hit_fresh
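Counters like the one above are most useful in combination. The sketch below computes a hit ratio from two counter values; hit_ratio is a hypothetical helper, and pairing cache_hit_fresh with proxy.process.http.cache_miss_cold in the usage note is an assumption about which counters are of interest, not a complete accounting of cache results.

```shell
#!/bin/sh
# hit_ratio HITS MISSES
# Hypothetical helper: print hits as a percentage of hits + misses,
# with two decimals. Prints 0.00 when both counters are zero.
hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * h / total
    }'
}
```

Assuming traffic_ctl metric get prints the metric name followed by its value, the inputs could be collected with something like hits=$(traffic_ctl metric get proxy.process.http.cache_hit_fresh | awk '{print $2}') and similarly for the miss counter, then passed to hit_ratio "$hits" "$misses".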
Lua scripting
ATS plugins can be written in Lua. As an example, this is how to choose an origin server dynamically:
# /etc/trafficserver/remap.config
map http://127.0.0.1:3128/ http://$origin_server_ip/ @plugin=/usr/lib/trafficserver/modules/tslua.so @pparam=/var/tmp/ats-set-backend.lua
reverse_map http://$origin_server_ip/ http://127.0.0.1:3128/
Choosing origin server
Selecting the appropriate origin server for a given request can be done using ATS mapping rules. The same goal can be achieved in Lua:
-- /var/tmp/ats-set-backend.lua
function do_remap()
    local url = ts.client_request.get_url()
    -- Send REST API requests to a dedicated origin server
    if url:match("/api/rest_v1/") then
        ts.client_request.set_url_host('origin-server.eqiad.wmnet')
        ts.client_request.set_url_port(80)
        ts.client_request.set_url_scheme('http')
        return TS_LUA_REMAP_DID_REMAP
    end
    -- Any other request keeps the target from remap.config
    return 0
end
Negative response caching
By default, ATS caches negative responses such as 404, 503 and others only if the response defines a max-age via the Cache-Control header. This behavior can be changed by enabling the configuration option proxy.config.http.negative_caching_enabled, which allows caching of negative responses that do not specify Cache-Control. If negative caching is enabled, the lifetime of negative responses without Cache-Control is defined by proxy.config.http.negative_caching_lifetime, in seconds, defaulting to 1800.
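The decision described in this paragraph can be written down as a small worked example. It models only the text above, not the ATS implementation; negative_response_ttl and its arguments are hypothetical.

```shell
#!/bin/sh
# negative_response_ttl MAX_AGE ENABLED [LIFETIME]
#   MAX_AGE   max-age from the response's Cache-Control, empty if absent
#   ENABLED   value of proxy.config.http.negative_caching_enabled (0 or 1)
#   LIFETIME  value of proxy.config.http.negative_caching_lifetime
#             (defaults to 1800 seconds)
# Prints how long a negative response is kept, or "uncached".
negative_response_ttl() {
    max_age=$1
    enabled=$2
    lifetime=${3:-1800}
    if [ -n "$max_age" ]; then
        echo "$max_age"      # the origin set an explicit lifetime
    elif [ "$enabled" = "1" ]; then
        echo "$lifetime"     # fall back to the configured lifetime
    else
        echo "uncached"      # no lifetime and negative caching disabled
    fi
}
```

With negative caching disabled, negative_response_ttl "" 0 prints "uncached"; with it enabled, negative_response_ttl "" 1 prints the default lifetime.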
One might, however, want to cache 404 responses that do not send Cache-Control without caching any 503 response. Because proxy.config.http.negative_caching_enabled enables the behavior for a whole set of negative responses, and there is no way to specify the list of negative response status codes to cache, the goal can be achieved by setting Cache-Control in Lua only for certain status codes:
function read_response()
    local status_code = ts.server_response.get_status()
    local cache_control = ts.server_response.header['Cache-Control']
    -- Cache 404 responses without CC for 10s
    if status_code == 404 and not cache_control then
        ts.server_response.header['Cache-Control'] = 'max-age=10'
    end
end

function do_remap()
    ts.hook(TS_LUA_HOOK_READ_RESPONSE_HDR, read_response)
    return 0
end
Setting X-Cache-Int
As another example, the following script takes care of setting the X-Cache-Int response header:
-- /var/tmp/ats-set-x-cache-int.lua

-- Remember the cache lookup result for use when the response is sent
function cache_lookup()
    local cache_status = ts.http.get_cache_lookup_status()
    ts.ctx['cstatus'] = cache_status
end

function cache_status_to_string(status)
    if status == TS_LUA_CACHE_LOOKUP_MISS then
        return "miss"
    end
    if status == TS_LUA_CACHE_LOOKUP_HIT_FRESH then
        return "hit"
    end
    if status == TS_LUA_CACHE_LOOKUP_HIT_STALE then
        return "miss"
    end
    if status == TS_LUA_CACHE_LOOKUP_SKIPPED then
        return "pass"
    end
    return "bug"
end

function gen_x_cache_int()
    local hostname = "cp4242" -- from puppet
    local cache_status = cache_status_to_string(ts.ctx['cstatus'])
    local v = ts.client_response.header['X-Cache-Int']
    local mine = hostname .. " " .. cache_status
    -- Append to an existing X-Cache-Int value, if any
    if v then
        v = v .. ", " .. mine
    else
        v = mine
    end
    ts.client_response.header['X-Cache-Int'] = v
    ts.client_response.header['X-Cache-Status'] = cache_status
end

function do_remap()
    ts.hook(TS_LUA_HOOK_CACHE_LOOKUP_COMPLETE, cache_lookup)
    ts.hook(TS_LUA_HOOK_SEND_RESPONSE_HDR, gen_x_cache_int)
    return 0
end
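Once the script is loaded, the resulting headers can be checked from the command line. The sketch below is an illustration under assumptions: header_value is a hypothetical helper, and the usage note assumes ATS listens on port 3128 as configured earlier.

```shell
#!/bin/sh
# header_value NAME
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of every header called NAME (matched case-insensitively).
header_value() {
    awk -v name="$1" '
        BEGIN { name = tolower(name) }
        {
            sub(/\r$/, "")                  # header lines end in CRLF
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == name) {
                v = substr($0, i + 1)
                sub(/^[ \t]+/, "", v)       # drop whitespace after the colon
                print v
            }
        }
    '
}
```

Usage: curl -s -o /dev/null -D - http://127.0.0.1:3128/ | header_value X-Cache-Int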
Unit testing
The busted framework makes it possible to unit test Lua scripts. It can be installed as follows:
apt install luarocks
luarocks install busted
luarocks install luacov
The following unit tests cover some of the functionality implemented by ats-set-x-cache-int.lua:
-- unit_test.lua
-- Minimal stand-in for the ts table that ATS provides to Lua plugins
_G.ts = { client_response = { header = {} }, ctx = {} }

describe("Busted unit testing framework", function()
    describe("script for ATS Lua Plugin", function()
        it("test - hook", function()
            stub(ts, "hook")
            require("ats-set-x-cache-int")
            local result = do_remap()
            assert.are.equals(0, result)
        end)

        it("test - gen_x_cache_int", function()
            stub(ts, "hook")
            require("ats-set-x-cache-int")
            -- ts.ctx['cstatus'] and the TS_LUA_CACHE_LOOKUP_* constants are
            -- all nil outside ATS, so the status is reported as "miss"
            gen_x_cache_int()
            assert.are.equals('miss', ts.client_response.header['X-Cache-Status'])
            assert.are.equals('cp4242 miss', ts.client_response.header['X-Cache-Int'])
        end)
    end)
end)
Run the tests and generate a coverage report with:
$ busted -c unit_test.lua
●●
2 successes / 0 failures / 0 errors / 0 pending : 0.012771 seconds
$ luacov ; cat luacov.report.out