You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

HTTP timeouts: Difference between revisions

From Wikitech-static
imported>Vgutierrez
(Added envoy)
 
imported>Effie Mouzeli
No edit summary

Revision as of 06:10, 30 October 2019

This page attempts to document the timeouts involved in a request made by a user to a service behind the WMF caching layer.


The entry point for a user is either nginx or ats-tls, depending on the service and on the cache node assigned to the user's IP:

{| class="wikitable"
!TLS termination layer
!SSL handshake timeout
!TTFB (origin server)
!Successive reads (origin server)
!Keepalive timeout (client)
|-
|nginx
|60 seconds (nginx default value)
|180 seconds
|180 seconds (same config parameter as TTFB)
|60 seconds
|-
|ats-tls
|60 seconds
|180 seconds
|200 seconds
|120 seconds
|}

Currently, a major difference between nginx and ats-tls is how they handle POST requests. nginx buffers the entire request before relaying it to the origin (varnish-frontend), whereas ats-tls does not buffer it and relays the connection to varnish-frontend as soon as possible. On nginx, the timeout for receiving the POST body is 60 seconds between read operations; this is the default value and is not explicitly configured.
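Because the nginx limit applies between read operations, it behaves as an idle timeout rather than a cap on the total upload time. The following is a minimal Python sketch of that distinction, assuming the 60-second figure above. The function name and the sample gaps are hypothetical, and the code does not model nginx internals:

```python
def post_body_times_out(gaps_between_reads, idle_timeout=60.0):
    """Return True if any pause between two successive reads of the
    request body exceeds the idle timeout (both in seconds)."""
    return any(gap > idle_timeout for gap in gaps_between_reads)


# A slow upload that keeps sending data: 10 gaps of 30 s each (300 s in total).
slow_but_steady = [30.0] * 10
# A short upload that stalls once for 61 s.
stalled_once = [1.0, 61.0, 1.0]

print(post_body_times_out(slow_but_steady))  # False
print(post_body_times_out(stalled_once))     # True
```

Under this reading, a client on a slow link is not cut off as long as it keeps sending data, while a much shorter upload that stalls once is.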

Our caching system is split into two layers (frontend and backend). There is one implementation of the frontend layer (varnish) and two implementations of the backend layer (varnish-be and ats-be).

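{| class="wikitable"
!Caching layer
!Connect timeout
!TTFB
!Successive reads
|-
|varnish-frontend
|3 seconds<sup>text</sup> / 5 seconds<sup>upload</sup>
|65 seconds<sup>text</sup> / 35 seconds<sup>upload</sup>
|33 seconds<sup>text</sup> / 60 seconds<sup>upload</sup>
|-
|varnish-backend
|3 seconds<sup>text</sup>
|63 seconds<sup>text</sup>
|31 seconds<sup>text</sup>
|-
|ats-backend
|N/A (fused together with TTFB)
|180 seconds<sup>GET</sup> / 180 seconds<sup>POST,PUT</sup>
|200 seconds
|}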
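In the varnish rows above, the frontend values for the text cluster are slightly larger than the backend ones. The Python sketch below records those values as data and computes the margin. The dictionary layout and key names are choices made for this sketch, and the idea that the margin exists so that the inner layer gives up first is an interpretation, not something this page states:

```python
# Timeouts in seconds, copied from the varnish rows of the caching layer table.
# The key names ("connect", "ttfb", "between_reads") are chosen for this sketch.
CACHE_TIMEOUTS = {
    "varnish-frontend": {
        "text": {"connect": 3, "ttfb": 65, "between_reads": 33},
        "upload": {"connect": 5, "ttfb": 35, "between_reads": 60},
    },
    "varnish-backend": {
        "text": {"connect": 3, "ttfb": 63, "between_reads": 31},
    },
}


def frontend_margin(cluster, kind):
    """Seconds by which the frontend timeout exceeds the backend one."""
    frontend = CACHE_TIMEOUTS["varnish-frontend"][cluster][kind]
    backend = CACHE_TIMEOUTS["varnish-backend"][cluster][kind]
    return frontend - backend


for kind in ("ttfb", "between_reads"):
    print(kind, frontend_margin("text", kind))
# ttfb 2
# between_reads 2
```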

After leaving the backend caching layer, the request reaches the appserver. The following timeouts apply to appservers and api:

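{| class="wikitable"
!Layer
!Request timeout
|-
|Nginx (TLS/ats-be requests)
|N/A (same timeouts as the nginx used for TLS termination)
|-
|Envoy (TLS/ats-be requests)
|1 second<sup>connect timeout</sup> / 65 seconds<sup>route timeout</sup>
|-
|Apache
|202 seconds
|-
|PHP
|201 seconds<sup>appservers</sup> / 201 seconds<sup>api</sup>
|-
|Excimer
|[https://github.com/wikimedia/operations-mediawiki-config/blob/dd2f06c71e82cef6a24c8325ede80c4847085f61/wmf-config/set-time-limit.php#L31 60 seconds]<sup>GET</sup> / [https://github.com/wikimedia/operations-mediawiki-config/blob/dd2f06c71e82cef6a24c8325ede80c4847085f61/wmf-config/set-time-limit.php#L29 200 seconds]<sup>POST</sup>
|}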

Note: These timeouts may be larger than those on the caching layer, mainly so that internal clients are served properly.
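To illustrate the note, the Python sketch below finds which layer would give up first for a GET and for a POST. It rests on several assumptions that this page does not confirm: a text-cluster request passes varnish-frontend and varnish-backend before reaching Apache, PHP and Excimer; an internal client talks to the appserver directly; and the layer with the smallest timeout is the one that ends the request. nginx, Envoy and the ats layers are left out for brevity.

```python
# TTFB limits (seconds) for the text cluster, from the caching layer table.
CACHING_LAYERS = {"varnish-frontend": 65, "varnish-backend": 63}


def appserver_layers(method):
    """Request timeouts (seconds) on the appserver for a GET or a POST."""
    excimer = {"GET": 60, "POST": 200}[method]
    return {"Apache": 202, "PHP": 201, "Excimer": excimer}


def first_to_expire(layers):
    """Return the (name, seconds) pair with the smallest timeout."""
    return min(layers.items(), key=lambda item: item[1])


for method in ("GET", "POST"):
    app = appserver_layers(method)
    direct = first_to_expire(app)                        # assumed internal client
    cached = first_to_expire({**CACHING_LAYERS, **app})  # assumed path via caches
    print(method, "direct:", direct, "via caches:", cached)
# GET direct: ('Excimer', 60) via caches: ('Excimer', 60)
# POST direct: ('Excimer', 200) via caches: ('varnish-backend', 63)
```

Under these assumptions, only a POST sent directly to the appserver can use the full 200 seconds. The same POST arriving through the caches would be limited by the 63-second varnish-backend TTFB.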