From leonickel at gmail.com Tue Feb 5 21:24:09 2013
From: leonickel at gmail.com (Leonardo Nickel)
Date: Tue, 5 Feb 2013 19:24:09 -0200
Subject: [Varnish 3.0.2 + Chrome] Failed requests even with 200 OK status response
Message-ID:

Hi guys,

This is my first post in this group. Thanks for the community.

I'm facing a weird problem with Varnish 3.0.2 and Google Chrome. While
doing several requests, some of them show up as "failed" despite a
"200 OK" status response.

In the "Developer tools" we see "200 OK" with correct request and
response headers. It seems the image was retrieved correctly from the
Varnish cache, but for no apparent reason Chrome treats the request as
"failed". I'm sending an attachment to give you an example of that.

Further information:
- I'm using connection keep-alive with timeout=20, max=15
- I'm passing the Content-Length header from my backend, so this seems
  not to be the problem

During my requests I can see the following lines in varnishlog:

10 Debug - "Hit send timeout, wrote = 20272/175697; retrying"
10 Debug - "Write error, retval = -1, len = 155425, errno = Resource temporarily unavailable"

Searching the web, I've found a couple of threads attributing this
problem to "varnish trying to send a response for a closed connection
by the browser". But we are using keep-alive connections, so I got
confused on this point.

Is there any Varnish configuration I'm missing? Or should I maybe pass
some additional header for Chrome?

Thanks in advance for the help!

-- 
Leonardo Nickel
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample-request.png
Type: image/png
Size: 138898 bytes
Desc: not available
URL:

From phk at phk.freebsd.dk Wed Feb 6 15:21:19 2013
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 06 Feb 2013 15:21:19 +0000
Subject: VMOD objects
Message-ID: <54201.1360164079@critter.freebsd.dk>

We talked about some VCL improvements at the Dev-meeting in CPH and
having thought about it, I realized that it would actually be pretty
trivial to implement one of the more powerful things, vmod objects,
so this I have started doing.

The basic idea is that you can create objects in a VMOD's init function,
which can be called in your VCL program, with the poster-boy example
looking somewhat like:

	import dx_director;

	sub vcl_init {
		dx1 = dx_director(1712);
		dx1.add_backend(b1, 92);
		dx1.add_backend(b2, 112);
		dx1.add_backend(b4, 7);

		dx2 = dx_director(1913);
		dx2.add_backend(b1, 23);
		dx2.add_backend(b2, 27);
		dx2.add_backend(b3, 19);
	}

	sub vcl_recv {
		if (something...) {
			set req.backend = dx1.select_backend();
		} else {
			set req.backend = dx2.select_backend();
		}
	}

In the vmod's .vcc file, this would be declared something like:

	Object dx_director(REAL) {
		Method VOID .add_backend(BACKEND, REAL)
		Method BACKEND .select_backend(VOID)
	}

I've done the first part, the .vcc file compiler, now I just need
the actual VCL compiler, and you can start to play with this.

I should warn you, that one very likely outcome of this, is that
_all_ directors gets expelled into VMODs in Varnish4.

-- 
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
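
For illustration only, here is a minimal C sketch of the kind of state a
director object like dx_director might keep behind the VCL calls in the
message above. Everything in it is an assumption made for the example:
the struct and function names are hypothetical and are not Varnish APIs,
backends are reduced to plain names, the second add_backend() argument
is read as a weight, and the constructor argument is merely stored
because the post does not say what it means. select_backend() also takes
the random draw as a parameter, unlike the VOID signature in the .vcc
declaration, so that the result is reproducible.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: none of these names exist in Varnish.
 * A "backend" is just a name, picked in proportion to its weight.
 */
#define DX_MAX_BACKENDS	8

struct dx_backend {
	const char	*name;
	double		weight;
};

struct dx_director {
	double			arg;	/* constructor argument, meaning unspecified */
	size_t			n;	/* number of backends added so far */
	double			total;	/* sum of all weights */
	struct dx_backend	be[DX_MAX_BACKENDS];
};

/* Plays the role of "dx1 = dx_director(1712);" in vcl_init. */
void
dx_director_init(struct dx_director *dx, double arg)
{
	assert(dx != NULL);
	dx->arg = arg;
	dx->n = 0;
	dx->total = 0.0;
}

/* Plays the role of "dx1.add_backend(b1, 92);". */
void
dx_add_backend(struct dx_director *dx, const char *name, double weight)
{
	assert(dx != NULL && name != NULL);
	assert(weight > 0.0);
	assert(dx->n < DX_MAX_BACKENDS);
	dx->be[dx->n].name = name;
	dx->be[dx->n].weight = weight;
	dx->n++;
	dx->total += weight;
}

/*
 * Plays the role of "dx1.select_backend()".  The caller passes a draw
 * r in [0, 1); the backend whose cumulative weight range contains
 * r * total is returned.
 */
const char *
dx_select_backend(const struct dx_director *dx, double r)
{
	double acc = 0.0;
	size_t i;

	assert(dx != NULL && dx->n > 0);
	assert(r >= 0.0 && r < 1.0);
	for (i = 0; i < dx->n; i++) {
		acc += dx->be[i].weight;
		if (r * dx->total < acc)
			return (dx->be[i].name);
	}
	/* Guard against rounding at the upper edge. */
	return (dx->be[dx->n - 1].name);
}
```

With the dx1 weights from the example (92, 112 and 7), a draw of 0.5
maps to b2 under this reading, since 0.5 * 211 = 105.5 falls between 92
and 204.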
From slink at schokola.de Thu Feb 7 10:20:14 2013
From: slink at schokola.de (Nils Goroll)
Date: Thu, 07 Feb 2013 11:20:14 +0100
Subject: VMOD objects
In-Reply-To: <54201.1360164079@critter.freebsd.dk>
References: <54201.1360164079@critter.freebsd.dk>
Message-ID: <51137FDE.8000404@schokola.de>

On 02/ 6/13 04:21 PM, Poul-Henning Kamp wrote:
> I should warn you, that one very likely outcome of this, is that
> _all_ directors gets expelled into VMODs in Varnish4.

YEAH! I love this kind of warning :)

phk, this sounds like a _very_ powerful approach with the potential to
pave the path for much more powerful VMODs.

Hope to find some time soon to work on a varnish4 vmod.

Nils

From slink at schokola.de Thu Feb 7 10:22:54 2013
From: slink at schokola.de (Nils Goroll)
Date: Thu, 07 Feb 2013 11:22:54 +0100
Subject: etag handling / entity modifications
Message-ID: <5113807E.50708@schokola.de>

Based on what we discussed on Monday, I have tried to make up my mind
about what the VCL interface for proper gzip/gunzip handling with
respect to ETags and Vary could look like.

Reviews welcome for https://www.varnish-cache.org/trac/wiki/ETags

Nils

From martin at varnish-software.com Thu Feb 7 16:21:27 2013
From: martin at varnish-software.com (Martin Blix Grydeland)
Date: Thu, 7 Feb 2013 17:21:27 +0100
Subject: Reliably detecting clients that has hung up after waitinglist
Message-ID:

Hi,

We have an open ticket #1252, that I am in the process of creating a
patch to fix. The problem here is with regard to waitinglists and
failure mode.

When a slow backend times out, the requesting client will get its 503
page and go away, scheduling the waitinglist in the process. One of
these parked sessions will then retry, and the rest go back on the
waitinglist. The failing backend doesn't recover, so the same happens
on every backend attempt.
As there isn't any client communication going on here, Varnish never
notices if the client (very likely after n * first_byte_timeout) has
given up and gone away. For a popular page the number of sessions coming
into the waitinglist (the page's hitrate) is going to be much higher
than the number leaving (only one every first_byte_timeout). The end
result is running out of file descriptors as observed in ticket #1252,
or hitting session_max as we have observed in a $customer case.

The solution to this was first thought to be simple. When a session
comes off the waitinglist, check to see if the connection has been
closed. If it has, drop the session. Only this turned out to be quite
hard to do.

The normal TCP EOF checking can't really be done due to HTTP pipelining.
If we try to read data from the socket, we have to store the data away
somewhere for reuse, and if the next request is an HTTP POST with a
large body, we'll run out of httpconn pipeline buffer space. Doing
recv(2) with MSG_PEEK won't do for the same reason, as we'll only check
to see if the OS buffer has anything there, and if there's pipelining in
play we'll get a false positive.

Doing a poll() on the socket was attempted next, with POLLHUP
tantalizingly seeming the perfect state. Unfortunately POLLHUP will only
be signaled when both ends agree on the connection being closed, and in
this case it's only the client that has closed the connection (FIN
received by the server, but no FIN is sent the other way until the
socket is closed on the server side). The poll() calls thus returned
only POLLIN and POLLOUT.

The Linux-specific POLLRDHUP could be used to detect the FIN from the
client, giving reliable results that the client has indeed closed its
end of the duplex connection. But on closer thought and a lot of
googling, I can't find anything about HTTP clients not being allowed to
half-close the TCP connection after sending the request and still expect
to read the reply.
On the contrary, exotic clients did turn up that do exactly that. So
this also turned out not to be a solution.

What we really want to know in this case is not whether the client will
be sending us any more data, but whether the client is still going to
accept our response when(/if) we are able to deliver it. So it is
writing data that we need to test. But the HTTP protocol isn't going to
allow us to write something just to test if that'll result in TCP RSTs.

So the next thing I have tested is the SO_KEEPALIVE option on the
socket. This will send periodic messages to the client, that it will
have to ACK, which should eventually allow the server TCP stack to learn
that there are no clients on the other end. On FreeBSD SO_KEEPALIVE is
on by default I believe, but on Linux it isn't. So with SO_KEEPALIVE on,
the client would send a RST after the socket left FIN_WAIT2 state (60s),
which closed it on the server side too, allowing Varnish to get POLLHUP
state and kill off the session!

Reading up on SO_KEEPALIVE, there are some caveats though. By default
both Linux and FreeBSD won't start sending keep-alives until the
connection has been idle for 7200 seconds (2 hours?!). And then only
after 9 unsuccessful probes spaced 75 seconds apart will it kill the
connection. So it will by default take 2 hours 11 minutes for this to
trigger. For a webserver these defaults seem too large and should be
lowered. But for the problem at hand I believe the default values will
still help, as this problem doesn't happen instantly but builds over
time when dealing with unresponsive backends. On Linux it is possible to
specify the keep-alive idle timer and the probe count/interval per
connection using custom socket options.

So as a final solution to this problem, I suggest always setting
SO_KEEPALIVE on our sockets. Then add parameter settings for the
keep-alive idle timer, the count and the interval, with better defaults
for a webserver (or maybe just setting the idle time to
sess_timeout / 2?).
(Docfix FreeBSD / other platforms that can't set the keep-alive values
per socket). And finally, make Varnish poll() the client socket on
return from the waitinglist and kill the session on POLLHUP.

Any comments on this plan of action would be most welcome, as would any
clever ideas that I might have missed for detecting this situation in
the first place.

Regards,
Martin Blix Grydeland

-- 
*Martin Blix Grydeland*
Senior Developer | Varnish Software AS
Cell: +47 21 98 92 60
We Make Websites Fly!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From martin at varnish-software.com Mon Feb 11 14:09:57 2013
From: martin at varnish-software.com (Martin Blix Grydeland)
Date: Mon, 11 Feb 2013 15:09:57 +0100
Subject: [PATCH 1/2] Turn on SO_KEEPALIVE on all TCP connections.
Message-ID: <1360591798-25870-1-git-send-email-martin@varnish-software.com>

This will help in determining remote hang up of the connection for
situations where we still are not able to send any reply, but freeing
the session will reduce resource overhead (e.g. when staying on
waitinglists for extended periods).

On platforms that support it also add runtime parameters to control
the keep-alive packet settings through socket options. On platforms
that don't support these socket options, the values must be set system
wide.
---
 bin/varnishd/cache/cache_acceptor.c       | 116 +++++++++++++++++++++++++++++
 bin/varnishd/common/params.h              |   5 ++
 bin/varnishd/mgt/mgt_param_tbl.c          |  20 +++++
 configure.ac                              |  32 ++++++++
 doc/sphinx/installation/platformnotes.rst |  15 ++++
 5 files changed, 188 insertions(+)

diff --git a/bin/varnishd/cache/cache_acceptor.c b/bin/varnishd/cache/cache_acceptor.c
index 62209a5..4e17dfb 100644
--- a/bin/varnishd/cache/cache_acceptor.c
+++ b/bin/varnishd/cache/cache_acceptor.c
@@ -70,8 +70,23 @@ static const struct linger linger = {
 	.l_onoff = 0,
 };
 
+/*
+ * We turn on keepalives by default to assist in detecting clients that have
+ * hung up on connections returning from waitinglists
+ */
+static const int keepalive = 1;
+
 static unsigned char need_sndtimeo, need_rcvtimeo, need_linger, need_test,
 	need_tcpnodelay;
+static unsigned char need_keepalive = 0;
+#ifdef TCP_KEEP_WORKS
+static unsigned char need_ka_time = 0;
+static unsigned char need_ka_probes = 0;
+static unsigned char need_ka_intvl = 0;
+static int ka_time = 0;
+static int ka_probes = 0;
+static int ka_intvl = 0;
+#endif
 
 /*--------------------------------------------------------------------
  * Some kernels have bugs/limitations with respect to which options are
@@ -83,6 +98,10 @@ static void
 sock_test(int fd)
 {
 	struct linger lin;
+	int tka;
+#ifdef TCP_KEEP_WORKS
+	int tka_time, tka_probes, tka_intvl;
+#endif
 	struct timeval tv;
 	socklen_t l;
 	int i, tcp_nodelay;
@@ -97,6 +116,48 @@ sock_test(int fd)
 	if (memcmp(&lin, &linger, l))
 		need_linger = 1;
 
+	l = sizeof tka;
+	i = getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &tka, &l);
+	if (i) {
+		VTCP_Assert(i);
+		return;
+	}
+	assert(l == sizeof tka);
+	if (tka != keepalive)
+		need_keepalive = 1;
+
+#ifdef TCP_KEEP_WORKS
+	l = sizeof tka_time;
+	i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &tka_time, &l);
+	if (i) {
+		VTCP_Assert(i);
+		return;
+	}
+	assert(l == sizeof tka_time);
+	if (tka_time != ka_time)
+		need_ka_time = 1;
+
+	l = sizeof tka_probes;
+	i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &tka_probes, &l);
+	if (i) {
+		VTCP_Assert(i);
+		return;
+	}
+	assert(l == sizeof tka_probes);
+	if (tka_probes != ka_probes)
+		need_ka_probes = 1;
+
+	l = sizeof tka_intvl;
+	i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &tka_intvl, &l);
+	if (i) {
+		VTCP_Assert(i);
+		return;
+	}
+	assert(l == sizeof tka_intvl);
+	if (tka_intvl != ka_intvl)
+		need_ka_intvl = 1;
+#endif
+
 #ifdef SO_SNDTIMEO_WORKS
 	l = sizeof tv;
 	i = getsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, &l);
@@ -281,6 +342,22 @@ VCA_SetupSess(struct worker *wrk, struct sess *sp)
 	if (need_linger)
 		VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_LINGER,
 		    &linger, sizeof linger));
+	if (need_keepalive)
+		VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_KEEPALIVE,
+		    &keepalive, sizeof keepalive));
+#ifdef TCP_KEEP_WORKS
+	AN(ka_time);
+	if (need_ka_time)
+		VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPIDLE,
+		    &ka_time, sizeof ka_time));
+	if (need_ka_probes)
+		VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPCNT,
+		    &ka_probes, sizeof ka_probes));
+	if (need_ka_intvl)
+		VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPINTVL,
+		    &ka_intvl, sizeof ka_intvl));
+#endif
+
 #ifdef SO_SNDTIMEO_WORKS
 	if (need_sndtimeo)
 		VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_SNDTIMEO,
@@ -316,6 +393,12 @@ vca_acct(void *arg)
 	THR_SetName("cache-acceptor");
 	(void)arg;
 
+#ifdef TCP_KEEP_WORKS
+	ka_time = cache_param->tcp_keepalive_time;
+	ka_probes = cache_param->tcp_keepalive_probes;
+	ka_intvl = cache_param->tcp_keepalive_intvl;
+#endif
+
 	VTAILQ_FOREACH(ls, &heritage.socks, list) {
 		if (ls->sock < 0)
 			continue;
@@ -324,6 +407,16 @@ vca_acct(void *arg)
 		    &linger, sizeof linger));
 		AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_NODELAY,
 		    &tcp_nodelay, sizeof tcp_nodelay));
+		AZ(setsockopt(ls->sock, SOL_SOCKET, SO_KEEPALIVE,
+		    &keepalive, sizeof keepalive));
+#ifdef TCP_KEEP_WORKS
+		AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPIDLE,
+		    &ka_time, sizeof ka_time));
+		AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPCNT,
+		    &ka_probes, sizeof ka_probes));
+		AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPINTVL,
+		    &ka_intvl, sizeof ka_intvl));
+#endif
 		if (cache_param->accept_filter) {
 			i = VTCP_filter_http(ls->sock);
 			if (i)
@@ -339,6 +432,29 @@ vca_acct(void *arg)
 	t0 = VTIM_real();
 	while (1) {
 		(void)sleep(1);
+#ifdef TCP_KEEP_WORKS
+		if (cache_param->tcp_keepalive_time != ka_time ||
+		    cache_param->tcp_keepalive_probes != ka_probes ||
+		    cache_param->tcp_keepalive_intvl != ka_intvl) {
+			need_test = 1;
+			ka_time = cache_param->tcp_keepalive_time;
+			ka_probes = cache_param->tcp_keepalive_probes;
+			ka_intvl = cache_param->tcp_keepalive_intvl;
+			VTAILQ_FOREACH(ls, &heritage.socks, list) {
+				if (ls->sock < 0)
+					continue;
+				AZ(setsockopt(ls->sock, IPPROTO_TCP,
+				    TCP_KEEPIDLE,
+				    &ka_time, sizeof ka_time));
+				AZ(setsockopt(ls->sock, IPPROTO_TCP,
+				    TCP_KEEPCNT,
+				    &ka_probes, sizeof ka_probes));
+				AZ(setsockopt(ls->sock, IPPROTO_TCP,
+				    TCP_KEEPINTVL,
+				    &ka_intvl, sizeof ka_intvl));
+			}
+		}
+#endif
 #ifdef SO_SNDTIMEO_WORKS
 		if (cache_param->idle_send_timeout != send_timeout) {
 			need_test = 1;
diff --git a/bin/varnishd/common/params.h b/bin/varnishd/common/params.h
index a6e881b..6893461 100644
--- a/bin/varnishd/common/params.h
+++ b/bin/varnishd/common/params.h
@@ -110,6 +110,11 @@ struct params {
 	unsigned pipe_timeout;
 	unsigned send_timeout;
 	unsigned idle_send_timeout;
+#ifdef TCP_KEEP_WORKS
+	unsigned tcp_keepalive_time;
+	unsigned tcp_keepalive_probes;
+	unsigned tcp_keepalive_intvl;
+#endif
 
 	/* Management hints */
 	unsigned auto_restart;
diff --git a/bin/varnishd/mgt/mgt_param_tbl.c b/bin/varnishd/mgt/mgt_param_tbl.c
index 8601bae..0380a02 100644
--- a/bin/varnishd/mgt/mgt_param_tbl.c
+++ b/bin/varnishd/mgt/mgt_param_tbl.c
@@ -205,6 +205,26 @@ const struct parspec mgt_parspec[] = {
 		"See setsockopt(2) under SO_SNDTIMEO for more information.",
 		DELAYED_EFFECT,
 		"60", "seconds" },
+#ifdef TCP_KEEP_WORKS
+	{ "tcp_keepalive_time", tweak_timeout, &mgt_param.tcp_keepalive_time,
+		1, 7200,
+		"The number of seconds a connection needs to be idle before "
+		"TCP begins sending out keep-alive probes.",
+		0,
+		"600", "seconds" },
+	{ "tcp_keepalive_probes", tweak_uint, &mgt_param.tcp_keepalive_probes,
+		1, 100,
+		"The maximum number of TCP keep-alive probes to send before "
+		"giving up and killing the connection if no response is "
+		"obtained from the other end.",
+		0,
+		"5", "probes" },
+	{ "tcp_keepalive_intvl", tweak_timeout, &mgt_param.tcp_keepalive_intvl,
+		1, 100,
+		"The number of seconds between TCP keep-alive probes.",
+		0,
+		"5", "seconds" },
+#endif
 	{ "auto_restart", tweak_bool, &mgt_param.auto_restart, 0, 0,
 		"Restart child process automatically if it dies.\n",
 		0,
diff --git a/configure.ac b/configure.ac
index a4cd8e8..6613980 100644
--- a/configure.ac
+++ b/configure.ac
@@ -423,6 +423,38 @@ if test "$ac_cv_so_rcvtimeo_works" = no ||
 fi
 LIBS="${save_LIBS}"
 
+# Check if the OS supports TCP_KEEP(CNT|IDLE|INTVL) socket options
+save_LIBS="${LIBS}"
+LIBS="${LIBS} ${NET_LIBS}"
+AC_CACHE_CHECK([for TCP_KEEP(CNT|IDLE|INTVL) socket options],
+  [ac_cv_tcp_keep_works],
+  [AC_RUN_IFELSE(
+    [AC_LANG_PROGRAM([[
+#include
+#include
+#include
+#include
+#include
+    ]],[[
+int s = socket(AF_INET, SOCK_STREAM, 0);
+int i;
+i = 5;
+if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &i, sizeof i))
+  return (1);
+if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &i, sizeof i))
+  return (1);
+if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &i, sizeof i))
+  return (1);
+return (0);
+    ]])],
+    [ac_cv_tcp_keep_works=yes],
+    [ac_cv_tcp_keep_works=no])
+  ])
+if test "$ac_cv_tcp_keep_works" = yes; then
+   AC_DEFINE([TCP_KEEP_WORKS], [1], [Define if TCP_KEEP* works])
+fi
+LIBS="${save_LIBS}"
+
 # Run-time directory
 VARNISH_STATE_DIR='${localstatedir}/varnish'
 AC_SUBST(VARNISH_STATE_DIR)
diff --git a/doc/sphinx/installation/platformnotes.rst b/doc/sphinx/installation/platformnotes.rst
index 3ad486c..e1720b6 100644
--- a/doc/sphinx/installation/platformnotes.rst
+++ b/doc/sphinx/installation/platformnotes.rst
@@ -35,3 +35,18 @@ Reduce the maximum stack size by running::
 
 in the Varnish startup script.
 
+TCP keep-alive configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On platforms except Linux, Varnish is not able to set the TCP
+keep-alive values per socket, and therefor the tcp_keepalive_* Varnish
+runtime parameters are not available. On these platforms it can be
+benefitial to tune the system wide values for these in order to more
+reliably detect remote close for sessions spending long time on
+waitinglists. This will help free up resources faster.
+
+On Linux the defaults are set to:
+
+	tcp_keepalive_time = 600 seconds
+	tcp_keepalive_probes = 5
+	tcp_keepalive_intvl = 5 seconds
-- 
1.7.10.4

From martin at varnish-software.com Mon Feb 11 14:09:58 2013
From: martin at varnish-software.com (Martin Blix Grydeland)
Date: Mon, 11 Feb 2013 15:09:58 +0100
Subject: [PATCH 2/2] On return from waitinglist, do a poll() on the socket to see if the client has closed the connection and gone away. If so, release the session early.
In-Reply-To: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
References: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
Message-ID: <1360591798-25870-2-git-send-email-martin@varnish-software.com>

Fixes: #1252
---
 bin/varnishd/cache/cache_hash.c           | 16 +++++++++++
 bin/varnishd/cache/cache_http1_fsm.c      | 14 +++++++++
 bin/varnishd/hash/hash_slinger.h          |  1 +
 bin/varnishtest/tests.disabled/r01252.vtc | 44 +++++++++++++++++++++++++++++
 include/vtcp.h                            |  1 +
 lib/libvarnish/vtcp.c                     | 19 +++++++++++++
 6 files changed, 95 insertions(+)
 create mode 100644 bin/varnishtest/tests.disabled/r01252.vtc

diff --git a/bin/varnishd/cache/cache_hash.c b/bin/varnishd/cache/cache_hash.c
index d0d987e..c73bf5c 100644
--- a/bin/varnishd/cache/cache_hash.c
+++ b/bin/varnishd/cache/cache_hash.c
@@ -702,6 +702,22 @@ HSH_Deref(struct dstat *ds, struct objcore *oc, struct object **oo)
 }
 
 void
+HSH_DerefObjHead(struct dstat *ds, struct objhead **poh)
+{
+	struct objhead *oh;
+
+	AN(ds);
+	AN(poh);
+	oh = *poh;
+	*poh = NULL;
+	CHECK_OBJ_NOTNULL(oh, OBJHEAD_MAGIC);
+
+	assert(oh->refcnt > 0);
+	if (!hash->deref(oh))
+		HSH_DeleteObjHead(ds, oh);
+}
+
+void
 HSH_Init(const struct hash_slinger *slinger)
 {
 
diff --git a/bin/varnishd/cache/cache_http1_fsm.c b/bin/varnishd/cache/cache_http1_fsm.c
index 2f99c6e..3950151 100644
--- a/bin/varnishd/cache/cache_http1_fsm.c
+++ b/bin/varnishd/cache/cache_http1_fsm.c
@@ -74,6 +74,7 @@
 #include
 
 #include "cache.h"
+#include "hash/hash_slinger.h"
 #include "vcl.h"
 #include "vct.h"
 
@@ -329,6 +330,19 @@ HTTP1_Session(struct worker *wrk, struct req *req)
 		return;
 	}
 
+	/*
+	 * Return from waitinglist. Check to see if the remote has left.
+	 */
+	if (req->req_step == R_STP_LOOKUP && VTCP_check_hup(sp->fd)) {
+		AN(req->hash_objhead);
+		HSH_DerefObjHead(&wrk->stats, &req->hash_objhead);
+		AZ(req->hash_objhead);
+		SES_Close(sp, SC_REM_CLOSE);
+		sdr = http1_cleanup(sp, wrk, req);
+		assert(sdr == SESS_DONE_RET_GONE);
+		return;
+	}
+
 	if (sp->sess_step == S_STP_NEWREQ) {
 		HTTP1_Init(req->htc, req->ws, sp->fd, req->vsl,
 		    cache_param->http_req_size,
diff --git a/bin/varnishd/hash/hash_slinger.h b/bin/varnishd/hash/hash_slinger.h
index c385ea6..856252d 100644
--- a/bin/varnishd/hash/hash_slinger.h
+++ b/bin/varnishd/hash/hash_slinger.h
@@ -98,6 +98,7 @@ struct objhead {
 void HSH_Unbusy(struct dstat *, struct objcore *);
 void HSH_Complete(struct objcore *oc);
 void HSH_DeleteObjHead(struct dstat *, struct objhead *oh);
+void HSH_DerefObjHead(struct dstat *, struct objhead **poh);
 int HSH_Deref(struct dstat *, struct objcore *oc, struct object **o);
 #endif /* VARNISH_CACHE_CHILD */
 
diff --git a/bin/varnishtest/tests.disabled/r01252.vtc b/bin/varnishtest/tests.disabled/r01252.vtc
new file mode 100644
index 0000000..78d0b23
--- /dev/null
+++ b/bin/varnishtest/tests.disabled/r01252.vtc
@@ -0,0 +1,44 @@
+varnishtest "#1252 - Drop remote closed connections returning from waitinglists"
+
+# This test case is disabled because it will only pass on Linux (needs
+# tcp_keepalive runtime arguments only available on Linux), and also
+# because it requires "-t 75" argument to varnishtest (remote closed
+# state will only be detected after FIN timeout has passed (60s))
+
+server s1 {
+	rxreq
+	expect req.http.X-Client == "1"
+	sema r1 sync 2
+	delay 75
+	close
+} -start
+
+server s2 {
+	rxreq
+	expect req.url == "/should/not/happen"
+	txresp
+} -start
+
+varnish v1 -arg "-p debug=+waitinglist -p tcp_keepalive_time=1s -p tcp_keepalive_probes=1 -p tcp_keepalive_intvl=1s -p first_byte_timeout=70" -vcl+backend {
+	sub vcl_recv {
+		if (req.http.x-client == "2") {
+			set req.backend = s2;
+		}
+	}
+} -start
+
+client c1 {
+	timeout 70
+	txreq -hdr "X-Client: 1"
+	rxresp
+	expect resp.status == 503
+} -start
+
+client c2 {
+	sema r1 sync 2
+	txreq -hdr "X-Client: 2"
+	delay 1
+} -start
+
+client c1 -wait
+client c2 -wait
diff --git a/include/vtcp.h b/include/vtcp.h
index 77f86ed..1594a4d 100644
--- a/include/vtcp.h
+++ b/include/vtcp.h
@@ -62,6 +62,7 @@ int VTCP_filter_http(int sock);
 int VTCP_blocking(int sock);
 int VTCP_nonblocking(int sock);
 int VTCP_linger(int sock, int linger);
+int VTCP_check_hup(int sock);
 #ifdef SOL_SOCKET
 int VTCP_port(const struct sockaddr_storage *addr);
diff --git a/lib/libvarnish/vtcp.c b/lib/libvarnish/vtcp.c
index 2c6dd3f..7227fb3 100644
--- a/lib/libvarnish/vtcp.c
+++ b/lib/libvarnish/vtcp.c
@@ -308,3 +308,22 @@ VTCP_linger(int sock, int linger)
 	VTCP_Assert(i);
 	return (i);
 }
+
+/*--------------------------------------------------------------------
+ * Do a poll to check for remote HUP
+ */
+
+int
+VTCP_check_hup(int sock)
+{
+	struct pollfd pfd;
+
+	assert(sock > 0);
+	pfd.fd = sock;
+	pfd.events = POLLOUT;
+	pfd.revents = 0;
+
+	if (poll(&pfd, 1, 0) == 1 && pfd.revents & POLLHUP)
+		return (1);
+	return (0);
+}
-- 
1.7.10.4

From andrea.campi at zephirworks.com Mon Feb 11 14:20:57 2013
From: andrea.campi at zephirworks.com (Andrea Campi)
Date: Mon, 11 Feb 2013 15:20:57 +0100
Subject: [PATCH 1/2] Turn on SO_KEEPALIVE on all TCP connections.
In-Reply-To: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
References: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
Message-ID:

FreeBSD seems to also have these sockets options since this commit:

http://svnweb.freebsd.org/base?view=revision&revision=232945

At a quick glance, this went in for FreeBSD 9.1.


On Mon, Feb 11, 2013 at 3:09 PM, Martin Blix Grydeland <
martin at varnish-software.com> wrote:

> This will help in determining remote hang up of the connection for
> situations where we still are not able to send any reply, but freeing
> the session will reduce resource overhead (e.g.
> when staying on waitinglists for extended periods).
>
> On platforms that support it also add runtime parameters to control
> the keep-alive packet settings through socket options. On platforms
> that don't support these socket options, the values must be set system
> wide.
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From martin at varnish-software.com Mon Feb 11 14:50:44 2013
From: martin at varnish-software.com (Martin Blix Grydeland)
Date: Mon, 11 Feb 2013 15:50:44 +0100
Subject: [PATCH 1/2] Turn on SO_KEEPALIVE on all TCP connections.
In-Reply-To:
References: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
Message-ID:

Ah, I wasn't aware of that. Thanks Andrea, I will update the docs to reflect
that.

-Martin


On Mon, Feb 11, 2013 at 3:20 PM, Andrea Campi wrote:

> FreeBSD seems to also have these socket options since this commit:
>
> http://svnweb.freebsd.org/base?view=revision&revision=232945
>
> At a quick glance, this went in for FreeBSD 9.1.
>
>
> On Mon, Feb 11, 2013 at 3:09 PM, Martin Blix Grydeland <
> martin at varnish-software.com> wrote:
>
>> This will help in determining remote hang up of the connection for
>> situations where we still are not able to send any reply, but freeing
>> the session will reduce resource overhead (e.g. when staying on
>> waitinglists for extended periods).
>>
>> On platforms that support it also add runtime parameters to control
>> the keep-alive packet settings through socket options. On platforms
>> that don't support these socket options, the values must be set system
>> wide.
>> --- >> bin/varnishd/cache/cache_acceptor.c | 116 >> +++++++++++++++++++++++++++++ >> bin/varnishd/common/params.h | 5 ++ >> bin/varnishd/mgt/mgt_param_tbl.c | 20 +++++ >> configure.ac | 32 ++++++++ >> doc/sphinx/installation/platformnotes.rst | 15 ++++ >> 5 files changed, 188 insertions(+) >> >> diff --git a/bin/varnishd/cache/cache_acceptor.c >> b/bin/varnishd/cache/cache_acceptor.c >> index 62209a5..4e17dfb 100644 >> --- a/bin/varnishd/cache/cache_acceptor.c >> +++ b/bin/varnishd/cache/cache_acceptor.c >> @@ -70,8 +70,23 @@ static const struct linger linger = { >> .l_onoff = 0, >> }; >> >> +/* >> + * We turn on keepalives by default to assist in detecting clients that >> have >> + * hung up on connections returning from waitinglists >> + */ >> +static const int keepalive = 1; >> + >> static unsigned char need_sndtimeo, need_rcvtimeo, need_linger, >> need_test, >> need_tcpnodelay; >> +static unsigned char need_keepalive = 0; >> +#ifdef TCP_KEEP_WORKS >> +static unsigned char need_ka_time = 0; >> +static unsigned char need_ka_probes = 0; >> +static unsigned char need_ka_intvl = 0; >> +static int ka_time = 0; >> +static int ka_probes = 0; >> +static int ka_intvl = 0; >> +#endif >> >> /*-------------------------------------------------------------------- >> * Some kernels have bugs/limitations with respect to which options are >> @@ -83,6 +98,10 @@ static void >> sock_test(int fd) >> { >> struct linger lin; >> + int tka; >> +#ifdef TCP_KEEP_WORKS >> + int tka_time, tka_probes, tka_intvl; >> +#endif >> struct timeval tv; >> socklen_t l; >> int i, tcp_nodelay; >> @@ -97,6 +116,48 @@ sock_test(int fd) >> if (memcmp(&lin, &linger, l)) >> need_linger = 1; >> >> + l = sizeof tka; >> + i = getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &tka, &l); >> + if (i) { >> + VTCP_Assert(i); >> + return; >> + } >> + assert(l == sizeof tka); >> + if (tka != keepalive) >> + need_keepalive = 1; >> + >> +#ifdef TCP_KEEP_WORKS >> + l = sizeof tka_time; >> + i = getsockopt(fd, IPPROTO_TCP, 
TCP_KEEPIDLE, &tka_time, &l); >> + if (i) { >> + VTCP_Assert(i); >> + return; >> + } >> + assert(l == sizeof tka_time); >> + if (tka_time != ka_time) >> + need_ka_time = 1; >> + >> + l = sizeof tka_probes; >> + i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &tka_probes, &l); >> + if (i) { >> + VTCP_Assert(i); >> + return; >> + } >> + assert(l == sizeof tka_probes); >> + if (tka_probes != ka_probes) >> + need_ka_probes = 1; >> + >> + l = sizeof tka_intvl; >> + i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &tka_intvl, &l); >> + if (i) { >> + VTCP_Assert(i); >> + return; >> + } >> + assert(l == sizeof tka_intvl); >> + if (tka_intvl != ka_intvl) >> + need_ka_intvl = 1; >> +#endif >> + >> #ifdef SO_SNDTIMEO_WORKS >> l = sizeof tv; >> i = getsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, &l); >> @@ -281,6 +342,22 @@ VCA_SetupSess(struct worker *wrk, struct sess *sp) >> if (need_linger) >> VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_LINGER, >> &linger, sizeof linger)); >> + if (need_keepalive) >> + VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_KEEPALIVE, >> + &keepalive, sizeof keepalive)); >> +#ifdef TCP_KEEP_WORKS >> + AN(ka_time); >> + if (need_ka_time) >> + VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPIDLE, >> + &ka_time, sizeof ka_time)); >> + if (need_ka_probes) >> + VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPCNT, >> + &ka_probes, sizeof ka_probes)); >> + if (need_ka_intvl) >> + VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPINTVL, >> + &ka_intvl, sizeof ka_intvl)); >> +#endif >> + >> #ifdef SO_SNDTIMEO_WORKS >> if (need_sndtimeo) >> VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_SNDTIMEO, >> @@ -316,6 +393,12 @@ vca_acct(void *arg) >> THR_SetName("cache-acceptor"); >> (void)arg; >> >> +#ifdef TCP_KEEP_WORKS >> + ka_time = cache_param->tcp_keepalive_time; >> + ka_probes = cache_param->tcp_keepalive_probes; >> + ka_intvl = cache_param->tcp_keepalive_intvl; >> +#endif >> + >> VTAILQ_FOREACH(ls, &heritage.socks, list) { >> if (ls->sock < 0) >> continue; 
>> @@ -324,6 +407,16 @@ vca_acct(void *arg) >> &linger, sizeof linger)); >> AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_NODELAY, >> &tcp_nodelay, sizeof tcp_nodelay)); >> + AZ(setsockopt(ls->sock, SOL_SOCKET, SO_KEEPALIVE, >> + &keepalive, sizeof keepalive)); >> +#ifdef TCP_KEEP_WORKS >> + AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPIDLE, >> + &ka_time, sizeof ka_time)); >> + AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPCNT, >> + &ka_probes, sizeof ka_probes)); >> + AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPINTVL, >> + &ka_intvl, sizeof ka_intvl)); >> +#endif >> if (cache_param->accept_filter) { >> i = VTCP_filter_http(ls->sock); >> if (i) >> @@ -339,6 +432,29 @@ vca_acct(void *arg) >> t0 = VTIM_real(); >> while (1) { >> (void)sleep(1); >> +#ifdef TCP_KEEP_WORKS >> + if (cache_param->tcp_keepalive_time != ka_time || >> + cache_param->tcp_keepalive_probes != ka_probes || >> + cache_param->tcp_keepalive_intvl != ka_intvl) { >> + need_test = 1; >> + ka_time = cache_param->tcp_keepalive_time; >> + ka_probes = cache_param->tcp_keepalive_probes; >> + ka_intvl = cache_param->tcp_keepalive_intvl; >> + VTAILQ_FOREACH(ls, &heritage.socks, list) { >> + if (ls->sock < 0) >> + continue; >> + AZ(setsockopt(ls->sock, IPPROTO_TCP, >> + TCP_KEEPIDLE, >> + &ka_time, sizeof ka_time)); >> + AZ(setsockopt(ls->sock, IPPROTO_TCP, >> + TCP_KEEPCNT, >> + &ka_probes, sizeof ka_probes)); >> + AZ(setsockopt(ls->sock, IPPROTO_TCP, >> + TCP_KEEPINTVL, >> + &ka_intvl, sizeof ka_intvl)); >> + } >> + } >> +#endif >> #ifdef SO_SNDTIMEO_WORKS >> if (cache_param->idle_send_timeout != send_timeout) { >> need_test = 1; >> diff --git a/bin/varnishd/common/params.h b/bin/varnishd/common/params.h >> index a6e881b..6893461 100644 >> --- a/bin/varnishd/common/params.h >> +++ b/bin/varnishd/common/params.h >> @@ -110,6 +110,11 @@ struct params { >> unsigned pipe_timeout; >> unsigned send_timeout; >> unsigned idle_send_timeout; >> +#ifdef TCP_KEEP_WORKS >> + unsigned tcp_keepalive_time; >> + unsigned 
tcp_keepalive_probes; >> + unsigned tcp_keepalive_intvl; >> +#endif >> >> /* Management hints */ >> unsigned auto_restart; >> diff --git a/bin/varnishd/mgt/mgt_param_tbl.c >> b/bin/varnishd/mgt/mgt_param_tbl.c >> index 8601bae..0380a02 100644 >> --- a/bin/varnishd/mgt/mgt_param_tbl.c >> +++ b/bin/varnishd/mgt/mgt_param_tbl.c >> @@ -205,6 +205,26 @@ const struct parspec mgt_parspec[] = { >> "See setsockopt(2) under SO_SNDTIMEO for more >> information.", >> DELAYED_EFFECT, >> "60", "seconds" }, >> +#ifdef TCP_KEEP_WORKS >> + { "tcp_keepalive_time", tweak_timeout, >> &mgt_param.tcp_keepalive_time, >> + 1, 7200, >> + "The number of seconds a connection needs to be idle >> before " >> + "TCP begins sending out keep-alive probes.", >> + 0, >> + "600", "seconds" }, >> + { "tcp_keepalive_probes", tweak_uint, >> &mgt_param.tcp_keepalive_probes, >> + 1, 100, >> + "The maximum number of TCP keep-alive probes to send >> before " >> + "giving up and killing the connection if no response is " >> + "obtained from the other end.", >> + 0, >> + "5", "probes" }, >> + { "tcp_keepalive_intvl", tweak_timeout, >> &mgt_param.tcp_keepalive_intvl, >> + 1, 100, >> + "The number of seconds between TCP keep-alive probes.", >> + 0, >> + "5", "seconds" }, >> +#endif >> { "auto_restart", tweak_bool, &mgt_param.auto_restart, 0, 0, >> "Restart child process automatically if it dies.\n", >> 0, >> diff --git a/configure.ac b/configure.ac >> index a4cd8e8..6613980 100644 >> --- a/configure.ac >> +++ b/configure.ac >> @@ -423,6 +423,38 @@ if test "$ac_cv_so_rcvtimeo_works" = no || >> fi >> LIBS="${save_LIBS}" >> >> +# Check if the OS supports TCP_KEEP(CNT|IDLE|INTVL) socket options >> +save_LIBS="${LIBS}" >> +LIBS="${LIBS} ${NET_LIBS}" >> +AC_CACHE_CHECK([for TCP_KEEP(CNT|IDLE|INTVL) socket options], >> + [ac_cv_tcp_keep_works], >> + [AC_RUN_IFELSE( >> + [AC_LANG_PROGRAM([[ >> +#include >> +#include >> +#include >> +#include >> +#include >> + ]],[[ >> +int s = socket(AF_INET, SOCK_STREAM, 0); >> +int 
i; >> +i = 5; >> +if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &i, sizeof i)) >> + return (1); >> +if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &i, sizeof i)) >> + return (1); >> +if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &i, sizeof i)) >> + return (1); >> +return (0); >> + ]])], >> + [ac_cv_tcp_keep_works=yes], >> + [ac_cv_tcp_keep_works=no]) >> + ]) >> +if test "$ac_cv_tcp_keep_works" = yes; then >> + AC_DEFINE([TCP_KEEP_WORKS], [1], [Define if TCP_KEEP* works]) >> +fi >> +LIBS="${save_LIBS}" >> + >> # Run-time directory >> VARNISH_STATE_DIR='${localstatedir}/varnish' >> AC_SUBST(VARNISH_STATE_DIR) >> diff --git a/doc/sphinx/installation/platformnotes.rst >> b/doc/sphinx/installation/platformnotes.rst >> index 3ad486c..e1720b6 100644 >> --- a/doc/sphinx/installation/platformnotes.rst >> +++ b/doc/sphinx/installation/platformnotes.rst >> @@ -35,3 +35,18 @@ Reduce the maximum stack size by running:: >> >> in the Varnish startup script. >> >> +TCP keep-alive configuration >> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + >> +On platforms except Linux, Varnish is not able to set the TCP >> +keep-alive values per socket, and therefor the tcp_keepalive_* Varnish >> +runtime parameters are not available. On these platforms it can be >> +benefitial to tune the system wide values for these in order to more >> +reliably detect remote close for sessions spending long time on >> +waitinglists. This will help free up resources faster. >> + >> +On Linux the defaults are set to: >> + >> + tcp_keepalive_time = 600 seconds >> + tcp_keepalive_probes = 5 >> + tcp_keepalive_intvl = 5 seconds >> -- >> 1.7.10.4 >> >> >> _______________________________________________ >> varnish-dev mailing list >> varnish-dev at varnish-cache.org >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev >> > > -- *Martin Blix Grydeland* Senior Developer | Varnish Software AS Cell: +47 21 98 92 60 We Make Websites Fly! -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From phk at phk.freebsd.dk Thu Feb 14 10:04:18 2013
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 14 Feb 2013 10:04:18 +0000
Subject: [PATCH 1/2] Turn on SO_KEEPALIVE on all TCP connections.
In-Reply-To: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
References: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
Message-ID: <3495.1360836258@critter.freebsd.dk>

Content-Type: text/plain; charset=ISO-8859-1
--------
In message <1360591798-25870-1-git-send-email-martin at varnish-software.com>,
Martin Blix Grydeland writes:

>On platforms that support it also add runtime parameters to control
>the keep-alive packet settings through socket options. On platforms
>that don't support these socket options, the values must be set system
>wide.

I think "TCP_KEEP_WORKS" is misnamed: auto* doesn't actually test if
it works, only that the setsockopt does not fail. It should be
a "HAVE_TCP_KEEP" option.

That said, which operating system does _not_ have it?

If all relevant kernels have it, there is no reason to pollute the
source with #ifdefs and autocrappery.

Poul-Henning

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Thu Feb 14 10:08:08 2013
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 14 Feb 2013 10:08:08 +0000
Subject: [PATCH 2/2] On return from waitinglist, do a poll() on the socket to see if the client has closed the connection and gone away. If so, release the session early.
In-Reply-To: <1360591798-25870-2-git-send-email-martin@varnish-software.com>
References: <1360591798-25870-1-git-send-email-martin@varnish-software.com>
 <1360591798-25870-2-git-send-email-martin@varnish-software.com>
Message-ID: <3522.1360836488@critter.freebsd.dk>

Content-Type: text/plain; charset=ISO-8859-1
--------
In message <1360591798-25870-2-git-send-email-martin at varnish-software.com>,
Martin Blix Grydeland writes:

If you are going to split HSH_DerefObjHead() out to a function,
you should also call that function instead of inlining in HSH_Deref().

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From lkarsten at varnish-software.com Thu Feb 14 15:51:40 2013
From: lkarsten at varnish-software.com (Lasse Karstensen)
Date: Thu, 14 Feb 2013 16:51:40 +0100
Subject: Extending vcl_trace to log VCL value assignments
Message-ID: <20130214155139.GD14555@immer.varnish-software.com>

Hello.

A long standing todo item of ours (VS) is that we should make some sort
of VCL debugger or execution tracer.

Many new users find it difficult to read the verbose output of varnishlog,
and some hand-holding might be in order.

I've been looking at the VCL trace function, which is a varnish parameter
to enable log lines in varnishlog. The output says that the execution is
now on line x char y of the VCL, logged whenever there is a function call.
In our experience, this feature is rarely or never used.

My thinking is that this VCL_trace can be extended to log the previous and
new value whenever an assignment is done in VCL. This would enable us to
write software that takes the VCL and varnishlog as inputs, and lets the
user step through the VCL execution and see what the different
headers/values are at each step.

I think this should be possible to turn on per request in addition to the
Varnish wide parameter it is now.
This enables us to do "online debugging": a Varnish user having trouble
with a distinct URL can input it into the tool and have it run just once
without affecting other requests or site performance.

From a VCL perspective it can look like this:

     5  sub vcl_recv {
     6      if (req.http.x-trace-request == "yes")
     7      {
     8          set req.trace = true;
     9      }
    10      set req.http.x-tmp = "foo";
    11      set req.http.x-tmp = "bar";
    12  }

The obvious drawback is of course that any VCL executed before this won't
get trace information. I think this can be accepted in this case.

In the shmlog the output can be:

    11 RxHeader c Host: localhost
    11 RxHeader c User-Agent: lwp-request/6.03 libwww-perl/6.04
    11 VCL_ValueTrace c 13 10.13 "" "foo"
    11 VCL_ValueTrace c 14 11.13 "foo" "bar"
    11 VCL_call c recv 1 42.5 2 41.5 3 42.9 5 46.13 6 49.5 8 59.5 10 63.5 12 67.5 lookup
    11 VCL_call c hash 15 85.5
    11 Hash c /
    11 VCL_trace c 16 87.9
    11 Hash c localhost

Implementation wise this can be done via a small patch to VRT_SetHdr(),
which checks if tracing is on and writes VCL_ValueTrace to VSL. There must
be a flag somewhere (struct req?) that indicates if this connection/session
has tracing set to true.

I think this should be a fairly quick win: little effort is needed to get
great debugging value out of it. I can prepare a patch if there is
consensus on how to proceed.

Any input is appreciated.

--
With regards,
Lasse Karstensen
Varnish Software AS

From bilbo at hobbiton.org Thu Feb 14 17:29:21 2013
From: bilbo at hobbiton.org (Leif Pedersen)
Date: Thu, 14 Feb 2013 11:29:21 -0600
Subject: Extending vcl_trace to log VCL value assignments
In-Reply-To: <20130214155139.GD14555@immer.varnish-software.com>
References: <20130214155139.GD14555@immer.varnish-software.com>
Message-ID:

Lasse,

This sounds pretty useful. I do a lot of work with VCL and I can usually
track what's happened pretty easily, but when my coworkers have to look
through it, they tend to get lost since their focus is on other parts of
our servers.
This could really help them trace what's going on in the VCL code that I
write.

I have a suggestion. How about emitting an explicit log message when
tracing is enabled or disabled? This way a log processor can quickly
identify a request when it turns on tracing and automatically filter only
those from the log to display.

- Leif

(PS. I'm a long-time user, but until now I've only been lurking on the
list. Hi, all.)


On Thu, Feb 14, 2013 at 9:51 AM, Lasse Karstensen <
lkarsten at varnish-software.com> wrote:

> Hello.
>
> A long standing todo item of ours (VS) is that we should make some sort
> of VCL debugger or execution tracer.
>
> Many new users find it difficult to read the verbose output of varnishlog,
> and
> some hand-holding might be in place.
>
> I've been looking at the VCL trace function, which is a varnish parameter
> to enable
> log lines in varnishlog. The output says that the execution is now on line
> x
> char y of the VCL line, logged whenever there is a function call.
> In our experience, this feature is rarely or never used.
>
> My thinking is that this VCL_trace can be extended to log the previous and
> new
> value whenever an assignment is done in VCL. This would enable us to write
> software that takes the VCL and varnishlog as inputs, and let the user step
> through the VCL execution and see what the different headers/values are at
> each
> step.
>
> I think this should be possible to turn on per request in addition to the
> Varnish wide parameter it is now. This enables us to do "online
> debugging"; a
> Varnish user is having trouble with a distinct URL and can input that into
> the
> tool and have it run just once without affecting other requests or site
> performance.
>
> From a VCL perspective it can look like this:
>
> 5 sub vcl_recv {
> 6 if (req.http.x-trace-request == "yes")
> 7 {
> 8 set req.trace = true;
> 9 }
> 10 set req.http.x-tmp = "foo";
> 11 set req.http.x-tmp = "bar";
> 12 }
>
> The obvious drawback is of course that any VCL executed before this won't
> get
> trace information. I think this can be accepted in this case.
>
> In the shmlog the output can be:
>
> 11 RxHeader c Host: localhost
> 11 RxHeader c User-Agent: lwp-request/6.03 libwww-perl/6.04
> 11 VCL_ValueTrace c 13 10.13 "" "foo"
> 11 VCL_ValueTrace c 14 11.13 "foo" "bar"
> 11 VCL_call c recv 1 42.5 2 41.5 3 42.9 5 46.13 6 49.5 8 59.5 10
> 63.5 12 67.5 lookup
> 11 VCL_call c hash 15 85.5
> 11 Hash c /
> 11 VCL_trace c 16 87.9
> 11 Hash c localhost
>
>
> Implementation wise this can be done via a small patch to VRT_SetHdr(),
> which
> check if tracing is on and writes VCL_ValueTrace to VSL. There must be a
> flag somewhere (struct req?) that indicates if this connection/session has
> tracing set to true.
>
>
> I think this should be a fairly quick win, little effort needed to get
> great
> debugging value out of. I can prepare a patch if there is consensus on how
> to
> proceed.
>
> Any input is appreciated.
>
> --
> With regards,
> Lasse Karstensen
> Varnish Software AS
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
>

--
As implied by email protocols, the information in this message is not
confidential. Any middle-man or recipient may inspect, modify, copy,
forward, reply to, delete, or filter email for any purpose unless said
parties are otherwise obligated. As the sender, I acknowledge that I have
a lower expectation of the control and privacy of this message than I
would a post-card. Further, nothing in this message is legally binding
without cryptographic evidence of its integrity.
http://bilbo.hobbiton.org/wiki/Eat_My_Sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at varnish-software.com Thu Feb 14 21:45:48 2013 From: martin at varnish-software.com (Martin Blix Grydeland) Date: Thu, 14 Feb 2013 22:45:48 +0100 Subject: [PATCH 1/2] Turn on SO_KEEPALIVE on all TCP connections. Message-ID: <1360878349-1891-1-git-send-email-martin@varnish-software.com> This will help in determining remote hang up of the connection for situations where we still are not able to send any reply, but freeing the session will reduce resource overhead (e.g. when staying on waitinglists for extended periods). On platforms that support it also add runtime parameters to control the keep-alive packet settings through socket options. On platforms that don't support these socket options, the values must be set system wide. The Varnish runtime parameters will only be applied when they are less than the system default. --- bin/varnishd/cache/cache_acceptor.c | 160 +++++++++++++++++++++++++++++ bin/varnishd/common/params.h | 5 + bin/varnishd/mgt/mgt_param_tbl.c | 25 +++++ configure.ac | 32 ++++++ doc/sphinx/installation/platformnotes.rst | 25 +++++ 5 files changed, 247 insertions(+) diff --git a/bin/varnishd/cache/cache_acceptor.c b/bin/varnishd/cache/cache_acceptor.c index 62209a5..ee6b179 100644 --- a/bin/varnishd/cache/cache_acceptor.c +++ b/bin/varnishd/cache/cache_acceptor.c @@ -70,8 +70,26 @@ static const struct linger linger = { .l_onoff = 0, }; +/* + * We turn on keepalives by default to assist in detecting clients that have + * hung up on connections returning from waitinglists + */ +static const int keepalive = 1; + static unsigned char need_sndtimeo, need_rcvtimeo, need_linger, need_test, need_tcpnodelay; +static unsigned char need_keepalive = 0; +#ifdef HAVE_TCP_KEEP +static unsigned char need_ka_time = 0; +static unsigned char need_ka_probes = 0; +static unsigned char need_ka_intvl = 0; +static int ka_time_cur = 0; 
+static int ka_probes_cur = 0; +static int ka_intvl_cur = 0; +static int ka_time, ka_time_sys; +static int ka_probes, ka_probes_sys; +static int ka_intvl, ka_intvl_sys; +#endif /*-------------------------------------------------------------------- * Some kernels have bugs/limitations with respect to which options are @@ -83,6 +101,10 @@ static void sock_test(int fd) { struct linger lin; + int tka; +#ifdef HAVE_TCP_KEEP + int tka_time, tka_probes, tka_intvl; +#endif struct timeval tv; socklen_t l; int i, tcp_nodelay; @@ -97,6 +119,48 @@ sock_test(int fd) if (memcmp(&lin, &linger, l)) need_linger = 1; + l = sizeof tka; + i = getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &tka, &l); + if (i) { + VTCP_Assert(i); + return; + } + assert(l == sizeof tka); + if (tka != keepalive) + need_keepalive = 1; + +#ifdef HAVE_TCP_KEEP + l = sizeof tka_time; + i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &tka_time, &l); + if (i) { + VTCP_Assert(i); + return; + } + assert(l == sizeof tka_time); + if (tka_time != ka_time_cur) + need_ka_time = 1; + + l = sizeof tka_probes; + i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &tka_probes, &l); + if (i) { + VTCP_Assert(i); + return; + } + assert(l == sizeof tka_probes); + if (tka_probes != ka_probes_cur) + need_ka_probes = 1; + + l = sizeof tka_intvl; + i = getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &tka_intvl, &l); + if (i) { + VTCP_Assert(i); + return; + } + assert(l == sizeof tka_intvl); + if (tka_intvl != ka_intvl_cur) + need_ka_intvl = 1; +#endif + #ifdef SO_SNDTIMEO_WORKS l = sizeof tv; i = getsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, &l); @@ -281,6 +345,22 @@ VCA_SetupSess(struct worker *wrk, struct sess *sp) if (need_linger) VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_LINGER, &linger, sizeof linger)); + if (need_keepalive) + VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_KEEPALIVE, + &keepalive, sizeof keepalive)); +#ifdef HAVE_TCP_KEEP + AN(ka_time); + if (need_ka_time) + VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPIDLE, + 
&ka_time_cur, sizeof ka_time_cur)); + if (need_ka_probes) + VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPCNT, + &ka_probes_cur, sizeof ka_probes_cur)); + if (need_ka_intvl) + VTCP_Assert(setsockopt(sp->fd, IPPROTO_TCP, TCP_KEEPINTVL, + &ka_intvl_cur, sizeof ka_intvl_cur)); +#endif + #ifdef SO_SNDTIMEO_WORKS if (need_sndtimeo) VTCP_Assert(setsockopt(sp->fd, SOL_SOCKET, SO_SNDTIMEO, @@ -312,10 +392,17 @@ vca_acct(void *arg) struct listen_sock *ls; double t0, now; int i; + socklen_t len; THR_SetName("cache-acceptor"); (void)arg; +#ifdef HAVE_TCP_KEEP + ka_time = cache_param->tcp_keepalive_time; + ka_probes = cache_param->tcp_keepalive_probes; + ka_intvl = cache_param->tcp_keepalive_intvl; +#endif + VTAILQ_FOREACH(ls, &heritage.socks, list) { if (ls->sock < 0) continue; @@ -324,6 +411,50 @@ vca_acct(void *arg) &linger, sizeof linger)); AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_NODELAY, &tcp_nodelay, sizeof tcp_nodelay)); + AZ(setsockopt(ls->sock, SOL_SOCKET, SO_KEEPALIVE, + &keepalive, sizeof keepalive)); +#ifdef HAVE_TCP_KEEP + if (!ka_time_cur) { + len = sizeof ka_time_sys; + AZ(getsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPIDLE, + &ka_time_sys, &len)); + assert(len == sizeof ka_time_sys); + AN(ka_time_sys); + ka_time_cur = ka_time = + (ka_time_sys < cache_param->tcp_keepalive_time ? + ka_time_sys : cache_param->tcp_keepalive_time); + } + AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPIDLE, + &ka_time_cur, sizeof ka_time_cur)); + + if (!ka_probes_cur) { + len = sizeof ka_probes_sys; + AZ(getsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPCNT, + &ka_probes_sys, &len)); + assert(len == sizeof ka_probes_sys); + AN(ka_probes_sys); + ka_probes_cur = ka_probes = + (ka_probes_sys < cache_param->tcp_keepalive_probes ? 
+ ka_probes_sys : + cache_param->tcp_keepalive_probes); + } + AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPCNT, + &ka_probes_cur, sizeof ka_probes_cur)); + + if (!ka_intvl_cur) { + len = sizeof ka_intvl_sys; + AZ(getsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPINTVL, + &ka_intvl_sys, &len)); + assert(len == sizeof ka_intvl_sys); + AN(ka_intvl_sys); + ka_intvl_cur = ka_intvl = + (ka_intvl_sys < cache_param->tcp_keepalive_intvl ? + ka_intvl_sys : + cache_param->tcp_keepalive_intvl); + } + AZ(setsockopt(ls->sock, IPPROTO_TCP, TCP_KEEPINTVL, + &ka_intvl_cur, sizeof ka_intvl_cur)); +#endif if (cache_param->accept_filter) { i = VTCP_filter_http(ls->sock); if (i) @@ -339,6 +470,35 @@ vca_acct(void *arg) t0 = VTIM_real(); while (1) { (void)sleep(1); +#ifdef HAVE_TCP_KEEP + ka_time = (ka_time_sys < cache_param->tcp_keepalive_time ? + ka_time_sys : cache_param->tcp_keepalive_time); + ka_probes = (ka_probes_sys < cache_param->tcp_keepalive_probes ? + ka_probes_sys : cache_param->tcp_keepalive_probes); + ka_intvl = (ka_intvl_sys < cache_param->tcp_keepalive_intvl ? 
+ ka_intvl_sys : cache_param->tcp_keepalive_intvl); + if (ka_time_cur != ka_time || + ka_probes_cur != ka_probes || + ka_intvl_cur != ka_intvl) { + need_test = 1; + ka_time_cur = ka_time; + ka_probes_cur = ka_probes; + ka_intvl_cur = ka_intvl; + VTAILQ_FOREACH(ls, &heritage.socks, list) { + if (ls->sock < 0) + continue; + AZ(setsockopt(ls->sock, IPPROTO_TCP, + TCP_KEEPIDLE, + &ka_time_cur, sizeof ka_time_cur)); + AZ(setsockopt(ls->sock, IPPROTO_TCP, + TCP_KEEPCNT, + &ka_probes_cur, sizeof ka_probes_cur)); + AZ(setsockopt(ls->sock, IPPROTO_TCP, + TCP_KEEPINTVL, + &ka_intvl_cur, sizeof ka_intvl_cur)); + } + } +#endif #ifdef SO_SNDTIMEO_WORKS if (cache_param->idle_send_timeout != send_timeout) { need_test = 1; diff --git a/bin/varnishd/common/params.h b/bin/varnishd/common/params.h index a6e881b..ebeff0f 100644 --- a/bin/varnishd/common/params.h +++ b/bin/varnishd/common/params.h @@ -110,6 +110,11 @@ struct params { unsigned pipe_timeout; unsigned send_timeout; unsigned idle_send_timeout; +#ifdef HAVE_TCP_KEEP + unsigned tcp_keepalive_time; + unsigned tcp_keepalive_probes; + unsigned tcp_keepalive_intvl; +#endif /* Management hints */ unsigned auto_restart; diff --git a/bin/varnishd/mgt/mgt_param_tbl.c b/bin/varnishd/mgt/mgt_param_tbl.c index 8601bae..b92c71b 100644 --- a/bin/varnishd/mgt/mgt_param_tbl.c +++ b/bin/varnishd/mgt/mgt_param_tbl.c @@ -205,6 +205,31 @@ const struct parspec mgt_parspec[] = { "See setsockopt(2) under SO_SNDTIMEO for more information.", DELAYED_EFFECT, "60", "seconds" }, +#ifdef HAVE_TCP_KEEP + { "tcp_keepalive_time", tweak_timeout, &mgt_param.tcp_keepalive_time, + 1, 7200, + "The number of seconds a connection needs to be idle before " + "TCP begins sending out keep-alive probes. 
Note that this " + "setting will only take effect when it is less than the " + "system default.", + EXPERIMENTAL, + "600", "seconds" }, + { "tcp_keepalive_probes", tweak_uint, &mgt_param.tcp_keepalive_probes, + 1, 100, + "The maximum number of TCP keep-alive probes to send before " + "giving up and killing the connection if no response is " + "obtained from the other end. Note that this setting will " + "only take effect when it is less than the system default.", + EXPERIMENTAL, + "5", "probes" }, + { "tcp_keepalive_intvl", tweak_timeout, &mgt_param.tcp_keepalive_intvl, + 1, 100, + "The number of seconds between TCP keep-alive probes. Note " + "that this setting will only take effect when it is less than" + "the system default.", + EXPERIMENTAL, + "5", "seconds" }, +#endif { "auto_restart", tweak_bool, &mgt_param.auto_restart, 0, 0, "Restart child process automatically if it dies.\n", 0, diff --git a/configure.ac b/configure.ac index a4cd8e8..76406d0 100644 --- a/configure.ac +++ b/configure.ac @@ -423,6 +423,38 @@ if test "$ac_cv_so_rcvtimeo_works" = no || fi LIBS="${save_LIBS}" +# Check if the OS supports TCP_KEEP(CNT|IDLE|INTVL) socket options +save_LIBS="${LIBS}" +LIBS="${LIBS} ${NET_LIBS}" +AC_CACHE_CHECK([for TCP_KEEP(CNT|IDLE|INTVL) socket options], + [ac_cv_have_tcp_keep], + [AC_RUN_IFELSE( + [AC_LANG_PROGRAM([[ +#include +#include +#include +#include +#include + ]],[[ +int s = socket(AF_INET, SOCK_STREAM, 0); +int i; +i = 5; +if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &i, sizeof i)) + return (1); +if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &i, sizeof i)) + return (1); +if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &i, sizeof i)) + return (1); +return (0); + ]])], + [ac_cv_have_tcp_keep=yes], + [ac_cv_have_tcp_keep=no]) + ]) +if test "$ac_cv_have_tcp_keep" = yes; then + AC_DEFINE([HAVE_TCP_KEEP], [1], [Define if OS supports TCP_KEEP* socket options]) +fi +LIBS="${save_LIBS}" + # Run-time directory VARNISH_STATE_DIR='${localstatedir}/varnish' 
AC_SUBST(VARNISH_STATE_DIR) diff --git a/doc/sphinx/installation/platformnotes.rst b/doc/sphinx/installation/platformnotes.rst index 3ad486c..048442c 100644 --- a/doc/sphinx/installation/platformnotes.rst +++ b/doc/sphinx/installation/platformnotes.rst @@ -35,3 +35,28 @@ Reduce the maximum stack size by running:: in the Varnish startup script. +TCP keep-alive configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +On some systems, Varnish is not able to set the TCP keep-alive values +per socket, and therefore the tcp_keepalive_* Varnish runtime +parameters are not available. On these platforms it can be beneficial +to tune the system-wide values for these in order to more reliably +detect remote close for sessions spending a long time on +waitinglists. This will help free up resources faster. + +Systems that do not support TCP keep-alive values per socket include: + +- Solaris releases prior to version 11 +- FreeBSD releases prior to version 9.1 +- OS X releases prior to Mountain Lion + +On platforms with the necessary socket options the defaults are set +to: + +- tcp_keepalive_time = 600 seconds +- tcp_keepalive_probes = 5 +- tcp_keepalive_intvl = 5 seconds + +Note that Varnish will only apply these run-time parameters so long as +they are less than the system default value. -- 1.7.10.4 From martin at varnish-software.com Thu Feb 14 21:45:49 2013 From: martin at varnish-software.com (Martin Blix Grydeland) Date: Thu, 14 Feb 2013 22:45:49 +0100 Subject: [PATCH 2/2] On return from waitinglist, do a poll() on the socket to see if the client has closed the connection and gone away. If so, release the session early. 
In-Reply-To: <1360878349-1891-1-git-send-email-martin@varnish-software.com> References: <1360878349-1891-1-git-send-email-martin@varnish-software.com> Message-ID: <1360878349-1891-2-git-send-email-martin@varnish-software.com> Fixes: #1252 --- bin/varnishd/cache/cache_hash.c | 25 +++++++++++++--- bin/varnishd/cache/cache_http1_fsm.c | 14 +++++++++ bin/varnishd/hash/hash_slinger.h | 1 + bin/varnishtest/tests.disabled/r01252.vtc | 44 +++++++++++++++++++++++++++++ include/vtcp.h | 1 + lib/libvarnish/vtcp.c | 19 +++++++++++++ 6 files changed, 100 insertions(+), 4 deletions(-) create mode 100644 bin/varnishtest/tests.disabled/r01252.vtc diff --git a/bin/varnishd/cache/cache_hash.c b/bin/varnishd/cache/cache_hash.c index d0d987e..db906ec 100644 --- a/bin/varnishd/cache/cache_hash.c +++ b/bin/varnishd/cache/cache_hash.c @@ -399,7 +399,7 @@ HSH_Lookup(struct req *req) assert(oc->objhead == oh); oc->refcnt++; Lck_Unlock(&oh->mtx); - assert(hash->deref(oh)); + assert(HSH_DerefObjHead(&wrk->stats, &oh)); o = oc_getobj(&wrk->stats, oc); CHECK_OBJ_NOTNULL(o, OBJECT_MAGIC); if (!cache_param->obj_readonly && o->hits < INT_MAX) @@ -694,13 +694,30 @@ HSH_Deref(struct dstat *ds, struct objcore *oc, struct object **oo) if (oh != NULL) { /* Drop our ref on the objhead */ assert(oh->refcnt > 0); - if (hash->deref(oh)) - return (0); - HSH_DeleteObjHead(ds, oh); + (void)HSH_DerefObjHead(ds, &oh); } return (0); } +int +HSH_DerefObjHead(struct dstat *ds, struct objhead **poh) +{ + struct objhead *oh; + int r; + + AN(ds); + AN(poh); + oh = *poh; + *poh = NULL; + CHECK_OBJ_NOTNULL(oh, OBJHEAD_MAGIC); + + assert(oh->refcnt > 0); + r = hash->deref(oh); + if (!r) + HSH_DeleteObjHead(ds, oh); + return (r); +} + void HSH_Init(const struct hash_slinger *slinger) { diff --git a/bin/varnishd/cache/cache_http1_fsm.c b/bin/varnishd/cache/cache_http1_fsm.c index 2f99c6e..ecc3583 100644 --- a/bin/varnishd/cache/cache_http1_fsm.c +++ b/bin/varnishd/cache/cache_http1_fsm.c @@ -74,6 +74,7 @@ #include 
#include "cache.h" +#include "hash/hash_slinger.h" #include "vcl.h" #include "vct.h" @@ -329,6 +330,19 @@ HTTP1_Session(struct worker *wrk, struct req *req) return; } + /* + * Return from waitinglist. Check to see if the remote has left. + */ + if (req->req_step == R_STP_LOOKUP && VTCP_check_hup(sp->fd)) { + AN(req->hash_objhead); + (void)HSH_DerefObjHead(&wrk->stats, &req->hash_objhead); + AZ(req->hash_objhead); + SES_Close(sp, SC_REM_CLOSE); + sdr = http1_cleanup(sp, wrk, req); + assert(sdr == SESS_DONE_RET_GONE); + return; + } + if (sp->sess_step == S_STP_NEWREQ) { HTTP1_Init(req->htc, req->ws, sp->fd, req->vsl, cache_param->http_req_size, diff --git a/bin/varnishd/hash/hash_slinger.h b/bin/varnishd/hash/hash_slinger.h index c385ea6..f90f592 100644 --- a/bin/varnishd/hash/hash_slinger.h +++ b/bin/varnishd/hash/hash_slinger.h @@ -98,6 +98,7 @@ struct objhead { void HSH_Unbusy(struct dstat *, struct objcore *); void HSH_Complete(struct objcore *oc); void HSH_DeleteObjHead(struct dstat *, struct objhead *oh); +int HSH_DerefObjHead(struct dstat *, struct objhead **poh); int HSH_Deref(struct dstat *, struct objcore *oc, struct object **o); #endif /* VARNISH_CACHE_CHILD */ diff --git a/bin/varnishtest/tests.disabled/r01252.vtc b/bin/varnishtest/tests.disabled/r01252.vtc new file mode 100644 index 0000000..c988da6 --- /dev/null +++ b/bin/varnishtest/tests.disabled/r01252.vtc @@ -0,0 +1,44 @@ +varnishtest "#1252 - Drop remote closed connections returning from waitinglists" + +# This test case is disabled because it will only pass on platforms +# where the tcp_keepalive_* runtime arguments are available, and also +# because it requires "-t 75" argument to varnishtest (remote closed +# state will only be detected after FIN timeout has passed (60s)) + +server s1 { + rxreq + expect req.http.X-Client == "1" + sema r1 sync 2 + delay 75 + close +} -start + +server s2 { + rxreq + expect req.url == "/should/not/happen" + txresp +} -start + +varnish v1 -arg "-p debug=+waitinglist 
-p tcp_keepalive_time=1s -p tcp_keepalive_probes=1 -p tcp_keepalive_intvl=1s -p first_byte_timeout=70" -vcl+backend { + sub vcl_recv { + if (req.http.x-client == "2") { + set req.backend = s2; + } + } +} -start + +client c1 { + timeout 70 + txreq -hdr "X-Client: 1" + rxresp + expect resp.status == 503 +} -start + +client c2 { + sema r1 sync 2 + txreq -hdr "X-Client: 2" + delay 1 +} -start + +client c1 -wait +client c2 -wait diff --git a/include/vtcp.h b/include/vtcp.h index 77f86ed..1594a4d 100644 --- a/include/vtcp.h +++ b/include/vtcp.h @@ -62,6 +62,7 @@ int VTCP_filter_http(int sock); int VTCP_blocking(int sock); int VTCP_nonblocking(int sock); int VTCP_linger(int sock, int linger); +int VTCP_check_hup(int sock); #ifdef SOL_SOCKET int VTCP_port(const struct sockaddr_storage *addr); diff --git a/lib/libvarnish/vtcp.c b/lib/libvarnish/vtcp.c index 2c6dd3f..7227fb3 100644 --- a/lib/libvarnish/vtcp.c +++ b/lib/libvarnish/vtcp.c @@ -308,3 +308,22 @@ VTCP_linger(int sock, int linger) VTCP_Assert(i); return (i); } + +/*-------------------------------------------------------------------- + * Do a poll to check for remote HUP + */ + +int +VTCP_check_hup(int sock) +{ + struct pollfd pfd; + + assert(sock > 0); + pfd.fd = sock; + pfd.events = POLLOUT; + pfd.revents = 0; + + if (poll(&pfd, 1, 0) == 1 && pfd.revents & POLLHUP) + return (1); + return (0); +} -- 1.7.10.4 From phk at phk.freebsd.dk Fri Feb 15 10:14:26 2013 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri, 15 Feb 2013 10:14:26 +0000 Subject: [PATCH 2/2] On return from waitinglist, do a poll() on the socket to see if the client has closed the connection and gone away. If so, release the session early. In-Reply-To: <1360878349-1891-2-git-send-email-martin@varnish-software.com> References: <1360878349-1891-1-git-send-email-martin@varnish-software.com> <1360878349-1891-2-git-send-email-martin@varnish-software.com> Message-ID: <3599.1360923266@critter.freebsd.dk> OK. 
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From phk at phk.freebsd.dk Fri Feb 15 10:14:38 2013 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri, 15 Feb 2013 10:14:38 +0000 Subject: [PATCH 1/2] Turn on SO_KEEPALIVE on all TCP connections. In-Reply-To: <1360878349-1891-1-git-send-email-martin@varnish-software.com> References: <1360878349-1891-1-git-send-email-martin@varnish-software.com> Message-ID: <3612.1360923278@critter.freebsd.dk> OK -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From martin at varnish-software.com Fri Feb 15 14:37:25 2013 From: martin at varnish-software.com (Martin Blix Grydeland) Date: Fri, 15 Feb 2013 15:37:25 +0100 Subject: [PATCH 1/2] Add a ban_max runtime parameter. Message-ID: <1360939046-7300-1-git-send-email-martin@varnish-software.com> This parameter controls the maximum ban list length we aim for. When the ban list grows beyond this limit, we allow the ban lurker to force evict the objects hanging on the last bans to get below the limit. 
--- bin/varnishd/cache/cache_ban.c | 31 +++++++++++++++++++++---------- bin/varnishd/common/params.h | 3 +++ bin/varnishd/mgt/mgt_param_tbl.c | 6 ++++++ bin/varnishtest/tests/b00036.vtc | 34 ++++++++++++++++++++++++++++++++++ include/tbl/vsc_f_main.h | 5 +++++ 5 files changed, 69 insertions(+), 10 deletions(-) create mode 100644 bin/varnishtest/tests/b00036.vtc diff --git a/bin/varnishd/cache/cache_ban.c b/bin/varnishd/cache/cache_ban.c index 2e1488f..79eb9cf 100644 --- a/bin/varnishd/cache/cache_ban.c +++ b/bin/varnishd/cache/cache_ban.c @@ -833,7 +833,7 @@ ban_evaluate(const uint8_t *bs, const struct http *objhttp, static int ban_check_object(struct object *o, struct vsl_log *vsl, - const struct http *req_http) + const struct http *req_http, int force) { struct ban *b; struct objcore *oc; @@ -861,6 +861,8 @@ ban_check_object(struct object *o, struct vsl_log *vsl, skipped = 0; for (b = b0; b != oc->ban; b = VTAILQ_NEXT(b, list)) { CHECK_OBJ_NOTNULL(b, BAN_MAGIC); + if (force) + break; if (b->flags & BAN_F_GONE) continue; if ((b->flags & BAN_F_LURK) && @@ -880,7 +882,10 @@ ban_check_object(struct object *o, struct vsl_log *vsl, } Lck_Lock(&ban_mtx); - VSC_C_main->bans_tested++; + if (force) + VSC_C_main->bans_forced++; + else + VSC_C_main->bans_tested++; VSC_C_main->bans_tests_tested += tests; if (b == oc->ban && skipped > 0) { @@ -922,7 +927,7 @@ int BAN_CheckObject(struct object *o, struct req *req) { - return (ban_check_object(o, req->vsl, req->http) > 0); + return (ban_check_object(o, req->vsl, req->http, 0) > 0); } static void @@ -958,7 +963,7 @@ ban_cleantail(void) */ static int -ban_lurker_work(struct worker *wrk, struct vsl_log *vsl) +ban_lurker_work(struct worker *wrk, struct vsl_log *vsl, int force) { struct ban *b, *b0; struct objhead *oh; @@ -1057,10 +1062,10 @@ ban_lurker_work(struct worker *wrk, struct vsl_log *vsl) * Get the object and check it against all relevant bans */ o = oc_getobj(&wrk->stats, oc); - i = ban_check_object(o, vsl, NULL); + i = 
ban_check_object(o, vsl, NULL, force); if (DO_DEBUG(DBG_LURKER)) - VSLb(vsl, SLT_Debug, "lurker got: %p %d", - oc, i); + VSLb(vsl, SLT_Debug, "lurker got: %p %d %d", + oc, force, i); if (i == -1) { /* Not banned, not moved */ oc->flags |= pass; @@ -1076,6 +1081,8 @@ ban_lurker_work(struct worker *wrk, struct vsl_log *vsl) (void)HSH_Deref(&wrk->stats, NULL, &o); VTIM_sleep(cache_param->ban_lurker_sleep); } + if (force > 0) + force--; Lck_AssertHeld(&ban_mtx); if (!(b->flags & BAN_F_REQ)) { if (!(b->flags & BAN_F_GONE)) @@ -1104,19 +1111,23 @@ ban_lurker(struct worker *wrk, void *priv) struct vsl_log vsl; volatile double d; int i = 0, n = 0; + int force; VSL_Setup(&vsl, NULL, 0); (void)priv; while (!ban_shutdown) { d = cache_param->ban_lurker_sleep; - if (d > 0.0) { - i = ban_lurker_work(wrk, &vsl); + force = VSC_C_main->bans - cache_param->ban_max; + if (force < 0) + force = 0; + if (d > 0.0 || force) { + i = ban_lurker_work(wrk, &vsl, force); VSL_Flush(&vsl, 0); WRK_SumStat(wrk); if (i && !ban_shutdown) { VTIM_sleep(d); - if (++n > 10) { + if (++n > 10 || force) { ban_cleantail(); n = 0; } diff --git a/bin/varnishd/common/params.h b/bin/varnishd/common/params.h index ebeff0f..714a44e 100644 --- a/bin/varnishd/common/params.h +++ b/bin/varnishd/common/params.h @@ -185,6 +185,9 @@ struct params { /* How long time does the ban lurker sleep */ double ban_lurker_sleep; + /* Max ban list length */ + unsigned ban_max; + /* Max size of the saintmode list. 0 == no saint mode. 
*/ unsigned saintmode_threshold; diff --git a/bin/varnishd/mgt/mgt_param_tbl.c b/bin/varnishd/mgt/mgt_param_tbl.c index b92c71b..e8e491d 100644 --- a/bin/varnishd/mgt/mgt_param_tbl.c +++ b/bin/varnishd/mgt/mgt_param_tbl.c @@ -446,6 +446,12 @@ const struct parspec mgt_parspec[] = { "A value of zero disables the ban lurker.", 0, "0.01", "s" }, + { "ban_max", tweak_uint, &mgt_param.ban_max, 1, UINT_MAX, + "The maximum number of bans allowed on the ban list before " + "the ban lurker starts to force evict the objects on the last " + "ban.", + EXPERIMENTAL, + "5000", "bans" }, { "saintmode_threshold", tweak_uint, &mgt_param.saintmode_threshold, 0, UINT_MAX, "The maximum number of objects held off by saint mode before " diff --git a/bin/varnishtest/tests/b00036.vtc b/bin/varnishtest/tests/b00036.vtc new file mode 100644 index 0000000..8cb3157 --- /dev/null +++ b/bin/varnishtest/tests/b00036.vtc @@ -0,0 +1,34 @@ +varnishtest "Test force evicting of bans" + +server s1 { + rxreq + txresp -hdr "x-foo: first" + rxreq + txresp -hdr "x-foo: second" +} -start + +varnish v1 -arg "-p ban_max=1 -p ban_lurker_sleep=0" -vcl+backend { +} -start + +client c1 { + txreq + rxresp + expect resp.status == 200 + expect resp.http.x-foo == "first" +} -run + +varnish v1 -expect bans == 1 + +varnish v1 -cliok "ban obj.http.x-bar == non-existant" + +delay 2 + +client c1 { + txreq + rxresp + expect resp.status == 200 + expect resp.http.x-foo == "second" +} -run + +varnish v1 -expect bans == 1 +varnish v1 -expect bans_forced == 1 diff --git a/include/tbl/vsc_f_main.h b/include/tbl/vsc_f_main.h index f487471..4d246d4 100644 --- a/include/tbl/vsc_f_main.h +++ b/include/tbl/vsc_f_main.h @@ -501,6 +501,11 @@ VSC_F(bans_tests_tested, uint64_t, 0, 'c', " each other. 
'ban req.url == foo && req.http.host == bar'" " counts as one in 'bans_tested' and as two in 'bans_tests_tested'" ) +VSC_F(bans_forced, uint64_t, 0, 'c', + "Bans forced upon objects", + "Count of how many bans were forced upon objects to meet ban list" + " limits." +) VSC_F(bans_dups, uint64_t, 0, 'c', "Bans superseded by other bans", "Count of bans replaced by later identical bans." -- 1.7.10.4 From martin at varnish-software.com Fri Feb 15 14:37:26 2013 From: martin at varnish-software.com (Martin Blix Grydeland) Date: Fri, 15 Feb 2013 15:37:26 +0100 Subject: [PATCH 2/2] Start forcing out bans when less than 10% of the maximum available ban space is left. In-Reply-To: <1360939046-7300-1-git-send-email-martin@varnish-software.com> References: <1360939046-7300-1-git-send-email-martin@varnish-software.com> Message-ID: <1360939046-7300-2-git-send-email-martin@varnish-software.com> --- bin/varnishd/cache/cache.h | 1 + bin/varnishd/cache/cache_ban.c | 25 +++++++++++++++++++++++++ bin/varnishd/storage/storage_persistent.c | 1 + 3 files changed, 27 insertions(+) diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h index 6be2c73..7632bd9 100644 --- a/bin/varnishd/cache/cache.h +++ b/bin/varnishd/cache/cache.h @@ -763,6 +763,7 @@ void BAN_Compile(void); struct ban *BAN_RefBan(struct objcore *oc, double t0, const struct ban *tail); void BAN_TailDeref(struct ban **ban); double BAN_Time(const struct ban *ban); +void BAN_SetMaxBytes(unsigned bytes); /* cache_busyobj.c */ void VBO_Init(void); diff --git a/bin/varnishd/cache/cache_ban.c b/bin/varnishd/cache/cache_ban.c index 79eb9cf..9f918c2 100644 --- a/bin/varnishd/cache/cache_ban.c +++ b/bin/varnishd/cache/cache_ban.c @@ -108,6 +108,8 @@ static struct ban * volatile ban_start; static bgthread_t ban_lurker; static int ban_shutdown = 0; +static unsigned ban_maxbytes = UINT_MAX; + /*-------------------------------------------------------------------- * BAN string defines & magic markers */ @@ -145,6 +147,18 @@ 
static const struct pvar { }; /*-------------------------------------------------------------------- + * Allow persisted stevedores to report the maximum size of bans they + * support + */ + +void +BAN_SetMaxBytes(unsigned bytes) +{ + if (bytes < ban_maxbytes) + ban_maxbytes = bytes; +} + +/*-------------------------------------------------------------------- * Storage handling of bans */ @@ -1121,6 +1135,17 @@ ban_lurker(struct worker *wrk, void *priv) force = VSC_C_main->bans - cache_param->ban_max; if (force < 0) force = 0; + if (!force && + (VSC_C_main->bans_persisted_bytes - + VSC_C_main->bans_persisted_fragmentation) > + ban_maxbytes * 0.9) { + /* + * We are getting close to (less than 10% left) + * the maximum persisted ban space. Start forcing + * out bans. + */ + force = 10; + } if (d > 0.0 || force) { i = ban_lurker_work(wrk, &vsl, force); VSL_Flush(&vsl, 0); diff --git a/bin/varnishd/storage/storage_persistent.c b/bin/varnishd/storage/storage_persistent.c index 55279c8..46ab683 100644 --- a/bin/varnishd/storage/storage_persistent.c +++ b/bin/varnishd/storage/storage_persistent.c @@ -144,6 +144,7 @@ smp_open_bans(struct smp_sc *sc, struct smp_signspace *spc) ptr = SIGNSPACE_DATA(spc); pe = SIGNSPACE_FRONT(spc); + BAN_SetMaxBytes(spc->size); BAN_Reload(ptr, pe - ptr); return (0); -- 1.7.10.4 From varnish at bsdchicks.com Mon Feb 18 11:21:36 2013 From: varnish at bsdchicks.com (Rogier 'DocWilco' Mulhuijzen) Date: Mon, 18 Feb 2013 12:21:36 +0100 Subject: Extending vcl_trace to log VCL value assignments In-Reply-To: References: <20130214155139.GD14555@immer.varnish-software.com> Message-ID: Sounds useful. I have a few things to add. 1) We probably shouldn't exhaust VSL tags too fast, so I'd suggest using either VCL_trace, or VCL_trace_ext(ended), and then putting a keyword at the front, like "value" or "assign". Then we can have all sorts of other fun tracing info in the same tag. 2) If there's no value, don't put between quotes, but put on its own. 
That way it's very clear that it's undefined, instead of a string that happens to be the characters . For those wondering wtf I mean by that: char *mystring = NULL; vs char *mystring = "NULL"; 3) While you're at it, results of if-statements would be nice, possibly with the values of all variable involved. 4) In our private fork, we've added a parameter to influence the compiling of VCL (well, a few, but I digress) so that based on the parameter, calls to VRT_trace are added or not. It's called vcl_trace_support. The reason we put it in is that calls to VCL_trace, as short as they may be with tracing disabled, were so numerous that they were eating up a lot of CPU time. If you're adding more VCL tracing functions to VRT, this wastage only increases. I would suggest adding it, and making it default to "on", with the caveats and benefits of turning it off clearly described. To turn on tracing, one would simply have to reload the VCL after switching the support back on. But people new to Varnish might have trouble with that, hence defaulting to on. Cheers, Doc On Thu, Feb 14, 2013 at 6:29 PM, Leif Pedersen wrote: > Lasse, > > This sounds pretty useful. I do a lot of work with VCL and I can usually > track what's happened pretty easily, but when my coworkers have to look > through it, they tend to get lost since their focus is on other parts of > our servers. This could really help them trace what's going on in the VCL > code that I write. > > I have a suggestion. How about emitting an explicit log message when > tracing is enabled or disabled? This way a log processor can quickly > identify a request when it turns on tracing and automatically filter only > those from the log to display. > > - Leif > > (PS. I'm a long-time user, but until now I've only been lurking on the > list. Hi, all.) > > > On Thu, Feb 14, 2013 at 9:51 AM, Lasse Karstensen < > lkarsten at varnish-software.com> wrote: > >> Hello. 
>> >> A long standing todo item of ours (VS) is that we should make some sort >> of VCL debugger or execution tracer. >> >> Many new users find it difficult to read the verbose output of >> varnishlog, and >> some hand-holding might be in place. >> >> I've been looking at the VCL trace function, which is a varnish parameter >> to enable >> log lines in varnishlog. The output says that the execution is now on >> line x >> char y of the VCL line, logged whenever there is a function call. >> In our experience, this feature is rarely or never used. >> >> My thinking is that this VCL_trace can be extended to log the previous >> and new >> value whenever an assignment is done in VCL. This would enable us to write >> software that takes the VCL and varnishlog as inputs, and let the user >> step >> through the VCL execution and see what the different headers/values are >> at each >> step. >> >> I think this should be possible to turn on per request in addition to the >> Varnish wide parameter it is now. This enables us to do "online >> debugging"; a >> Varnish user is having trouble with a distinct URL and can input that >> into the >> tool and have it run just once without affecting other requests or site >> performance. >> >> From a VCL perspective it can look like this: >> >> 5 sub vcl_recv { >> 6 if (req.http.x-trace-request == "yes") >> 7 { >> 8 set req.trace = true; >> 9 } >> 10 set req.http.x-tmp = "foo"; >> 11 set req.http.x-tmp = "bar"; >> 12 } >> >> The obvious drawback is of course that any VCL executed before this won't >> get >> trace information. I think this can be accepted in this case. 
>> >> In the shmlog the output can be: >> >> 11 RxHeader c Host: localhost >> 11 RxHeader c User-Agent: lwp-request/6.03 libwww-perl/6.04 >> 11 VCL_ValueTrace c 13 10.13 "" "foo" >> 11 VCL_ValueTrace c 14 11.13 "foo" "bar" >> 11 VCL_call c recv 1 42.5 2 41.5 3 42.9 5 46.13 6 49.5 8 59.5 10 >> 63.5 12 67.5 lookup >> 11 VCL_call c hash 15 85.5 >> 11 Hash c / >> 11 VCL_trace c 16 87.9 >> 11 Hash c localhost >> >> >> Implementation wise this can be done via a small patch to VRT_SetHdr(), >> which >> check if tracing is on and writes VCL_ValueTrace to VSL. There must be a >> flag somewhere (struct req?) that indicates if this connection/session has >> tracing set to true. >> >> >> I think this should be a fairly quick win, little effort needed to get >> great >> debugging value out of. I can prepare a patch if there is consensus on >> how to >> proceed. >> >> Any input is appreciated. >> >> -- >> With regards, >> Lasse Karstensen >> Varnish Software AS >> >> _______________________________________________ >> varnish-dev mailing list >> varnish-dev at varnish-cache.org >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev >> > > > > -- > > As implied by email protocols, the information in this message is > not confidential. Any middle-man or recipient may inspect, modify, > copy, forward, reply to, delete, or filter email for any purpose unless > said parties are otherwise obligated. As the sender, I acknowledge that > I have a lower expectation of the control and privacy of this message > than I would a post-card. Further, nothing in this message is > legally binding without cryptographic evidence of its integrity. > > http://bilbo.hobbiton.org/wiki/Eat_My_Sig > > _______________________________________________ > varnish-dev mailing list > varnish-dev at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tfheen at varnish-software.com Wed Feb 20 09:51:01 2013 From: tfheen at varnish-software.com (Tollef Fog Heen) Date: Wed, 20 Feb 2013 10:51:01 +0100 Subject: [PATCH] Tickle old, idle workers to give up their VCLs Message-ID: <1361353861-1563-1-git-send-email-tfheen@varnish-software.com> --- bin/varnishd/cache/cache_pool.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/bin/varnishd/cache/cache_pool.c b/bin/varnishd/cache/cache_pool.c index 4e7bd7c..c19c1fe 100644 --- a/bin/varnishd/cache/cache_pool.c +++ b/bin/varnishd/cache/cache_pool.c @@ -364,6 +364,16 @@ pool_herder(void *priv) continue; } + Lck_Lock(&pp->mtx); + VTAILQ_FOREACH_REVERSE(pt, &pp->idle_queue, taskhead, list) { + CAST_OBJ_NOTNULL(wrk, pt->priv, WORKER_MAGIC); + if (wrk->lastused > t_idle) + break; + if (wrk->vcl != NULL) + VCL_Rel(&wrk->vcl); + } + Lck_Unlock(&pp->mtx); + if (pp->nthr > cache_param->wthread_min) { t_idle = VTIM_real() - cache_param->wthread_timeout; -- 1.7.10.4 From daghf at varnish-software.com Thu Feb 21 09:50:51 2013 From: daghf at varnish-software.com (Dag Haavi Finstad) Date: Thu, 21 Feb 2013 10:50:51 +0100 Subject: [PATCH] Fixes an issue in backend probe initialization that causes the default value of .initial to equal .threshold. Message-ID: See attachment. -- *Dag Haavi Finstad* Developer | Varnish Software AS Phone: +47 21 98 92 60 We Make Websites Fly! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: probe-initial-default.patch Type: application/octet-stream Size: 2041 bytes Desc: not available URL: From daghf at varnish-software.com Thu Feb 21 09:53:26 2013 From: daghf at varnish-software.com (Dag Haavi Finstad) Date: Thu, 21 Feb 2013 10:53:26 +0100 Subject: [PATCH] Make std.collect() also work for resp.http and bereq.http. Message-ID: See attached patch. 
-- 
*Dag Haavi Finstad*
Developer | Varnish Software AS
Phone: +47 21 98 92 60
We Make Websites Fly!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: std-collect.patch
Type: application/octet-stream
Size: 2252 bytes
Desc: not available
URL:

From yaoweibin at gmail.com Fri Feb 22 07:17:10 2013
From: yaoweibin at gmail.com (Weibin Yao)
Date: Fri, 22 Feb 2013 15:17:10 +0800
Subject: [Patch] Avoid varnish assert failure when inflating corrupt compressed ESI response
Message-ID:

Hi folks,

This is a small patch. It fixes the segmentation fault caused by the
assertion failing when gunzip returns an error value.

I don't know if this patch is sufficient, but it seems to work for me.

Thanks.

diff --git a/bin/varnishd/cache_esi_fetch.c b/bin/varnishd/cache_esi_fetch.c
index ab86ac8..e3f80f3 100644
--- a/bin/varnishd/cache_esi_fetch.c
+++ b/bin/varnishd/cache_esi_fetch.c
@@ -270,8 +270,9 @@ vfp_esi_bytes_gg(struct sess *sp, struct http_conn *htc, size_t bytes)
 	do {
 		VGZ_Obuf(sp->wrk->vgz_rx, ibuf2, sizeof ibuf2);
 		i = VGZ_Gunzip(sp->wrk->vgz_rx, &dp, &dl);
-		/* XXX: check i */
-		assert(i >= VGZ_OK);
+		if (i < VGZ_OK) {
+			return (-1);
+		}
 		vef->bufp = ibuf2;
 		if (dl > 0)
 			VEP_parse(sp, ibuf2, dl);

-- 
Weibin Yao
Developer @ Server Platform Team of Taobao

From daghf at varnish-software.com Fri Feb 22 10:27:25 2013
From: daghf at varnish-software.com (Dag Haavi Finstad)
Date: Fri, 22 Feb 2013 11:27:25 +0100
Subject: [Patch] Avoid varnish assert failure when inflating corrupt compressed ESI response
In-Reply-To:
References:
Message-ID:

Hi Weibin

I'm pretty sure this is #1184, which has since been fixed in both
master[1] and 3.0[2].

Thanks though!

[1]: https://www.varnish-cache.org/trac/changeset/7c784d5c9d2dd959a5ea1a1bea5f7bbb4173437d
[2]: https://www.varnish-cache.org/trac/changeset/e837c4bcd26218489576af899f563c66e6b0511f

On Fri, Feb 22, 2013 at 8:17 AM, Weibin Yao wrote:

> Hi folks,
>
> This is a small patch. It fixes the segmentation fault caused by the
> assertion failing when gunzip returns an error value.
>
> I don't know if this patch is sufficient, but it seems to work for me.
>
> Thanks.
>
> diff --git a/bin/varnishd/cache_esi_fetch.c
> b/bin/varnishd/cache_esi_fetch.c
> index ab86ac8..e3f80f3 100644
> --- a/bin/varnishd/cache_esi_fetch.c
> +++ b/bin/varnishd/cache_esi_fetch.c
> @@ -270,8 +270,9 @@ vfp_esi_bytes_gg(struct sess *sp, struct http_conn
> *htc, size_t bytes)
>  	do {
>  		VGZ_Obuf(sp->wrk->vgz_rx, ibuf2, sizeof ibuf2);
>  		i = VGZ_Gunzip(sp->wrk->vgz_rx, &dp, &dl);
> -		/* XXX: check i */
> -		assert(i >= VGZ_OK);
> +		if (i < VGZ_OK) {
> +			return (-1);
> +		}
>  		vef->bufp = ibuf2;
>  		if (dl > 0)
>  			VEP_parse(sp, ibuf2, dl);
>
> --
> Weibin Yao
> Developer @ Server Platform Team of Taobao

-- 
*Dag Haavi Finstad*
Developer | Varnish Software AS
Phone: +47 21 98 92 60
We Make Websites Fly!

From tfheen at varnish-software.com Fri Feb 22 11:49:06 2013
From: tfheen at varnish-software.com (Tollef Fog Heen)
Date: Fri, 22 Feb 2013 12:49:06 +0100
Subject: [PATCH] Make std.collect() also work for resp.http and bereq.http.
In-Reply-To:
References:
Message-ID: <20130222114906.GA23844@err.no>

]] Dag Haavi Finstad

> See attached patch.

Thx, applied.

-- 
Tollef Fog Heen
Technical lead | Varnish Software AS
t: +47 21 98 92 64
We Make Websites Fly!