From ahooper at bmjgroup.com Mon Jul 6 12:26:59 2009 From: ahooper at bmjgroup.com (Alex Hooper) Date: Mon, 6 Jul 2009 13:26:59 +0100 Subject: tcp reset problem with varnish 2.0.4 o n Solaris 10 (SPARC) Message-ID: <4A51ED93.3020805@bmjgroup.com> Hi, Having a requirement to ease the load on a high-traffic site of ours, I was recently diverted from reaching for Squid by a colleague who recommended Varnish. I grabbed the source for 2.0.4 (having looked for a package but not found one, despite a suggestion that it was available as part of Sun's Webstack) and compiled it using the compiler bundled with Sun Studio 12 update 1 (having initially tried using gcc and hitting a problem with isfinite()). All compiled fine so I created a simple test config: backend test01 { .host = "83.231.175.37"; .port = "80"; .connect_timeout = 10s; .first_byte_timeout = 15s; .between_bytes_timeout = 2s; } sub vcl_recv { set req.backend = test01; } and started up with varnishd -a :80 -T localhost:6082 -p sess_timeout=60 -f /local/etc/varnish/bmjgroup.vcl That looked OK, but requests to varnish all gave 503s. Watching the logs, I could see that flow was passing from 'pass' to 'error', but was given no more clue. It wasn't until I watched the packets on the wire that I saw that Varnish was sending resets before even completing a handshake: varnish -> web: SYN varnish -> web: RST web -> varnish: SYN ACK varnish -> web: RST Having searched the wiki, mailing lists and bug database for a resolution to this, I found nothing except for issue #522 (Odd TCP reset problems with trunk 4080) which seems not to be related. I compiled, and am running varnish on, Solaris 10 5/08 (SPARC). I wonder does anyone have an idea of what might be happening? Cheers, Alex. 
-- Alex Hooper | www.bmjpg.com Systems and Database Administration | ahooper at bmjgroup.com BMJ Technology, BMJ Publishing Group | +44 20 7383 6049 BMA House, LONDON, WC1H 9JR | _______________________________________________________________________ The BMJ Group is one of the world's most trusted providers of medical information for doctors, researchers, health care workers and patients www.bmjgroup.bmj.com. This email and any attachments are confidential. If you have received this email in error, please delete it and kindly notify us. If the email contains personal views then the BMJ Group accepts no responsibility for these statements. The recipient should check this email and attachments for viruses because the BMJ Group accepts no liability for any damage caused by viruses. Emails sent or received by the BMJ Group may be monitored for size, traffic, distribution and content. BMJ Publishing Group Limited trading as BMJ Group. A private limited company, registered in England and Wales under registration number 03102371. Registered office: BMA House, Tavistock Square, London WC1H 9JR, UK. _______________________________________________________________________ From l at lrowe.co.uk Mon Jul 6 14:26:26 2009 From: l at lrowe.co.uk (Laurence Rowe) Date: Mon, 6 Jul 2009 16:26:26 +0200 Subject: Inline C and memory allocation In-Reply-To: References: Message-ID: Hi, Thought my C is rather rusty by now, I'd like to make the mod_auth_tkt [1] signed cookie authentication / authorisation system work with Varnish. The idea would be to encode the acceptable authorisation tokens for a page into it's response header then check the tokens in the user's auth_tkt cookie against the tokens in the cached header during vcl_deliver. I can find examples online that read data from headers using VRT_GetHdr, but in order to implement/port mod_auth_tkt I will need to decode the data in the cookie and write the decoded contents to new headers. 
With Apache, I would use apr_psprintf or similar to allocate memory from the pool. What would be the equivalent in Varnish? Laurence [1] http://www.openfusion.com.au/labs/mod_auth_tkt/ From v.bilek at 1art.cz Fri Jul 3 10:26:42 2009 From: v.bilek at 1art.cz (=?UTF-8?B?VsOhY2xhdiBCw61sZWs=?=) Date: Fri, 03 Jul 2009 12:26:42 +0200 Subject: bereq.connect_timeout sub second value Message-ID: <4A4DDCE2.4050604@1art.cz> Hello, is it possible to set a sub-second value in bereq.connect_timeout? Example: set bereq.connect_timeout = 0.3 And how can I find out that the timeout was exceeded? The idea is that Varnish sits in front of an LVS cluster, and when one LVS backend is too slow I want to restart the request on another LVS backend (for Varnish this means only a restart of the request). Vaclav Bilek From rtshilston at gmail.com Mon Jul 6 15:31:45 2009 From: rtshilston at gmail.com (Rob S) Date: Mon, 06 Jul 2009 16:31:45 +0100 Subject: tcp reset problem with varnish 2.0.4 o n Solaris 10 (SPARC) In-Reply-To: <4A521772.9080201@bmjgroup.com> References: <4A51ED93.3020805@bmjgroup.com> <4A5215A0.9040206@gmail.com> <4A521772.9080201@bmjgroup.com> Message-ID: <4A5218E1.40100@gmail.com> Alex Hooper wrote: > 5 VCL_call c recv > 5 VCL_return c pass > 5 VCL_call c pass > 5 VCL_return c pass > 5 VCL_call c error > 5 VCL_return c deliver It looks like you're using "pass", rather than "fetch", which probably isn't desirable when you're just doing a simple GET request. I'd expect to see something like: 7 VCL_call c recv 7 VCL_return c lookup 7 VCL_call c hash 7 VCL_return c hash 7 VCL_call c miss 7 VCL_return c fetch Can you send your VCL file, so that I can take a look at the logic?
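[Editorial sketch, not part of the original message.] One possible reading of the "pass" in the quoted log: the request logged elsewhere in this thread carries a Cookie header, and the built-in vcl_recv in Varnish 2.0.x passes any request with Cookie or Authorization set, so a pass can occur with no custom logic at all. The fragment below is abridged from memory of the stock default.vcl and is not the poster's file; the return syntax in particular differs between releases, so treat it as an assumption and check the default.vcl shipped with your version.

```vcl
# Abridged sketch of the built-in vcl_recv (Varnish 2.0.x); an assumption,
# not copied from the poster's installation.
sub vcl_recv {
    if (req.request != "GET" && req.request != "HEAD") {
        # Only GET and HEAD are looked up in the cache by default.
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        # Requests carrying credentials or cookies are not cached by default.
        return (pass);
    }
    return (lookup);
}
```

On that reading the pass itself would be expected behaviour, and the step worth investigating is the transition from pass to error.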
Rob From rtshilston at gmail.com Mon Jul 6 15:17:52 2009 From: rtshilston at gmail.com (Rob S) Date: Mon, 06 Jul 2009 16:17:52 +0100 Subject: tcp reset problem with varnish 2.0.4 o n Solaris 10 (SPARC) In-Reply-To: <4A51ED93.3020805@bmjgroup.com> References: <4A51ED93.3020805@bmjgroup.com> Message-ID: <4A5215A0.9040206@gmail.com> Alex Hooper wrote: > I wonder does anyone have an idea of what might be happening? > Alex, I've not seen this before, but I've found that 'varnishlog' typically provides very helpful information. Can you post a log of the request? Rob From tfheen at redpill-linpro.com Wed Jul 1 00:11:56 2009 From: tfheen at redpill-linpro.com (Tollef Fog Heen) Date: Wed, 01 Jul 2009 02:11:56 +0200 Subject: Thread memory allocation question In-Reply-To: <5039.1245446512@critter.freebsd.dk> (Poul-Henning Kamp's message of "Fri, 19 Jun 2009 21:21:52 +0000") References: <5039.1245446512@critter.freebsd.dk> Message-ID: <874otxntwz.fsf@qurzaw.linpro.no> ]] "Poul-Henning Kamp" | In message <5C056AE2-7207-42F8-9E4B-0F541DC4B1B2 at slide.com>, Ken Brownfield wri | tes: | | >Would a stack overflow take out the whole child, or just that thread? | | The kernel would try to extend the stack and provided you are not on | a 32 bit system, it shouldn't ever have a problem with that. On the other hand, the gain from decreasing the stack size would just be a bit less book-keeping for the kernel, unless you have overcommit turned off (which I don't think anybody actually uses), right? -- Tollef Fog Heen Redpill Linpro -- Changing the game! 
t: +47 21 54 41 73 From tfheen at redpill-linpro.com Wed Jul 1 00:14:23 2009 From: tfheen at redpill-linpro.com (Tollef Fog Heen) Date: Wed, 01 Jul 2009 02:14:23 +0200 Subject: "ExpBan: nnn was banned" In-Reply-To: <24a219a50906191906v59c6588bkeac4747d7547db85@mail.gmail.com> (Martin Goldman's message of "Fri, 19 Jun 2009 22:06:52 -0400") References: <24a219a50906191906v59c6588bkeac4747d7547db85@mail.gmail.com> Message-ID: <87zlbpmf8g.fsf@qurzaw.linpro.no> ]] Martin Goldman | I got a complaint from a user that some percentage of his page views seem | too slow to be coming from the cache. I brought up our home page and started | refreshing it over and over. Sure enough, while most of the page views are | in fact getting cached, once every 5 or 6 times, I get a cache miss. When | this happens, varnishlog shows something like this: | | 32 VCL_call c recv lookup | 32 VCL_call c hash hash | * 32 ExpBan c 1607291667 was banned | * 32 VCL_call c miss fetch Do you add purges every now and then? This looks like the result of a purge forcing an object to be removed, which then gives you a cache miss. -- Tollef Fog Heen Redpill Linpro -- Changing the game! t: +47 21 54 41 73
From ahooper at bmjgroup.com Mon Jul 6 15:25:38 2009 From: ahooper at bmjgroup.com (Alex Hooper) Date: Mon, 6 Jul 2009 16:25:38 +0100 Subject: tcp reset problem with varnish 2.0.4 o n Solaris 10 (SPARC) In-Reply-To: <4A5215A0.9040206@gmail.com> References: <4A51ED93.3020805@bmjgroup.com> <4A5215A0.9040206@gmail.com> Message-ID: <4A521772.9080201@bmjgroup.com> Rob S uttered: > Alex Hooper wrote: >> I wonder does anyone have an idea of what might be happening? >> > Alex, > > I've not seen this before, but I've found that 'varnishlog' typically > provides very helpful information. Can you post a log of the request? Hi Rob, Log follows.
# /local/bin/varnishlog
    0 CLI - Rd ping
    0 CLI - Wr 0 200 PONG 1246893786 1.0
    0 CLI - Rd ping
    0 CLI - Wr 0 200 PONG 1246893789 1.0
    0 CLI - Rd ping
    0 CLI - Wr 0 200 PONG 1246893792 1.0
    5 SessionOpen c 193.22.89.2 32966 :80
    5 ReqStart c 193.22.89.2 32966 1075615183
    5 RxRequest c GET
    5 RxURL c /
    5 RxProtocol c HTTP/1.1
    5 RxHeader c Host: group.bmj.com
    5 RxHeader c User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)
    5 RxHeader c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    5 RxHeader c Accept-Language: en,en-us;q=0.7,fr;q=0.3
    5 RxHeader c Accept-Encoding: gzip,deflate
    5 RxHeader c Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    5 RxHeader c Keep-Alive: 300
    5 RxHeader c Connection: keep-alive
    5 RxHeader c Cookie: OAX=wRZZAknt5E8AB5OI; RMAM=01cen16_1230.4aM4O9VW|cen7_1230.4a4Oc3E0|; __utma=1.790239517130949200.1240951802.1241617626.1241622593.3; __utmz=1.1240951802.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); anonId=0bbac82d-19f2-4efe-904b-6d2ca5cbbb4
    5 RxHeader c Cache-Control: max-age=0
    5 VCL_call c recv
    5 VCL_return c pass
    5 VCL_call c pass
    5 VCL_return c pass
    5 VCL_call c error
    5 VCL_return c deliver
    5 Length c 466
    5 VCL_call c deliver
    5 VCL_return c deliver
    5 TxProtocol c HTTP/1.1
    5 TxStatus c 503
    5 TxResponse c Service Unavailable
    5 TxHeader c Server: Varnish
    5 TxHeader c Retry-After: 0
    5 TxHeader c Content-Type: text/html; charset=utf-8
    5 TxHeader c Content-Length: 466
    5 TxHeader c Date: Mon, 06 Jul 2009 15:23:14 GMT
    5 TxHeader c X-Varnish: 1075615183
    5 TxHeader c Age: 0
    5 TxHeader c Via: 1.1 varnish
    5 TxHeader c Connection: close
    5 ReqEnd c 1075615183 1246893794.202419758 1246893794.217293501 0.001565456 0.014699459 0.000174284
    5 SessionClose c error
    5 StatSess c 193.22.89.2 32966 0 1 1 0 1 0 235 466
    0 StatAddr - 193.22.89.2 0 10951 27 27 0 27 0 6345 12582
    0 CLI - Rd ping
    0 CLI - Wr 0 200 PONG 1246893795 1.0

Cheers, Alex.
From ahooper at bmjgroup.com Mon Jul 6 15:43:52 2009 From: ahooper at bmjgroup.com (Alex Hooper) Date: Mon, 6 Jul 2009 16:43:52 +0100 Subject: tcp reset problem with varnish 2.0.4 o n Solaris 10 (SPARC) In-Reply-To: <4A5218E1.40100@gmail.com> References: <4A51ED93.3020805@bmjgroup.com> <4A5215A0.9040206@gmail.com> <4A521772.9080201@bmjgroup.com> <4A5218E1.40100@gmail.com> Message-ID: <4A521BB8.10303@bmjgroup.com> Rob S uttered: > Alex Hooper wrote: > >> 5 VCL_call c recv >> 5 VCL_return c pass >> 5 VCL_call c pass >> 5 VCL_return c pass >> 5 VCL_call c error >> 5 VCL_return c deliver > > > It looks like you're using "pass", rather than "fetch", which probably > isn't desirable when you're just doing a simple GET request.
I'd expect > to see something like: > > 7 VCL_call c recv > 7 VCL_return c lookup > 7 VCL_call c hash > 7 VCL_return c hash > 7 VCL_call c miss > 7 VCL_return c fetch > > Can you send your VCL file, so that I can take a look at the logic? > Hi, Having not got past the testing stage, my VCL file comprises only the lines I originally posted; the rest is all default. Here it is again for convenience: backend test01 { .host = "83.231.175.37"; .port = "80"; .connect_timeout = 10s; .first_byte_timeout = 15s; .between_bytes_timeout = 2s; } sub vcl_recv { set req.backend = test01; } Cheers, Alex.
From ross at trademe.co.nz Mon Jul 6 21:16:07 2009 From: ross at trademe.co.nz (Ross Brown) Date: Tue, 7 Jul 2009 09:16:07 +1200 Subject: Segfault in libvarnishcompat.so.1.0.0, after upgrading to build 4131 Message-ID: <1FF67D7369ED1A45832180C7C1109BCA0EA5AC8BA7@tmmail0.trademe.local> After upgrading to trunk (build 4131) last week, we are seeing an issue when the object cache (using malloc) becomes full. We are running a server with 16GB of RAM with the following startup options:

-s malloc,12G -a 0.0.0.0:80 -T 0.0.0.0:8021 -f /usr/local/etc/current.vcl -t 86400 -h classic,42013 -P /var/run/varnish.pid -p obj_workspace=4096 -p sess_workspace=262144 -p lru_interval=60 -p sess_timeout=10 -p shm_workspace=32768 -p ping_interval=1 -p thread_pools=4 -p thread_pool_min=50 -p thread_pool_max=4000 -p cli_timeout=20

VCL is pretty basic, we normalise and only accept GET and HEAD requests. Plotting usage using Cacti, we see varnishd crash and restart when the object cache is full. Example of an error occurring :

Jul 3 11:04:50 tmcache2 kernel: [68325.150385] varnishd[15155]: segfault at ff ip 00007f1df03a4d06 sp 00007f1dd44b6120 error 4 in libvarnishcompat.so.1.0.0[7f1df039e000+e000]
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, killing it.
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, killing it.
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (15130) died signal=11
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child cleanup complete
Jul 3 11:04:52 tmcache2 varnishd[2594]: child (5066) Started
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Closed fds: 3 4 5 8 9 11 12
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Child starts
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Ready

This bug only occurs in build 4131, prior to this we were using build 4019 and didn't have this issue.
Ross Brown Trade Me Limited From kb+varnish at slide.com Mon Jul 6 23:12:40 2009 From: kb+varnish at slide.com (Ken Brownfield) Date: Mon, 6 Jul 2009 16:12:40 -0700 Subject: Inline C and memory allocation In-Reply-To: References: Message-ID: <3B689787-A7EB-4485-B013-8D9B5391BC6E@slide.com> Isn't VRT_SetHdr() what you're looking for? Mind its semantics, though. -- Ken. On Jul 6, 2009, at 7:26 AM, Laurence Rowe wrote: > Hi, > > Thought my C is rather rusty by now, I'd like to make the mod_auth_tkt > [1] signed cookie authentication / authorisation system work with > Varnish. The idea would be to encode the acceptable authorisation > tokens for a page into it's response header then check the tokens in > the user's auth_tkt cookie against the tokens in the cached header > during vcl_deliver. > > I can find examples online that read data from headers using > VRT_GetHdr, but in order to implement/port mod_auth_tkt I will need to > decode the data in the cookie and write the decoded contents to new > headers. With apache, I would use apr_psprintf or similar to allocate > memory from the pool. What would be the equivalent in Varnish? > > Laurence > > [1] http://www.openfusion.com.au/labs/mod_auth_tkt/ > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc
From kb+varnish at slide.com Tue Jul 7 00:35:17 2009 From: kb+varnish at slide.com (Ken Brownfield) Date: Mon, 6 Jul 2009 17:35:17 -0700 Subject: Thread memory allocation question In-Reply-To: <874otxntwz.fsf@qurzaw.linpro.no> References: <5039.1245446512@critter.freebsd.dk> <874otxntwz.fsf@qurzaw.linpro.no> Message-ID: Overcommit defaults off; sane use cases for overcommit are few and far between, IMHO. With overcommit on, the performance implications might be more of a wash... but then you have two problems. Even though the stack remains mostly unused, it would still have to be swapped out under memory pressure, and thread creation and reclamation would cause more swap thrash. Used or not, the performance pain is the same. Plus I'd rather not allocate/waste 8GB of RAM for 1,000 varnish threads at idle, which represents two orders of magnitude more than it needs. A 1MB limit holds up fine for me, though 256KB or even 128KB should be fine as well, since Varnish tends to sit at around 86KB under Linux/ x86_64. Maybe Varnish could have its own stacksize parameter, rather than using the ulimit value?
Out-of-box scalability would be much better, and this is how MySQL handles it, FWIW. -- Ken. On Jun 30, 2009, at 5:11 PM, Tollef Fog Heen wrote: > > ]] "Poul-Henning Kamp" > > | In message <5C056AE2-7207-42F8-9E4B-0F541DC4B1B2 at slide.com>, Ken > Brownfield wri > | tes: > | > | >Would a stack overflow take out the whole child, or just that > thread? > | > | The kernel would try to extend the stack and provided you are not on > | a 32 bit system, it shouldn't ever have a problem with that. > > On the other hand, the gain from decreasing the stack size would > just be > a bit less book-keeping for the kernel, unless you have overcommit > turned off (which I don't think anybody actually uses), right? > > -- > Tollef Fog Heen > Redpill Linpro -- Changing the game! > t: +47 21 54 41 73 > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc From me at mgoldman.com Tue Jul 7 00:42:08 2009 From: me at mgoldman.com (Martin Goldman) Date: Mon, 6 Jul 2009 20:42:08 -0400 Subject: "ExpBan: nnn was banned" In-Reply-To: <87zlbpmf8g.fsf@qurzaw.linpro.no> References: <24a219a50906191906v59c6588bkeac4747d7547db85@mail.gmail.com> <87zlbpmf8g.fsf@qurzaw.linpro.no> Message-ID: <24a219a50907061742qea1316cwec98e36f0164c31b@mail.gmail.com> Yes - Eventually I kind of figured out/assumed that's what was going on. Initially I couldn't figure out what "banned" meant. Thanks, Martin On Tue, Jun 30, 2009 at 8:14 PM, Tollef Fog Heen wrote: > ]] Martin Goldman > > | I got a complaint from a user that some percentage of his page views seem > | too slow to be coming from the cache. I brought up our home page and > started > | refreshing it over and over. Sure enough, while most of the page views > are > | in fact getting cached, once every 5 or 6 times, I get a cache miss. 
> When > | this happens, varnishlog shows something like this: > | > | 32 VCL_call c recv lookup > | 32 VCL_call c hash hash > | * 32 ExpBan c 1607291667 was banned > | * 32 VCL_call c miss fetch > > Do you add purges every now and then? This looks like the result of a > purge forcing an object to be removed, which then gives you a cache > miss. > > -- > Tollef Fog Heen > Redpill Linpro -- Changing the game! > t: +47 21 54 41 73 > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc > From l at lrowe.co.uk Tue Jul 7 09:56:13 2009 From: l at lrowe.co.uk (Laurence Rowe) Date: Tue, 7 Jul 2009 11:56:13 +0200 Subject: Inline C and memory allocation In-Reply-To: <3B689787-A7EB-4485-B013-8D9B5391BC6E@slide.com> References: <3B689787-A7EB-4485-B013-8D9B5391BC6E@slide.com> Message-ID: I'm not certain if I need to manage the memory of the string that I set the header to. It looks like VRT_SetHdr copies the string into its own memory-managed space, though. That just leaves me with the task of allocating enough memory to perform base64 decoding, md5 calculation etc. I guess I can just allocate a large buffer on the stack and test that I won't overrun it. 2009/7/7 Ken Brownfield : > Isn't VRT_SetHdr() what you're looking for? Mind its semantics, though. > -- > Ken. > > On Jul 6, 2009, at 7:26 AM, Laurence Rowe wrote: > >> Hi, >> >> Thought my C is rather rusty by now, I'd like to make the mod_auth_tkt >> [1] signed cookie authentication / authorisation system work with >> Varnish. The idea would be to encode the acceptable authorisation >> tokens for a page into it's response header then check the tokens in >> the user's auth_tkt cookie against the tokens in the cached header >> during vcl_deliver.
>> >> I can find examples online that read data from headers using >> VRT_GetHdr, but in order to implement/port mod_auth_tkt I will need to >> decode the data in the cookie and write the decoded contents to new >> headers. With apache, I would use apr_psprintf or similar to allocate >> memory from the pool. What would be the equivalent in Varnish? >> >> Laurence >> >> [1] http://www.openfusion.com.au/labs/mod_auth_tkt/ >> _______________________________________________ >> varnish-misc mailing list >> varnish-misc at projects.linpro.no >> http://projects.linpro.no/mailman/listinfo/varnish-misc > > From ahooper at bmjgroup.com Tue Jul 7 11:35:45 2009 From: ahooper at bmjgroup.com (Alex Hooper) Date: Tue, 7 Jul 2009 12:35:45 +0100 Subject: tcp reset problem with varnish 2.0.4 o n Solaris 10 (SPARC) In-Reply-To: <67B625C3-2259-4483-AC9B-BB222B7BD5F1@slide.com> References: <4A51ED93.3020805@bmjgroup.com> <67B625C3-2259-4483-AC9B-BB222B7BD5F1@slide.com> Message-ID: <4A533311.8060604@bmjgroup.com> Ken Brownfield uttered: > Your tcpdump seems to imply that there's an immediate timeout on the > connection. Do the timeouts (and other settings) emitted from > "varnishadm -T :6082 param.show" all have sane values? 
> As far as I can tell...:

accept_fd_holdoff 50 [ms]
acceptor default (ports, poll)
auto_restart on [bool]
backend_http11 on [bool]
between_bytes_timeout 60.000000 [s]
cache_vbe_conns off [bool]
cc_command "cc -Kpic -G -o %o %s"
cli_buffer 8192 [bytes]
cli_timeout 5 [seconds]
client_http11 off [bool]
clock_skew 10 [s]
connect_timeout 5.000000 [s]
default_grace 10
default_ttl 120 [seconds]
diag_bitmap 0x0 [bitmap]
err_ttl 0 [seconds]
esi_syntax 0 [bitmap]
fetch_chunksize 128 [kilobytes]
first_byte_timeout 60.000000 [s]
group nobody (60001)
listen_address :80
listen_depth 1024 [connections]
log_hashstring off [bool]
log_local_address off [bool]
lru_interval 2 [seconds]
max_esi_includes 5 [includes]
max_restarts 4 [restarts]
obj_workspace 8192 [bytes]
overflow_max 100 [%]
ping_interval 3 [seconds]
pipe_timeout 60 [seconds]
prefer_ipv6 off [bool]
purge_dups off [bool]
purge_hash on [bool]
rush_exponent 3 [requests per request]
send_timeout 600 [seconds]
sess_timeout 5 [seconds]
sess_workspace 16384 [bytes]
session_linger 0 [ms]
shm_reclen 255 [bytes]
shm_workspace 8192 [bytes]
srcaddr_hash 1049 [buckets]
srcaddr_ttl 30 [seconds]
thread_pool_add_delay 20 [milliseconds]
thread_pool_add_threshold 2 [requests]
thread_pool_fail_delay 200 [milliseconds]
thread_pool_max 500 [threads]
thread_pool_min 5 [threads]
thread_pool_purge_delay 1000 [milliseconds]
thread_pool_timeout 300 [seconds]
thread_pools 2 [pools]
user nobody (60001)
vcl_trace off [bool]

I did note this in config.log; not sure if it implies the kind of problem I'm seeing:

configure:19586: checking whether SO_SNDTIMEO works
configure:19629: gcc -std=gnu99 -o conftest -g -O2 conftest.c >&5
Undefined first referenced
 symbol in file
socket /var/tmp//ccKa38Tg.o
setsockopt /var/tmp//ccKa38Tg.o
ld: fatal: Symbol referencing errors. No output written to conftest
...
configure:19673: WARNING: connection timeouts will not work

> You might also experiment by adding a probe to your backend, to see if
> the probes pass.
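[Editorial sketch, not part of the original message.] A backend with a health probe, as suggested in the quoted text, might look roughly like the fragment below. The host, port and timeouts are the ones from the test config posted in this thread; the probe URL, timeout, interval, window and threshold values are illustrative assumptions, not settings anyone in the thread reported using.

```vcl
backend test01 {
    .host = "83.231.175.37";
    .port = "80";
    .connect_timeout = 10s;
    .first_byte_timeout = 15s;
    .between_bytes_timeout = 2s;
    .probe = {
        .url = "/";        # assumed probe URL
        .timeout = 1s;     # assumed: give up on a probe after one second
        .interval = 5s;    # assumed: probe every five seconds
        .window = 5;       # assumed: consider the last five probes
        .threshold = 3;    # assumed: healthy if at least three succeeded
    }
}
```

A probe goes through the same backend connect path as a normal fetch, so it is a cheap way to reproduce a connection problem without sending client traffic.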
> They show as sick, with the same reset symptom in tcpdump. > Otherwise, I guess I'd suspect an issue with compilation. They say > Varnish only supports gcc (which might not be strictly true, but I'll > bet it's not tested under the Sun compiler). What was the isfinite() > issue? The same as that reported at http://varnish.projects.linpro.no/ticket/464. Actually, having re-tried, I can get around the isfinite() issue with the provided patch, but am caught by the second issue with NAN. I get: gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I../.. -I../../include -DVARNISH_STATE_DIR='"/local/var/varnish"' -g -O2 -MT varnishd-cache_center.o -MD -MP -MF .deps/varnishd-cache_center.Tpo -c -o varnishd-cache_center.o `test -f 'cache_center.c' || echo './'`cache_center.c cache_center.c: In function `cnt_done': cache_center.c:234: error: incompatible types in assignment cache_center.c:241: error: incompatible types in assignment *** Error code 1 It's starting to look as though there may be a few too many issues with Varnish on Solaris currently. I'll keep trying to resolve, but time issues may force me to fall back to squid for the moment. Cheers, Alex.
From solskogen at carebears.mine.nu Wed Jul 8 05:41:33 2009 From: solskogen at carebears.mine.nu (Christer Solskogen) Date: Wed, 08 Jul 2009 07:41:33 +0200 Subject: Notice: locking SHMFILE in core failed: Resource temporarily, unavailable Message-ID: I'm getting this notice whenever I (re)start varnish on FreeBSD 7.2 (amd64). Varnish runs and works as I would expect, so is it harmless? shine# grep varnishd /etc/rc.conf varnishd_enable="YES" varnishd_listen=":80" varnishd_config="/usr/local/etc/varnish/shine.vcl" varnishd_storage="file,/data/div/varnish,60G" Notice: locking SHMFILE in core failed: Resource temporarily unavailable shine# /usr/local/etc/rc.d/varnishd restart Stopping varnishd. Starting varnishd. storage_file: filename: /data/div/varnish size 61440 MB. Using old SHMFILE Notice: locking SHMFILE in core failed: Resource temporarily unavailable shine# pkg_info | grep varnish varnish-2.0.4 The Varnish high-performance HTTP accelerator -- chs From des at des.no Wed Jul 8 10:24:07 2009 From: des at des.no (=?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?=) Date: Wed, 08 Jul 2009 12:24:07 +0200 Subject: Notice: locking SHMFILE in core failed: Resource temporarily, unavailable In-Reply-To: (Christer Solskogen's message of "Wed, 08 Jul 2009 07:41:33 +0200") References: Message-ID: <8663e3bhh4.fsf@ds4.des.no> Christer Solskogen writes: > I'm getting this notice whenever I (re)start varnish on FreeBSD 7.2 > (amd64). Varnish runs and works as I would expect, so is it harmless? Yes, it's harmless (though it may have an impact on performance if you're running a log consumer).
The cause is the same as for the GPG
"insecure memory" warning:

http://maycontaintracesofbolts.blogspot.com/2009/06/gpg-insecure-memory.html

Somebody[tm] should document this Somewhere[tm].

DES
--
Dag-Erling Smørgrav - des at des.no

From solskogen at carebears.mine.nu Wed Jul 8 11:21:31 2009
From: solskogen at carebears.mine.nu (Christer Solskogen)
Date: Wed, 08 Jul 2009 13:21:31 +0200
Subject: Notice: locking SHMFILE in core failed: Resource temporarily, unavailable
In-Reply-To: <8663e3bhh4.fsf@ds4.des.no>
References: <8663e3bhh4.fsf@ds4.des.no>
Message-ID:

On 7/8/09 12:24 PM, Dag-Erling Smørgrav wrote:
> Christer Solskogen writes:
>> I'm getting this notice whenever I (re)start varnish on FreeBSD 7.2
>> (amd64). Varnish runs and works as I would expect, so is it harmless?
>
> Yes, it's harmless (though it may have an impact on performance if
> you're running a log consumer). The cause is the same as for the GPG
> "insecure memory" warning:
>
> http://maycontaintracesofbolts.blogspot.com/2009/06/gpg-insecure-memory.html
>

Thanks. Setting vm.max_wired to 700000 is no problem on a machine with
8GB of memory, I guess. (But I also have to admit I don't really know
what vm.max_wired really means.)

> Somebody[tm] should document this Somewhere[tm].
>

I think you just did :) Neither Google nor searching the mailing list
came up with anything. I also notice that a certain Someone(tm) is the
maintainer of the varnish port on FreeBSD. Maybe that Someone(tm) could
add a note about it in pkg-message?

--
chs

From des at des.no Wed Jul 8 12:42:56 2009
From: des at des.no (Dag-Erling Smørgrav)
Date: Wed, 08 Jul 2009 14:42:56 +0200
Subject: Notice: locking SHMFILE in core failed: Resource temporarily, unavailable
In-Reply-To: (Christer Solskogen's message of "Wed, 08 Jul 2009 13:21:31 +0200")
References: <8663e3bhh4.fsf@ds4.des.no>
Message-ID: <86y6qz5orz.fsf@ds4.des.no>

Christer Solskogen writes:
> Thanks.
Setting vm.max_wired to 700000 is no problem on a machine with
> 8GB of memory I guess. (But I also have to admit I don't really know what
> vm.max_wired really means)

It sets the maximum amount of memory that may be wired (i.e. marked as
unswappable). Unfortunately, this is a rather fuzzy concept in FreeBSD,
as it includes memory wired by userland applications using mlock(2), the
unified buffer cache, and (if I'm not mistaken) memory wired by device
drivers for DMA buffers and the like. There should be separate limits
for all of these, and per-process limits for the first. Patches are
welcome :)

DES
--
Dag-Erling Smørgrav - des at des.no

From balghi at gmail.com Sun Jul 12 03:28:29 2009
From: balghi at gmail.com (Babblu Ghimire)
Date: Sun, 12 Jul 2009 09:13:29 +0545
Subject: To force to use same cache directory at everytime I started varnish
Message-ID: <91720cac0907112028y60b8c54v9f3d22e75d0d3ca3@mail.gmail.com>

I am a learner and know very little about Varnish. Every time I start
Varnish on my RHEL machine, a different cache directory is shown, as
follows:

storage_file: filename: ./varnish.kEDdOW (unlinked) size 2047 MB.

I am curious: is there any way to force Varnish to use the same cache
directory every time I start it?

Regards
Balram

From davel at creative.ly Fri Jul 10 23:07:20 2009
From: davel at creative.ly (Dave Llopis)
Date: Fri, 10 Jul 2009 16:07:20 -0700
Subject: make check fails most tests on Ubuntu 8.10 : missing dependency?
Message-ID:

I'm trying to install either 2.0.4 or trunk on Ubuntu Intrepid, but
almost all the "make check" tests are failing, all (or at least most) of
which seem to fail like this:

Assert error in varnish_start(), vtc_varnish.c line 329:
  Condition(u == CLIS_OK) not true.

I'm running a base install of Intrepid, with just the few packages I
think are required for varnish: ncurses-term, automake, libtool, make,
and their dependencies.
The output of autogen.sh, configure, make, and make check is half a
megabyte, so I'm posting it elsewhere rather than including it here:

Any ideas? Thanks.

From lazy404 at gmail.com Fri Jul 10 22:21:38 2009
From: lazy404 at gmail.com (Lazy)
Date: Sat, 11 Jul 2009 00:21:38 +0200
Subject: varnish 2.0.4 backend errors
Message-ID:

Hi,

We are having a hard time figuring out what's causing Varnish 503
errors. Our backend is Apache (the Debian 5 default), the OS is Linux
x86_64 2.6.26, and everything is running on a single machine.

/usr/local/sbin/varnishd -a 0.0.0.0:80 -f /usr/local/etc/varnish/default.vcl -s malloc -T localhost:9999 -w 10,6000,300 -u nobody

We are running with a single backend, with
.connect_timeout = 1s;
added to the backend definition.

I added

sub vcl_error {
    if (req.restarts < 10) {
        restart;
    }
}

(is it possible to add a pause before doing the restart?)

which helps in some cases, but not always.

In about 0.1% of requests we get

   10 TxRequest    b POST
   10 TxURL        b /php
   10 TxProtocol   b HTTP/1.1
   10 TxHeader     b x-requested-with: XMLHttpRequest
   10 TxHeader     b Accept-Language: pl
   10 TxHeader     b Referer: http://www.xxxxx/php
   10 TxHeader     b Accept: text/html, */*
   10 TxHeader     b Content-Type: application/x-www-form-urlencoded
   10 TxHeader     b UA-CPU: x86
   10 TxHeader     b Accept-Encoding: gzip, deflate
   10 TxHeader     b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
   10 TxHeader     b Content-Length: 8
   10 TxHeader     b Cookie: _.1
   10 TxHeader     b X-NovINet: v1.2
   10 TxHeader     b Host: www.kinograj.cinema-city.pl
   10 TxHeader     b X-Varnish: 603437812
   10 TxHeader     b X-Forwarded-For: 79.162.xxx
   10 BackendClose b default
   31 VCL_call     c error
   31 VCL_return   c deliver
   31 Length       c 465
   31 VCL_call     c deliver
   31 VCL_return   c deliver
   31 TxProtocol   c HTTP/1.1
   31 TxStatus     c 503

The machine is not overloaded; there are 150 apache running, and 80% of
them are idle.

What does
   31 VCL_call     c error
mean: a connection error, or that Apache returned an invalid response?
Can I get some more information about this error using syslog in
vcl_error, or maybe in some other way?

my param.show

accept_fd_holdoff          50 [ms]
acceptor                   default (epoll, poll)
auto_restart               on [bool]
backend_http11             on [bool]
between_bytes_timeout      60.000000 [s]
cache_vbe_conns            off [bool]
cc_command                 "exec cc -fpic -shared -Wl,-x -o %o %s"
cli_buffer                 8192 [bytes]
cli_timeout                5 [seconds]
client_http11              off [bool]
clock_skew                 10 [s]
connect_timeout            1.000000 [s]
default_grace              10
default_ttl                120 [seconds]
diag_bitmap                0x0 [bitmap]
err_ttl                    0 [seconds]
esi_syntax                 0 [bitmap]
fetch_chunksize            128 [kilobytes]
first_byte_timeout         60.000000 [s]
group                      nogroup (65534)
listen_address             0.0.0.0:80
listen_depth               1024 [connections]
log_hashstring             off [bool]
log_local_address          off [bool]
lru_interval               2 [seconds]
max_esi_includes           5 [includes]
max_restarts               4 [restarts]
obj_workspace              8192 [bytes]
overflow_max               100 [%]
ping_interval              3 [seconds]
pipe_timeout               60 [seconds]
prefer_ipv6                off [bool]
purge_dups                 off [bool]
purge_hash                 on [bool]
rush_exponent              3 [requests per request]
send_timeout               600 [seconds]
sess_timeout               5 [seconds]
sess_workspace             16384 [bytes]
session_linger             0 [ms]
shm_reclen                 255 [bytes]
shm_workspace              8192 [bytes]
srcaddr_hash               1049 [buckets]
srcaddr_ttl                30 [seconds]
thread_pool_add_delay      20 [milliseconds]
thread_pool_add_threshold  2 [requests]
thread_pool_fail_delay     200 [milliseconds]
thread_pool_max            6000 [threads]
thread_pool_min            10 [threads]
thread_pool_purge_delay    1000 [milliseconds]
thread_pool_timeout        300 [seconds]
thread_pools               2 [pools]
user                       nobody (65534)
vcl_trace                  off [bool]

stats

     4571680  Client connections accepted
    13074671  Client requests received
     9246516  Cache hits
        8084  Cache hits for pass
      159743  Cache misses
     3768909  Backend connections success
           0  Backend connections not attempted
           0  Backend connections too many
       60064  Backend connections failures
this is old and it's not changing now

     2610440  Backend connections reuses
     3471493  Backend connections recycles
           0  Backend connections unused
        1053  N struct srcaddr
           6  N active struct srcaddr
        1017  N struct sess_mem
          60  N struct sess
         852  N struct object
        4183  N struct objecthead
           0  N struct smf
           0  N small free smf
           0  N large free smf
          22  N struct vbe_conn
         494  N struct bereq
          20  N worker threads
        4152  N worker threads created
           0  N worker threads not created
           0  N worker threads limited
           0  N queued work requests
      226847  N overflowed work requests
           0  N dropped work requests
           2  N backends
      159680  N expired objects
           0  N LRU nuked objects
           0  N LRU saved objects
     4008654  N LRU moved objects
           0  N objects on deathrow
           0  HTTP header overflows
           0  Objects sent with sendfile
     5482793  Objects sent with write
           0  Objects overflowing workspace
     4564281  Total Sessions
    13075788  Total Requests
          25  Total pipe
     3669512  Total pass
     3766737  Total fetch
  3167362228  Total header bytes
129249988603  Total body bytes
      446282  Session Closed
       22017  Session Pipeline
       78623  Session Read Ahead
           0  Session Linger
    12628704  Session herd
   650794078  SHM records
    44487143  SHM writes
         627  SHM flushes due to overflow
       81635  SHM MTX contention
         283  SHM cycles through buffer
           0  allocator requests
           0  outstanding allocations
           0  bytes allocated
           0  bytes free
     7558668  SMA allocator requests
         122  SMA outstanding allocations
     1781820  SMA outstanding bytes
198293321655  SMA bytes allocated
198291539835  SMA bytes free
       62427  SMS allocator requests
           0  SMS outstanding allocations
18446744073709546036  SMS outstanding bytes
    29026230  SMS bytes allocated
    29028555  SMS bytes freed
     3766947  Backend requests made
           4  N vcl total
           4  N vcl available
           0  N vcl discarded
           1  N total active purges
           1  N new purges added
           0  N old purges deleted
           0  N objects tested
           0  N regexps tested against
           0  N duplicate purges removed
           0  HCB Lookups without lock
           0  HCB Lookups with lock
           0  HCB Inserts
           0  Objects ESI parsed (unlock)
           0  ESI parse errors (unlock)

I would be grateful if anyone gave me some pointers about where to look
next.
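
[Editorial aside, not part of the original post.] A quick sanity check on the counters in the message above, as a minimal Python sketch. The figures are copied from the stats dump; the variable names are made up for illustration. Reading the very large "SMS outstanding bytes" value as a small negative number printed as an unsigned 64-bit integer is an assumption about how that counter is stored, not something the thread confirms.

```python
# Counters copied from the stats output in the message above.
cache_hits = 9246516
cache_misses = 159743
sms_outstanding_bytes = 18446744073709546036

# Hit ratio among requests that resulted in a cache hit or a cache miss
# (passes and pipes are not counted here).
hit_ratio = cache_hits / (cache_hits + cache_misses)

# Assumption: the counter is an unsigned 64-bit value that went below
# zero and wrapped around; subtracting 2**64 recovers the signed value.
as_signed = sms_outstanding_bytes - 2**64

print(f"cache hit ratio: {hit_ratio:.4f}")
print(f"SMS outstanding bytes read as signed: {as_signed}")
```

If that reading is right, the odd-looking counter is a bookkeeping artifact rather than a sign of real memory use, and the hit ratio suggests the cache itself is working well for cacheable requests.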
From brian.pan at light-mc.com Wed Jul 8 20:42:12 2009
From: brian.pan at light-mc.com (Brian Pan)
Date: Wed, 8 Jul 2009 15:42:12 -0500
Subject: Purging Cache
Message-ID: <6FF2E7BD5BC7CF48A9455770E7970D1E46E055A6C4@AUSP01VMBX05.collaborationhost.net>

Hi all,

I have a newbie question regarding 'purging the cache.' When I run the
following command to purge the root domain of my website,

varnishadm -T localhost:80 purge.url ^/$

I get this error:

An error occured in receiving status.

Based on examples, I've setup my VCL as follows:

//********
acl purge {
    "localhost";
    "my local ip address";
}

sub vcl_recv {
    # Allow wildcard purging
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        purge_url(req.url);
    }
    ...
//***********

Any help would be greatly appreciated.

Thanks,
Brian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.pan at light-mc.com Thu Jul 9 19:17:35 2009
From: brian.pan at light-mc.com (Brian Pan)
Date: Thu, 9 Jul 2009 14:17:35 -0500
Subject: Question about Cache Purge
Message-ID: <6FF2E7BD5BC7CF48A9455770E7970D1E46E0602BAC@AUSP01VMBX05.collaborationhost.net>

Hi all,

I have a newbie question regarding 'purging the cache.' When I run the
following command to purge the root domain of my website,

varnishadm -T localhost:80 purge.url ^/$

I get this error:

An error occured in receiving status.

Based on examples, I've setup my VCL as follows:

//********
acl purge {
    "localhost";
    "my local ip address";
}

sub vcl_recv {
    # Allow wildcard purging
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        purge_url(req.url);
    }
    ...
//***********

Any help would be greatly appreciated.

Thanks,
Brian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lazy404 at gmail.com Mon Jul 13 21:31:16 2009
From: lazy404 at gmail.com (Lazy)
Date: Mon, 13 Jul 2009 23:31:16 +0200
Subject: Fwd: varnish 2.0.4 backend errors
In-Reply-To:
References: <49A8EBD9-6971-4152-AF3E-A8925DE82490@slide.com>
Message-ID:

2009/7/13 Ken Brownfield :
> I would try correlating these 503's with actual Apache log lines. That way,
> you'll see what Apache told Varnish (if it made it to Apache at all).

There is no sign of these requests in the Apache logs.

>
> Also, try tcpdumping the backend port to a file, and then use wireshark to
> go back to the traffic at the time you see another 503.

It happens with or without keep-alive (it was set before and I disabled
it recently).

It's likely the same issue as in the "tcp reset problem with varnish
2.0.4 o n Solaris 10 (SPARC)" thread.

It happens on 0.09% of requests, mostly POSTs but not only.

Now I rechecked the logs and found that Varnish closed the connection
just after sending the request.

tcpdump follows; there are no backend connection errors.

01:03:50.181412 IP x.x.x.x.48563 > x.x.x.x.88: S 3999574413:3999574413(0) win 32792
        0x0000:  4500 003c 6586 4000 4006 3b56 566f f680  E..
 x.x.x.x.48563: S 4006066350:4006066350(0) ack 3999574414 win 32768
        0x0000:  4500 003c 0000 4000 4006 a0dc 566f f680  E..<..@.@...Vo..
        0x0010:  566f f680 0058 bdb3 eec7 b8ae ee64 a98e  Vo...X.......d..
        0x0020:  a012 8000 2776 0000 0204 400c 0402 080a  ....'v....@.....
        0x0030:  1b5a cc0b 1b5a cc0b 0103 0307            .Z...Z......
01:03:50.181433 IP x.x.x.x.48563 > x.x.x.x.88: . ack 1 win 257
        0x0000:  4500 0034 6587 4000 4006 3b5d 566f f680  E..4e.@.@.;]Vo..
        0x0010:  566f f680 bdb3 0058 ee64 a98e eec7 b8af  Vo.....X.d......
        0x0020:  8010 0101 0f9a 0000 0101 080a 1b5a cc0b  .............Z..
        0x0030:  1b5a cc0b                                .Z..
01:03:54.581224 IP x.x.x.x.88 > x.x.x.x.48563: S 4006066350:4006066350(0) ack 3999574414 win 32768
        0x0000:  4500 003c 0000 4000 4006 a0dc 566f f680  E..<..@.@...Vo..
        0x0010:  566f f680 0058 bdb3 eec7 b8ae ee64 a98e  Vo...X.......d..
        0x0020:  a012 8000 232a 0000 0204 400c 0402 080a  ....#*....@.....
        0x0030:  1b5a d057 1b5a cc0b 0103 0307            .Z.W.Z......
01:03:54.581245 IP x.x.x.x.48563 > x.x.x.x.88: . ack 1 win 257
        0x0000:  4500 0040 6588 4000 4006 3b50 566f f680  E..@e.@.@.;PVo..
        0x0010:  566f f680 bdb3 0058 ee64 a98e eec7 b8af  Vo.....X.d......
        0x0020:  b010 0101 81fc 0000 0101 080a 1b5a d057  .............Z.W
        0x0030:  1b5a d057 0101 050a eec7 b8ae eec7 b8af  .Z.W............
01:03:55.181239 IP x.x.x.x.48563 > x.x.x.x.88: P 1:896(895) ack 1 win 257
        0x0000:  4500 03b3 6589 4000 4006 37dc 566f f680  E...e.@.@.7.Vo..
        0x0010:  566f f680 bdb3 0058 ee64 a98e eec7 b8af  Vo.....X.d......
        0x0020:  8018 0101 9d85 0000 0101 080a 1b5a d0ed  .............Z..
        0x0030:  1b5a d057 504f 5354 202f 7175 697a 2f71  .Z.WPOST./quiz/q
        0x0040:  7565 7374 696f 6e20 4854 5450 2f31 2e31  uestion.HTTP/1.1
        0x0050:  0d0a 782d 7265 7175 6573 7465 642d 7769  ..x-requested-wi
        0x0060:  7468 3a20 584d 4c48 7474 7052 6571 7565  th:.XMLHttpReque
...................... request data
01:03:55.181256 IP x.x.x.x.88 > x.x.x.x.48563: . ack 896 win 270
        0x0000:  4500 0034 6842 4000 4006 38a2 566f f680  E..4hB@.@.8.Vo..
        0x0010:  566f f680 0058 bdb3 eec7 b8af ee64 ad0d  Vo...X.......d..
        0x0020:  8010 010e 024a 0000 0101 080a 1b5a d0ed  .....J.......Z..
        0x0030:  1b5a d0ed                                .Z..
........... acked
01:03:55.181302 IP x.x.x.x.48563 > x.x.x.x.88: F 896:896(0) ack 1 win 257
        0x0000:  4500 0034 658a 4000 4006 3b5a 566f f680  E..4e.@.@.;ZVo..
        0x0010:  566f f680 bdb3 0058 ee64 ad0d eec7 b8af  Vo.....X.d......
        0x0020:  8011 0101 0256 0000 0101 080a 1b5a d0ed  .....V.......Z..
        0x0030:  1b5a d0ed                                .Z..
........... connection closed ???
01:03:55.185068 IP x.x.x.x.88 > x.x.x.x.48563: P 1:2485(2484) ack 897 win 270
        0x0000:  4500 09e8 6843 4000 4006 2eed 566f f680  E...hC@.@...Vo..
        0x0010:  566f f680 0058 bdb3 eec7 b8af ee64 ad0e  Vo...X.......d..
        0x0020:  8018 010e a3ba 0000 0101 080a 1b5a d0ed  .............Z..
        0x0030:  1b5a d0ed 4854 5450 2f31 2e31 2032 3030  .Z..HTTP/1.1.200
        0x0040:  204f 4b0d 0a44 6174 653a 2046 7269 2c20  .OK..Date:.Fri,.
        0x0050:  3130 204a 756c 2032 3030 3920 3233 3a30  10.Jul.2009.23:0
        0x0060:  333a 3535 2047 4d54 0d0a 5365 7276 6572  3:55.GMT..Server
        0x0070:  3a20 4170 6163 6865 0d0a 4578 7069 7265  :.Apache..Expire
        0x0080:  733a 2054 6875 2c20 3139 204e 6f76 2031  s:.Thu,.19.Nov.1
        0x0090:  3938 3120 3038 3a35 323a 3030 2047 4d54  981.08:52:00.GMT
        0x00a0:  0d0a 4361 6368 652d 436f 6e74 726f 6c3a  ..Cache-Control:
        0x00b0:  206e 6f2d 7374 6f72 652c 206e 6f2d 6361  .no-store,.no-ca
        0x00c0:  6368 652c 206d 7573 742d 7265 7661 6c69  che,.must-revali
        0x00d0:  6461 7465 2c20 706f 7374 2d63 6865 636b  date,.post-check
        0x00e0:  3d30 2c20 7072 652d 6368 6563 6b3d 300d  =0,.pre-check=0.
        0x00f0:  0a50 7261 676d 613a 206e 6f2d 6361 6368  .Pragma:.no-cach
        0x0100:  650d 0a43 6f6e 7465 6e74 2d4c 656e 6774  e..Content-Lengt
        0x0110:  683a 2032 3230 390d 0a43 6f6e 6e65 6374  h:.2209..Connect
.......... reset
01:03:55.185091 IP x.x.x.x.48563 > x.x.x.x.88: R 3999575310:3999575310(0) win 0
        0x0000:  4500 0028 0000 4000 4006 a0f0 566f f680  E..(..@.@...Vo..
        0x0010:  566f f680 bdb3 0058 ee64 ad0e 0000 0000  Vo.....X.d......
        0x0020:  5004 0000 bc81 0000                      P.......

>
> Varnish should be logging failed health checks, which could explain periodic
> 503s. If the health checks are succeeding, then Apache is likely
> periodically closing a connection or sending bad data.
>
> You could also try turning off HTTP 1.1 to Apache from Varnish
> (backend_http11=off) to see if that does anything. With Varnish fronting
> Apache, doing 1.1 to the backend isn't necessarily a huge win.
>
> In the 503 case specifically, I'm not sure if restarts are respected. Does
> this happen if you relax the connect_timeout, or if you remove the restart?

I tried to fix the issue by adding a restart. It works as expected,
except that there is no automatic max restarts limit (it would be nice
if there was some, just like in vcl_retr).

> Does this happen only with POST? Do you have any custom settings in
> vcl_pipe(), or do you bail out of vcl_recv() without falling through to the
> default VCL?

No, vcl_recv() was empty before; now it only has a restart (not
triggered by the closing connection issue).

>
> Finally, did you edit the Cookie header? It's malformed, which in theory
> could cause either Apache or Varnish to dump it.

No.

From lazy404 at gmail.com Mon Jul 13 21:51:26 2009
From: lazy404 at gmail.com (Lazy)
Date: Mon, 13 Jul 2009 23:51:26 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To: <49A8EBD9-6971-4152-AF3E-A8925DE82490@slide.com>
References: <49A8EBD9-6971-4152-AF3E-A8925DE82490@slide.com>
Message-ID:

One more thing: I set the processor affinity for all Varnish threads so
that they run on a single CPU, to check whether it's some race
condition. So far I have failed to trigger any errors using ab; I will
see tomorrow when some real traffic shows up.
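
[Editorial aside, not part of the original post.] The timing in the packet capture earlier in this thread can be checked with a few lines of arithmetic. This is a minimal Python sketch: the timestamps are copied from the tcpdump output, the variable names are made up, and treating the second SYN+ACK as a retransmission is an interpretation of the capture (it repeats the same sequence number), not something the capture states.

```python
from datetime import datetime

# Timestamps copied from the tcpdump output (time of day only).
FMT = "%H:%M:%S.%f"
syn_sent = datetime.strptime("01:03:50.181412", FMT)
first_ack = datetime.strptime("01:03:50.181433", FMT)
second_synack = datetime.strptime("01:03:54.581224", FMT)
request_sent = datetime.strptime("01:03:55.181239", FMT)

# Quiet period between the client's first ACK and the second SYN+ACK.
stall = (second_synack - first_ack).total_seconds()

# Time from the initial SYN until the POST request is written.
syn_to_request = (request_sent - syn_sent).total_seconds()

print(f"quiet period before second SYN+ACK: {stall:.3f} s")
print(f"SYN to request written:             {syn_to_request:.3f} s")
```

Both gaps are measured in seconds on a connection that never leaves the machine, which is worth comparing against the timeout values in the param.show output.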

2009/7/13 Ken Brownfield :
> I would try correlating these 503's with actual Apache log lines. That way,
> you'll see what Apache told Varnish (if it made it to Apache at all).
>
> Also, try tcpdumping the backend port to a file, and then use wireshark to
> go back to the traffic at the time you see another 503.
>
> Varnish should be logging failed health checks, which could explain periodic
> 503s. If the health checks are succeeding, then Apache is likely
> periodically closing a connection or sending bad data.
>
> You could also try turning off HTTP 1.1 to Apache from Varnish
> (backend_http11=off) to see if that does anything. With Varnish fronting
> Apache, doing 1.1 to the backend isn't necessarily a huge win.
>
> In the 503 case specifically, I'm not sure if restarts are respected. Does
> this happen if you relax the connect_timeout, or if you remove the restart?
> Does this happen only with POST? Do you have any custom settings in
> vcl_pipe(), or do you bail out of vcl_recv() without falling through to the
> default VCL?
>
> Finally, did you edit the Cookie header? It's malformed, which in theory
> could cause either Apache or Varnish to dump it.
> --
> Ken.

From lazy404 at gmail.com Tue Jul 14 00:16:41 2009
From: lazy404 at gmail.com (Lazy)
Date: Tue, 14 Jul 2009 02:16:41 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To:
References: <49A8EBD9-6971-4152-AF3E-A8925DE82490@slide.com> <63D2ADCE-BE4A-40C4-A269-E0193A1350B0@slide.com>
Message-ID:

2009/7/14 Ken Brownfield :
> The progression from your dump is:
>
> Varnish         Apache
> SYN
>                 SYN+ACK
> ACK
> ...4.4 seconds later...
>                 SYN+ACK
> ACK
> PSH+ACK
>                 ACK
> FIN+ACK (???)
>                 PSH+ACK
> RST

right i missed the 4second gap, so varnish may be hitting a timeout

>
> It looks like the ACK from the Varnish side is getting lost on its way to
> Apache, and Apache retransmits. This itself would imply that you have some
> packet loss between Varnish and Apache. That 4.4 second delay could easily
> be running you against the sess_timeout (or cli_timeout, maybe). Bumping
> those up to 10 might clear the issue for you.
>
> Obviously, packet loss will cause issues in general, so I'd investigate that
> anyway.

It's a local interface, not lo, but the traffic is going through the
loopback interface.

>
> More bizarrely, that FIN+ACK is missing a FIN from Apache. Are you sure
> this stream is correct? I have no idea what would cause a spurious FIN+ACK
> (besides a spurious FIN) and the issue would be highly unlikely a Varnish
> issue; more likely a kernel TCP stack issue.
>
> Is there anything between Varnish and Apache in your config, besides two
> machines on a shared switch? Proxies? Firewall rules? Switch ACLs? What
> OSes/versions are you running on each side?

It's only one machine.

From lazy404 at gmail.com Tue Jul 14 00:18:21 2009
From: lazy404 at gmail.com (Lazy)
Date: Tue, 14 Jul 2009 02:18:21 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To:
References: <49A8EBD9-6971-4152-AF3E-A8925DE82490@slide.com>
Message-ID:

2009/7/13 Lazy :
> One more thing
> I set the processor affinity for all varnish threads so it will run on
> a single cpu, to check if it's not some race condition, so far I
> failed to trigger any errors by using ab, I will see tomorrow when
> some real traffic will show up.

It didn't make any difference.

From kristian at redpill-linpro.com Tue Jul 14 07:13:02 2009
From: kristian at redpill-linpro.com (Kristian Lyngstol)
Date: Tue, 14 Jul 2009 09:13:02 +0200
Subject: Question about Cache Purge
In-Reply-To: <6FF2E7BD5BC7CF48A9455770E7970D1E46E0602BAC@AUSP01VMBX05.collaborationhost.net>
References: <6FF2E7BD5BC7CF48A9455770E7970D1E46E0602BAC@AUSP01VMBX05.collaborationhost.net>
Message-ID: <20090714071302.GA8258@kjeks.linpro.no>

On Thu, Jul 09, 2009 at 02:17:35PM -0500, Brian Pan wrote:
> Hi all,
>
> I have a newbie question regarding 'purging the cache.' When I run the following command to purge the root domain of my website,
>
> varnishadm -T localhost:80 purge.url ^/$

Are you sure the admin interface is at port 80? That sounds very
unlikely. How do you start varnishd?

Varnish can listen to _two_ ports: one is typically port 80, where
normal HTTP data is served; the other is an administrative interface,
which is what you want to specify with -T.

--
Kristian Lyngstøl
Redpill Linpro AS
Tlf: +47 21544179
Mob: +47 99014497
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL:

From kristian at redpill-linpro.com Tue Jul 14 07:30:12 2009
From: kristian at redpill-linpro.com (Kristian Lyngstol)
Date: Tue, 14 Jul 2009 09:30:12 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To:
References:
Message-ID: <20090714073011.GB8258@kjeks.linpro.no>

On Sat, Jul 11, 2009 at 12:21:38AM +0200, Lazy wrote:
> We are having hard time figuring out what's causing varnish 503 error,
> our backend is apache is debian 5 default, os is linux x86_64 2.6.26,
> everything is running on a single machine
>
> /usr/local/sbin/varnishd -a 0.0.0.0:80 -f
> /usr/local/etc/varnish/default.vcl -s malloc -T localhost:9999 -w
> 10,6000,300 -u nobody

6000 threads is too much. Since it's per pool, it'll cause up to 12 000
threads to start. That's not likely to go over all that well. If you
have that sort of traffic, you need to scale out. Also, 10 thread
minimum is pretty low.

I typically recommend setting the minimum thread count to what you
expect your normal traffic to be at peak hours. It's probably a
dedicated machine, and idle threads have barely any overhead, while
creating new threads can take some time.

> running with a single backend
> .connect_timeout = 1s; added to the backend definition

Any particular reason for adding that?

> I added
>
> sub vcl_error {
>     if (req.restarts < 10) {
>         restart;
>     }
> }
>
> (is it possible to add a pause before doing restart ?)

No. This is also a dirty workaround for a fundamental problem.
> In about 0.1% of request we get
>
>    10 TxRequest    b POST
>    10 TxURL        b /php
>    10 TxProtocol   b HTTP/1.1
>    10 TxHeader     b x-requested-with: XMLHttpRequest
>    10 TxHeader     b Accept-Language: pl
>    10 TxHeader     b Referer: http://www.xxxxx/php
>    10 TxHeader     b Accept: text/html, */*
>    10 TxHeader     b Content-Type: application/x-www-form-urlencoded
>    10 TxHeader     b UA-CPU: x86
>    10 TxHeader     b Accept-Encoding: gzip, deflate
>    10 TxHeader     b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0;
> Windows NT 5.1)
>    10 TxHeader     b Content-Length: 8
>    10 TxHeader     b Cookie: _.1
>    10 TxHeader     b X-NovINet: v1.2
>    10 TxHeader     b Host: www.kinograj.cinema-city.pl
>    10 TxHeader     b X-Varnish: 603437812
>    10 TxHeader     b X-Forwarded-For: 79.162.xxx
>    10 BackendClose b default
>    31 VCL_call     c error
>    31 VCL_return   c deliver
>    31 Length       c 465
>    31 VCL_call     c deliver
>    31 VCL_return   c deliver
>    31 TxProtocol   c HTTP/1.1
>    31 TxStatus     c 503
>
> machine is not overloaded, there are 150 apache running 80% of them is idle
>
> what does
> 31 VCL_call     c error mean , a connection error, apache returned
> invalid response ?

No, it just means that vcl_error is called. BackendClose notes that the
connection to the backend was closed.

> can I get some more information about this error using some syslog in
> vcl_error or maybe in some other way ?

Possibly, but using syslog in vcl is the last thing I'd recommend.

Does your syslog say anything meaningful? Like assert-errors...

(...)
>        60064  Backend connections failures
> this is old and it's not changing now

Did the error-rate go down once you solved this? What was causing these
problems?

>           20  N worker threads
>         4152  N worker threads created
>            0  N worker threads not created
>            0  N worker threads limited
>            0  N queued work requests
>       226847  N overflowed work requests

This is what I mean with -w 10,6000 being wrong.
After the initial startup, overflowed work requests shouldn't grow much,
and you're currently running at only 20 threads (the minimum), which will
cause overflows very fast (consider how many connections a single client
will use to fetch a front page... You can easily imagine overflowing with
just 3-4 concurrent clients.)

But that's not really causing any 503s. Just delays while threads are
created (and removed).

--
Kristian Lyngstøl
Redpill Linpro AS
Tlf: +47 21544179
Mob: +47 99014497

From lazy404 at gmail.com Tue Jul 14 09:46:58 2009
From: lazy404 at gmail.com (Lazy)
Date: Tue, 14 Jul 2009 11:46:58 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To: <20090714073011.GB8258@kjeks.linpro.no>
References: <20090714073011.GB8258@kjeks.linpro.no>
Message-ID:

2009/7/14 Kristian Lyngstol :
> On Sat, Jul 11, 2009 at 12:21:38AM +0200, Lazy wrote:
>> We are having hard time figuring out what's causing varnish 503 error,
>> our backend is apache is debian 5 default, os is linux x86_64 2.6.26,
>> everything is running on a single machine
>>
>> /usr/local/sbin/varnishd -a 0.0.0.0:80 -f
>> /usr/local/etc/varnish/default.vcl -s malloc -T localhost:9999 -w
>> 10,6000,300 -u nobody
>
> 6000 threads is too much. Since it's per pool, it'll cause up to 12 000
> threads to start. That's not likely to go over all that well. If you have
> that sort of traffic, you need to scale out. Also, 10 thread minimum is
> pretty low.
>
> I typically recommend setting the minimum thread count to what you expect
> your normal traffic to be at peak hours. It's probably a dedicated
> machine, and idle threads have barely any overhead, while creating new
> threads can take some time.

at first i had 3000 threads set and varnish occasionally dropped
connections, so I doubled it

so what would be a recommended values ?
will -w 1024,1024 -p thread_pools=6 be ok ?

the site is usually not so busy, but it has sometimes spikes of static
traffic (about 50Mbps) that's why i upped the thread limit, 3000 was
too low

is it safe to change thread_pools on runtime ?

>
>> running with a single backend
>> .connect_timeout = 1s; added to the backend definition
>
> Any particular reason for adding that?

originally it wasn't there i added it trying to go around the issue

>
>> I added
>>
>> sub vcl_error {
>>     if (req.restarts < 10) {
>>         restart;
>>     }
>> }
>>
>> (is it possible to add a pause before doing restart ?)
>
> No. This is also a dirty workaround for a fundamental problem.
>
>> In about 0.1% of request we get
>>
>>    10 TxRequest    b POST
>>    10 TxURL        b /php
>>    10 TxProtocol   b HTTP/1.1
>>    10 TxHeader     b x-requested-with: XMLHttpRequest
>>    10 TxHeader     b Accept-Language: pl
>>    10 TxHeader     b Referer: http://www.xxxxx/php
>>    10 TxHeader     b Accept: text/html, */*
>>    10 TxHeader     b Content-Type: application/x-www-form-urlencoded
>>    10 TxHeader     b UA-CPU: x86
>>    10 TxHeader     b Accept-Encoding: gzip, deflate
>>    10 TxHeader     b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0;
>> Windows NT 5.1)
>>    10 TxHeader     b Content-Length: 8
>>    10 TxHeader     b Cookie: _.1
>>    10 TxHeader     b X-NovINet: v1.2
>>    10 TxHeader     b X-Varnish: 603437812
>>    10 TxHeader     b X-Forwarded-For: 79.162.xxx
>>    10 BackendClose b default
>>    31 VCL_call     c error
>>    31 VCL_return   c deliver
>>    31 Length       c 465
>>    31 VCL_call     c deliver
>>    31 VCL_return   c deliver
>>    31 TxProtocol   c HTTP/1.1
>>    31 TxStatus     c 503
>>
>> machine is not overloaded, there are 150 apache running 80% of them is idle
>>
>> what does
>> 31 VCL_call     c error mean , a connection error, apache returned
>> invalid response ?
>
> No, it just means that vcl_error is called.
> BackendClose notes that the
> connection to the backend was closed.
>
>> can I get some more information about this error using some syslog in
>> vcl_error or maybe in some other way ?
>
> Possibly, but using syslog in vcl is the last thing I'd recommend.
>
> Does your syslog say anything meaningful? Like assert-errors...

no, only info about admin commands

> (...)
>>        60064  Backend connections failures
>> this is old and it's not changing now
>
> Did the error-rate go down once you solved this? What was causing these
> problems?

it was related to load testing, in production it went away when i
upped maxclients on apache

>
>>           20  N worker threads
>>         4152  N worker threads created
>>            0  N worker threads not created
>>            0  N worker threads limited
>>            0  N queued work requests
>>       226847  N overflowed work requests
>
> This is what I mean with -w 10,6000 being wrong. After the initial startup,
> overflowed work requests shouldn't grow much, and you're currently running
> at only 20 threads (the minimum), which will cause overflows very fast
> (consider how many connections a single client will use to fetch a front
> page... You can easily imagine overflowing with just 3-4 concurrent
> clients.)
>
> But that's not really causing any 503s. Just delays while threads are
> created (and removed).

tcpdump of another 503 (apache is running on port 88),

11:09:50.187842 IP x.x.x.x.50780 > x.x.x.x.88: S 88526893:88526893(0) win 32792
11:09:50.187851 IP x.x.x.x.88 > x.x.x.x.50780: S 81484078:81484078(0) ack 88526894 win 32768
11:09:50.187867 IP x.x.x.x.50780 > x.x.x.x.88: . ack 1 win 257
11:09:53.187730 IP x.x.x.x.88 > x.x.x.x.50780: S 81484078:81484078(0) ack 88526894 win 32768
11:09:53.187740 IP x.x.x.x.50780 > x.x.x.x.88: . ack 1 win 257
11:09:59.191730 IP x.x.x.x.88 > x.x.x.x.50780: S 81484078:81484078(0) ack 88526894 win 32768
11:09:59.191744 IP x.x.x.x.50780 > x.x.x.x.88: .
ack 1 win 257
11:10:05.187748 IP x.x.x.x.50780 > x.x.x.x.88: P 1:918(917) ack 1 win 257
11:10:05.187766 IP x.x.x.x.88 > x.x.x.x.50780: . ack 918 win 271
11:10:05.187799 IP x.x.x.x.50780 > x.x.x.x.88: F 918:918(0) ack 1 win 257
11:10:05.190887 IP x.x.x.x.88 > x.x.x.x.50780: P 1:2968(2967) ack 919 win 271
11:10:05.190909 IP x.x.x.x.50780 > x.x.x.x.88: R 88527812:88527812(0) win 0

x.x.x.x is a local address bound to eth0

Thank You for your help.

--
Michal Grzedzicki

From kristian at redpill-linpro.com Tue Jul 14 10:05:00 2009
From: kristian at redpill-linpro.com (Kristian Lyngstol)
Date: Tue, 14 Jul 2009 12:05:00 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To:
References: <20090714073011.GB8258@kjeks.linpro.no>
Message-ID: <20090714100459.GC8258@kjeks.linpro.no>

On Tue, Jul 14, 2009 at 11:46:58AM +0200, Lazy wrote:
> 2009/7/14 Kristian Lyngstol :
> > 6000 threads is too much. Since it's per pool, it'll cause up to 12 000
> > threads to start. That's not likely to go over all that well. If you have
> > that sort of traffic, you need to scale out. Also, 10 thread minimum is
> > pretty low.
> >
> > I typically recommend setting the minimum thread count to what you expect
> > your normal traffic to be at peak hours. It's probably a dedicated
> > machine, and idle threads have barely any overhead, while creating new
> > threads can take some time.
>
> at first i had 3000 threads set and varnish occasionally dropped
> connections, so I doubled it
>
> so what would be a recommended values ?
> will -w 1024,1024 -p thread_pools=6 be ok ?

6 thread pools is overkill. And the number of threads is multiplied with
the thread pools, so in this case you're essentially writing -w 6k,6k...

I'd advise something like -w 200,1200 -p thread_pools=2 to begin with. Just
watch the overflows in varnishstat (it'll increase a good bit during
startup since it takes a little time to create the 400 threads). It should
stay fairly static after startup.
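
For anyone following the thread arithmetic, here is a minimal sketch of it in Python. The helper name total_threads is made up for illustration, and the only assumption is the one stated in this thread: the -w minimum and maximum apply per pool, so the totals are simply multiplied by thread_pools. The "2 pools" in the last case is inferred from the 12 000 and 20 thread figures quoted above, not from any documentation.

```python
def total_threads(per_pool_min, per_pool_max, thread_pools):
    """Return (total_min, total_max) worker threads across all pools.

    Assumes the -w values are per-pool limits, as described in this thread.
    """
    return per_pool_min * thread_pools, per_pool_max * thread_pools


# -w 200,1200 -p thread_pools=2 (the suggested starting point)
print(total_threads(200, 1200, 2))    # (400, 2400)
# -w 1024,1024 -p thread_pools=6 ("essentially -w 6k,6k")
print(total_threads(1024, 1024, 6))   # (6144, 6144)
# -w 10,6000 with 2 pools (20 threads minimum, up to 12 000)
print(total_threads(10, 6000, 2))     # (20, 12000)
```

The first case gives the 400 startup threads mentioned above; the last one matches the "20 N worker threads" line in the varnishstat output.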
> the site is usually not so busy, but it has sometimes spikes of static
> traffic (about 50Mbps) that's why i upped the thread limit, 3000 was
> too low

I seriously doubt 3k was too low. More likely, the min threads was hurting
you. Three thousand threads is quite a bit. Remember that these are actual
requests being handled, not keep-alive connections and the like.

> is it safe to change thread_pools on runtime ?

Safe; I'd assume so. But I don't know if it actually takes effect. I've yet
to see any good reason to change it from the default.

--
Kristian Lyngstøl
Redpill Linpro AS
Tlf: +47 21544179
Mob: +47 99014497

From kb+varnish at slide.com Tue Jul 14 20:32:18 2009
From: kb+varnish at slide.com (Ken Brownfield)
Date: Tue, 14 Jul 2009 13:32:18 -0700
Subject: varnish 2.0.4 backend errors
In-Reply-To: <20090714100459.GC8258@kjeks.linpro.no>
References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no>
Message-ID: <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com>

On Jul 14, 2009, at 3:05 AM, Kristian Lyngstol wrote:
> On Tue, Jul 14, 2009 at 11:46:58AM +0200, Lazy wrote:
>> the site is usually not so busy, but it has sometimes spikes of
>> static
>> traffic (about 50Mbps) that's why i upped the thread limit, 3000 was
>> too low
>
> I seriously doubt 3k was too low. More likely, the min threads was
> hurting
> you. Three thousand threads is quite a bit. Remember that these are
> actual
> requests being handled, not keep-alive connections and the like.

I just wanted to humbly second this good advice; if you're familiar
with Apache, this is akin to making sure your MinSpareServers is set
to a high enough level to handle any transient spikes by avoiding the
cost of spawning new processes.  Varnish will handle 10x the traffic
you're seeing in <64 threads.
Anything you're seeing is more likely a concurrency spike causing a
temporary slowdown while threads are spawned, or Varnish is simply
passing on failures or slowness from your back-end.

I'm wondering if this could also be a large object that's taking a
while to cache and blocking other children for a while? (rush_exponent)
--
Ken.

From lazy404 at gmail.com Wed Jul 15 08:01:08 2009
From: lazy404 at gmail.com (Lazy)
Date: Wed, 15 Jul 2009 10:01:08 +0200
Subject: Fwd: varnish 2.0.4 backend errors
In-Reply-To:
References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no> <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com>
Message-ID:

2009/7/15 Lazy :
> 2009/7/14 Ken Brownfield :
>> On Jul 14, 2009, at 3:05 AM, Kristian Lyngstol wrote:
>>> On Tue, Jul 14, 2009 at 11:46:58AM +0200, Lazy wrote:
>>>> the site is usually not so busy, but it has sometimes spikes of
>>>> static
>>>> traffic (about 50Mbps) that's why i upped the thread limit, 3000 was
>>>> too low
>>>
>>> I seriously doubt 3k was too low. More likely, the min threads was
>>> hurting
>>> you. Three thousand threads is quite a bit. Remember that these are
>>> actual
>>> requests being handled, not keep-alive connections and the like.
>>
>> I just wanted to humbly second this good advice; if you're familiar
>> with Apache, this is akin to making sure your MinSpareServers is set
>> to a high enough level to handle any transient spikes by avoiding the
>> cost of spawning new processes.  Varnish will handle 10x the traffic
>> you're seeing in <64 threads.
>> Anything you're seeing is more likely a
>
> but if backend is slow, (there will be many POSTS from ajax app)
> threads will get used fast if php starts lagging behind,
> now there are 200 threads running, without raises in overflowed work
> requests, and still there are 503 errors,
> i will up min threads later and see if it helps
>
>> concurrency spike causing a temporary slowdown while threads are
>> spawned, or Varnish is simply passing on failures or slowness from your
>> back-end.
> i hope this is it, i will find out tomorrow

setting 500 as min threads didn't make any difference, funny thing is
that failed requests are logged in apache as successful

09:23:50.563934 IP x.x.x.x.51235 > x.x.x.x.88: S 1066103134:1066103134(0) win 32792
09:23:50.563940 IP x.x.x.x.88 > x.x.x.x.51235: S 1060616380:1060616380(0) ack 1066103135 win 32768
09:23:50.563946 IP x.x.x.x.51235 > x.x.x.x.88: . ack 1 win 257
09:23:54.163755 IP x.x.x.x.88 > x.x.x.x.51235: S 1060616380:1060616380(0) ack 1066103135 win 32768
09:23:54.163765 IP x.x.x.x.51235 > x.x.x.x.88: . ack 1 win 257
09:23:55.563738 IP x.x.x.x.51235 > x.x.x.x.88: P 1:886(885) ack 1 win 257
09:23:55.563756 IP x.x.x.x.88 > x.x.x.x.51235: . ack 886 win 270
09:23:55.563838 IP x.x.x.x.51235 > x.x.x.x.88: F 886:886(0) ack 1 win 257
09:23:55.567177 IP x.x.x.x.88 > x.x.x.x.51235: P 1:2882(2881) ack 887 win 270
09:23:55.567196 IP x.x.x.x.51235 > x.x.x.x.88: R 1066104021:1066104021(0) win 0

successful POST in apache logs,

in varnishlog

   14 Backend      c 15 default default
   15 TxRequest    b POST
   15 TxURL        b /quiz/question
   15 TxProtocol   b HTTP/1.1
   15 TxHeader     b x-requested-with: XMLHttpRequest
   15 TxHeader     b Accept-Language: pl
   15 TxHeader     b Accept: text/html, */*
   15 TxHeader     b Content-Type: application/x-www-form-urlencoded
   15 TxHeader     b UA-CPU: x86
   15 TxHeader     b Accept-Encoding: gzip, deflate
   15 TxHeader
b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)
   15 TxHeader     b Host:
   15 TxHeader     b Content-Length: 8
   15 TxHeader     b Cookie: xxx
   15 TxHeader     b X-Varnish: 782633863
   15 TxHeader     b X-Forwarded-For: xxx
   15 BackendClose b default
   14 VCL_call     c error
   14 VCL_return   c restart
   14 VCL_call     c recv
   14 VCL_return   c pass
   14 VCL_call     c error
   14 VCL_return   c restart
   14 VCL_call     c recv
...
   14 Length       c 465
   14 VCL_call     c deliver
   14 VCL_return   c deliver
   14 TxProtocol   c HTTP/1.1
   14 TxStatus     c 503
   14 TxResponse   c Service Unavailable
   14 TxHeader     c Server: Varnish
   14 TxHeader     c Retry-After: 0
   14 TxHeader     c Content-Type: text/html; charset=utf-8
   14 TxHeader     c Content-Length: 465
   14 TxHeader     c Date: Wed, 15 Jul 2009 07:23:55 GMT
   14 TxHeader     c X-Varnish: 782633863
   14 TxHeader     c Age: 20
   14 TxHeader     c Via: 1.1 varnish
   14 TxHeader     c Connection: close
   14 ReqEnd       c 782633863 1247642615.566063166 1247642635.564052582 0.006083488 19.997969866 0.000019550
   14 SessionClose c error

restart shouldn't make another backend connection if current one got closed ?
I can't find any subsequent retries in tcpdump

and sometimes restart works

--
Michal Grzedzicki

From lazy404 at gmail.com Wed Jul 15 09:38:27 2009
From: lazy404 at gmail.com (Lazy)
Date: Wed, 15 Jul 2009 11:38:27 +0200
Subject: How many simultanious users
Message-ID:

Hi,

I'm trying to figure out how many simultaneous users a single 8 core
machine with local apache running as a backend can handle assuming
that all the requests are cached.
testing with ab on a slow 100Mbps link shows 2500 hit/s, locally i got
12 000 hit/s with over 200Mbps traffic

assuming that each user loads 40 files in 1 minute we get
12000*60/40=18 000 users per minute

Is it possible to get half of that 18k users/per minute in real world
ignoring the amounts of traffic it will generate ?

For now it's only theoretical question, but we would like to estimate
how many machines will it take to handle this kind of load.

Another question how to scale varnish, I'm thinking about setting a 2
loadbalancers which will take care of sessions getting to the same
server, and 3x8 core machines for www + varnish or maybe 2x4 core
loadbalancers with varnish and 3x8 core machines for www. It would be
possible to use varnish as a loadbalancer with some http cookie
trickery.

I will be grateful to anyone willing to share his experience.

--
Michal Grzedzicki

From kristian at redpill-linpro.com Wed Jul 15 09:48:20 2009
From: kristian at redpill-linpro.com (Kristian Lyngstol)
Date: Wed, 15 Jul 2009 11:48:20 +0200
Subject: Fwd: varnish 2.0.4 backend errors
In-Reply-To:
References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no> <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com>
Message-ID: <20090715094820.GA9071@kjeks.getinternet.no>

On Wed, Jul 15, 2009 at 10:01:08AM +0200, Lazy wrote:
(...)
> setting 500 as min threads didn't make any difference, funny thing is
> that failed requests are logged in apache as successful

Sorry if I didn't make myself clear: The output from varnishstat didn't
indicate that threads was the issue causing 503s. It was more of a
tuning-comment than a shot at solving the real problem. Sorry about the
confusion.

> 09:23:50.563934 IP x.x.x.x.51235 > x.x.x.x.88: S
> 1066103134:1066103134(0) win 32792 552835403 0,nop,wscale 7>
(....)

Is it possible for you to mail me the raw cap-file? (Feel free to obfuscate
the IPs, but I'm really more of an ethereal-man than tcpdump-output...).
Feel free to drop it directly to me, if that makes it easier.

> successful POST in apache logs,
>
> in varnishlog
>
>    14 Backend      c 15 default default
(...)

The request must've started earlier, as this is just the backend request.
Do you have the entire transaction available?

(...)
> restart shouldn't make another backend connection if current one got closed ?

It should, but it should be noted that: 1. We don't know _why_ it fails. 2.
Restart in vcl_error is brand new, so there might still be some issues.

You could try to turn on vcl-trace. (-p vcl_trace=on).

Also, could you post your VCL in its entirety? (If you already did, I
must've missed it.)

--
Kristian Lyngstøl
Redpill Linpro AS
Tlf: +47 21544179
Mob: +47 99014497

From kristian at redpill-linpro.com Wed Jul 15 10:14:02 2009
From: kristian at redpill-linpro.com (Kristian Lyngstol)
Date: Wed, 15 Jul 2009 12:14:02 +0200
Subject: How many simultanious users
In-Reply-To:
References:
Message-ID: <20090715101402.GB9071@kjeks.getinternet.no>

On Wed, Jul 15, 2009 at 11:38:27AM +0200, Lazy wrote:
> I'm trying to figure out how many simultaneous users a single 8 core
> machine with local apache running as a backend can handle assuming
> that all the requests are cached.

This is actually very difficult to test, as you often end up with
client-side issues. Or issues that are irrelevant to a production
environment.

I've done extensive stress testing, and getting the load up on the
Varnish-server is not trivial. Our stress-testing rig is using a single
dual-core opteron _clocked down_ to 1ghz as Varnish-server, and roughly
8-12 cpu cores spread over 3-5 servers as 'clients'. And it's able to
handle 18-19k req/s consistently (these are 1-3byte pages of cache hits).

That should give you an idea of the synthetic performance Varnish can
offer.
> testing with ab on a slow 100Mbps link shows 2500 hit/s, locally i got
> 12 000 hit/s with over 200Mbps traffic

How big are the pages requested? Are they all hits? What's the load on the
client and server?

> assuming that each user loads 40 files in 1 minute we get
> 12000*60/40=18 000 users per minute
>
> Is it possible to get half of that 18k users/per minute in real world
> ignoring the amounts of traffic it will generate ?

I'd say so, but it depends on how big the data set is. If you can store it
in memory, varnish is ridiculously fast. I also wouldn't recommend relying
on a single Varnish for more than a few thousand requests per second. If
something goes wrong (suddenly getting lots of misses for instance), it
will quickly spread.

For comparison, I'm looking at a box with roughly 0.4 load serving
2000req/s as we speak, and that's on 2xDual Core Opteron 2212. Going by
those numbers, it should theoretically be able to handle almost ten times
as many requests if Varnish scaled as a straight line.

That'd give you roughly 18000 req/s at peak (give or take a little...) Now
you're talking about 8 cores, that should be 36k req/s. That's _not_
unrealistic, from what we've seen in synthetic tests. If each client
requires 40 items, that means roughly 900 clients _per second_. Or 54k in a
minute. This math is all rough estimates, but the foundation is production
sites and real traffic patterns.

The problem is that getting your Varnish to deal with 36k req/s is rather
difficult, and you quickly run into network issues and similar. And at 36k
req/s you can hardly take any amount of backend traffic or delays before it
all falls over.

> For now it's only theoretical question, but we would like to estimate
> how many machines will it take to handle this kind of load.
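
The back-of-envelope estimate above can be written out as a short Python sketch. The function names are hypothetical and the inputs are the rounded figures from this message (18000 req/s on 4 cores, 40 requests per client); the linear scaling with core count is the same simplifying assumption the estimate itself makes, not a measured property of Varnish.

```python
def scale_by_cores(req_per_sec, from_cores, to_cores):
    """Extrapolate throughput linearly with core count (a simplifying assumption)."""
    return req_per_sec * to_cores / from_cores


def clients_per_minute(req_per_sec, requests_per_client):
    """Clients served per minute if every client issues a fixed number of requests."""
    clients_per_sec = req_per_sec / requests_per_client
    return clients_per_sec * 60


# 2x dual-core box estimated at ~18000 req/s peak, scaled to 8 cores
peak_8_cores = scale_by_cores(18000, 4, 8)
print(peak_8_cores)                          # 36000.0
# 40 items per client
print(clients_per_minute(peak_8_cores, 40))  # 54000.0
# the original poster's own figure: 12000 hit/s measured locally with ab
print(clients_per_minute(12000, 40))         # 18000.0
```

These are ceilings for a 100% cache-hit workload; any backend traffic or network limit lowers them.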
>
> Another question how to scale varnish, I'm thinking about setting a 2
> loadbalancers which will take care of sessions getting to the same
> server, and 3x8 core machines for www + varnish or maybe 2x4 core
> loadbalancers with varnish and 3x8 core machines for www. It would be
> possible to use varnish as a loadbalancer with some http cookie
> trickery.

I wouldn't recommend using Varnish to implement sticky-sessions, even
though it might be possible. What I've seen people do, though, is put
apache with mod_proxy _behind_ varnish, and let that deal with
sticky-sessions, then varnish only has to know what to cache and what not
to cache. (And for varnish, there'll be only one backend).

--
Kristian Lyngstøl
Redpill Linpro AS
Tlf: +47 21544179
Mob: +47 99014497

From lazy404 at gmail.com Wed Jul 15 10:14:21 2009
From: lazy404 at gmail.com (Lazy)
Date: Wed, 15 Jul 2009 12:14:21 +0200
Subject: Fwd: varnish 2.0.4 backend errors
In-Reply-To: <20090715094820.GA9071@kjeks.getinternet.no>
References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no> <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com> <20090715094820.GA9071@kjeks.getinternet.no>
Message-ID:

2009/7/15 Kristian Lyngstol :
> On Wed, Jul 15, 2009 at 10:01:08AM +0200, Lazy wrote:
> (...)
>> setting 500 as min threads didn't make any difference, funny thing is
>> that failed requests are logged in apache as successful
>
> Sorry if I didn't make myself clear: The output from varnishstat didn't
> indicate that threads was the issue causing 503s. It was more of a
> tuning-comment than a shot at solving the real problem. Sorry about the
> confusion.
>
>> 09:23:50.563934 IP x.x.x.x.51235 > x.x.x.x.88: S
>> 1066103134:1066103134(0) win 32792 > 552835403 0,nop,wscale 7>
>
> (....)
>
> Is it possible for you to mail me the raw cap-file? (Feel free to obfuscate
> the IPs, but I'm really more of an ethereal-man than tcpdump-output...).
> Feel free to drop it directly to me, if that makes it easier.

ok, it will take a minute i have a 50MB file to chop

>
>> successful POST in apache logs,
>>
>> in varnishlog
>>
>>    14 Backend      c 15 default default
>
> (...)
>
> The request must've started earlier, as this is just the backend request.
> Do you have the entire transaction available?

yes

   14 VCL_call     c error
   14 VCL_return   c restart
   14 VCL_call     c recv
   14 VCL_return   c pass
   14 VCL_call     c pass
   14 VCL_return   c pass
   15 BackendOpen  b default xxx 51219 xxx 88
   14 Backend      c 15 default default
   15 TxRequest    b POST
   15 TxURL        b /quiz/question
   15 TxProtocol   b HTTP/1.1
   15 TxHeader     b x-requested-with: XMLHttpRequest
   15 TxHeader     b Accept-Language: pl
   15 TxHeader     b Accept: text/html, */*
   15 TxHeader     b Content-Type: application/x-www-form-urlencoded
   15 TxHeader     b UA-CPU: x86
   15 TxHeader     b Accept-Encoding: gzip, deflate
   15 TxHeader     b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)
   15 TxHeader     b X-Varnish: 782633863
   15 TxHeader     b X-Forwarded-For: 82.177.
   15 BackendClose b default
   14 VCL_call     c error
   14 VCL_return   c restart
   14 VCL_call     c recv
   14 VCL_return   c pass
   14 VCL_call     c pass
   14 VCL_return   c pass
   15 BackendOpen  b default xxx 51235 xxx 88
   14 Backend      c 15 default default

>
> (...)
>
>> restart shouldn't make another backend connection if current one got closed ?
>
> It should, but it should be noted that: 1. We don't know _why_ it fails. 2.
> Restart in vcl_error is brand new, so there might still be some issues.
>
> You could try to turn on vcl-trace. (-p vcl_trace=on).

right now i have something 100-200 req/s won't it kill the server and
I can't restart varnish now so I will have to try to enable it at runtime

>
> Also, could you post your VCL in its entirety? (If you already did, I
> must've missed it.)
backend default {
    .host = "xxxx";
    .port = "88";
}

sub vcl_recv {
    if (req.url ~ "\.(png|gif|jpg|swf|css|js|ico)$") {
        lookup;
    }
    if (req.url ~ "landing/") {
        lookup;
    }
    if (req.http.Cache-Control ~ "no-cache") {
        pass;
    }
}

sub vcl_fetch {
    if (req.url ~ "\.(png|gif|jpg|swf)$") {
        unset obj.http.set-cookie;
    }
    if (obj.http.Pragma ~ "no-cache" || obj.http.Cache-Control ~ "no-cache") {
        pass;
    }
}

sub vcl_error {
    if (req.restarts < 30) {
        restart;
    }
}

From kristian at redpill-linpro.com Wed Jul 15 10:31:53 2009
From: kristian at redpill-linpro.com (Kristian Lyngstol)
Date: Wed, 15 Jul 2009 12:31:53 +0200
Subject: Fwd: varnish 2.0.4 backend errors
In-Reply-To:
References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no> <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com> <20090715094820.GA9071@kjeks.getinternet.no>
Message-ID: <20090715103152.GC9071@kjeks.getinternet.no>

On Wed, Jul 15, 2009 at 12:14:21PM +0200, Lazy wrote:
> 2009/7/15 Kristian Lyngstol :
> > You could try to turn on vcl-trace. (-p vcl_trace=on).
>
> right now i have something 100-200 req/s won't it kill the server and
> I can't restart varnish now so I will have to try to enable it at runtime

Ok, I'm not entirely sure if that'll work, but it's worth a try (and it
shouldn't cause any damage anyway).

>
> >
> > Also, could you post your VCL in its entirety? (If you already did, I
> > must've missed it.)
>
> sub vcl_recv {
(...)
>     if (req.http.Cache-Control ~ "no-cache") {
>         pass;

You probably do not want this; It will mean that every time a _client_
sends no-cache, it's a pass. That'll be quite often (on refresh, for
instance), and the refreshed item won't get cached, so it won't refresh
the cache.

(I'll have a look at the cap when I find some time. Could take a few
hours).

--
Kristian Lyngstøl
Redpill Linpro AS
Tlf: +47 21544179
Mob: +47 99014497
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL:

From lazy404 at gmail.com Thu Jul 16 09:23:27 2009
From: lazy404 at gmail.com (Lazy)
Date: Thu, 16 Jul 2009 11:23:27 +0200
Subject: varnish 2.0.4 backend errors
In-Reply-To: <1CEFC7A6-31D9-4EC6-AE1C-47CF09A17A44@slide.com>
References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no> <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com> <1CEFC7A6-31D9-4EC6-AE1C-47CF09A17A44@slide.com>
Message-ID:

2009/7/16 Ken Brownfield :
> Yeah, this one might be one for the Varnish devs.  Only last thing I can
> think of is that Apache is returning a response that's larger than
> Content-Length: and Varnish is just closing the connection...

apache didn't set Content-Length on these requests

Yesterday i switched the machine to squid, and errors came back as
httpAppendBody: Request not yet fully sent "POST in squid cache log

Finally I realized that there are many tcp checksum errors in the dump,
I turned off tx and rx checksum offloading and now the checksums are ok,
there was so far no error in 25 minutes (before it was happening in about
15 minute intervals). Maybe the fact that local backend is bound to
eth0's ip address made a difference. I will investigate later. Maybe it
will be worth mentioning in the wiki if it's proven.

After all it wasn't a varnish error at all.

Thank You all for your time and expertise.

--
Michal Grzedzicki

From lazy404 at gmail.com Thu Jul 16 10:28:38 2009
From: lazy404 at gmail.com (Lazy)
Date: Thu, 16 Jul 2009 12:28:38 +0200
Subject: How many simultanious users
In-Reply-To:
References: <20090715101402.GB9071@kjeks.getinternet.no>
Message-ID:

>>> assuming that each user loads 40 files in 1 minute we get
>>> 12000*60/40=18 000 users per minute
>>>
>>> Is it possible to get half of that 18k users/per minute in real world
>>> ignoring the amounts of traffic it will generate ?
>>
>> I'd say so, but it depends on how big the data set is.
>> If you can store it
>> in memory, varnish is ridiculously fast. I also wouldn't recommend relying
> i think it will fit in to ram, it will be a single site
>
>> on a single Varnish for more than a few thousand requests per second. If
>> something goes wrong (suddenly getting lots of misses for instance), it
>> will quickly spread.
>>
>> For comparison, I'm looking at a box with roughly 0.4 load serving
>> 2000req/s as we speak, and that's on 2xDual Core Opteron 2212. Going by
>> those numbers, it should theoretically be able to handle almost ten times
>> as many requests if Varnish scaled as a straight line.
>>
>> That'd give you roughly 18000 req/s at peak (give or take a little...) Now
>> you're talking about 8 cores, that should be 36k req/s. That's _not_
>> unrealistic, from what we've seen in synthetic tests. If each client
>> requires 40 items, that means roughly 900 clients _per second_. Or 54k in a
>> minute. This math is all rough estimates, but the foundation is production
>> sites and real traffic patterns.
>>
>> The problem is that getting your Varnish to deal with 36k req/s is rather
>> difficult, and you quickly run into network issues and similar. And at 36k
>> req/s you can hardly take any amount of backend traffic or delays before it
>> all falls over.

today i did some ad-hoc tests with 45 byte body,

when I enable keep-alive I'm getting 39k req/s with 100 concurrent gets
and over 40k with 300 concurrent connections (max cpu load was under 2
cores for varnish)

without keep alive i'm stuck with 12k req/s, that might be end of ab's
performance in making new connections or kernel,
i tried the performance tips from the wiki, but it didn't make a
significant difference in this test,

Later i will try to use a benchmark running on another machine.
12k req/s is more then enough for me already so I'm happy with that -- Michal Grzedzicki From lazy404 at gmail.com Thu Jul 16 12:01:11 2009 From: lazy404 at gmail.com (Lazy) Date: Thu, 16 Jul 2009 14:01:11 +0200 Subject: varnish 2.0.4 backend errors In-Reply-To: References: <20090714073011.GB8258@kjeks.linpro.no> <20090714100459.GC8258@kjeks.linpro.no> <302D9BDA-4BC2-41FA-A27F-A17CAFB27C48@slide.com> <1CEFC7A6-31D9-4EC6-AE1C-47CF09A17A44@slide.com> Message-ID: 2009/7/16 Lazy : > 2009/7/16 Ken Brownfield : >> Yeah, this one might be one for the Varnish devs. ?Only last thing I can >> think of is that Apache is returning a response that's larger than >> Content-Length: and Varnish is just closing the connection... > apache didn't set Content-Length on thiese requests > > Yesterday i switched the machine to squid, and errors came back as > httpAppendBody: Request not yet fully sent "POST in squid cache log > > Finally I realized that there are many tcp checksum errors in the dump, > I turned off tx and rx checksum offloading and now the checksums are ok, > there was so far no error in 25 minutes (before it was happening in about > 15 minute intervals). Maybe the fact that local backend is bound to eth0's ip > address made a difference. I will investigate later. Maybe it will be > worth mentioning > in the wiki if it's proven. > > After all it wasn't a varnish error at all. > > Thank You all for your time and expertise. Arghh said too soon, traffic went up and there are still thiese errors, but no so many From n.leutner at all2e.com Thu Jul 16 19:31:01 2009 From: n.leutner at all2e.com (Norman Leutner) Date: Thu, 16 Jul 2009 15:31:01 -0400 Subject: Varnish munin plugin trouble Message-ID: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> Hi, I'm new to this mailing list. As a German partner of eZ Systems AS we're using varnish in combination with eZ Publish with great success. 
Currently I'm trying to get munin running with the plugin from the trunk http://varnish.projects.linpro.no/browser/trunk/varnish-tools/munin Now I stuck cause the munin-node doesn't deliver data... Trying manually using munin-run I'm getting the expected results. # munin-run varnish_hit_rate client_req.value 30714 cache_miss.value 9349 cache_hitpass.value 97 cache_hit.value 20207 Using the munin telnet interface I get no data... # telnet localhost 4949 Trying 127.0.0.1... Connected to localhost.localdomain. Escape character is '^]'. # munin node at xxxx.com fetch varnish_hit_rate . Any hints ? Thanks in advance Norman Leutner all2e GmbH Enterprise Content Management http://www.all2e.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From kristian at redpill-linpro.com Fri Jul 17 05:43:45 2009 From: kristian at redpill-linpro.com (Kristian Lyngstol) Date: Fri, 17 Jul 2009 07:43:45 +0200 Subject: Varnish munin plugin trouble In-Reply-To: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> Message-ID: <20090717054344.GA5586@kjeks.linpro.no> On Thu, Jul 16, 2009 at 03:31:01PM -0400, Norman Leutner wrote: > Trying manually using munin-run I'm getting the expected results. > > # munin-run varnish_hit_rate > client_req.value 30714 (...) > Using the munin telnet interface I get no data... > > # telnet localhost 4949 > Trying 127.0.0.1... > Connected to localhost.localdomain. > Escape character is '^]'. > # munin node at xxxx.com > fetch varnish_hit_rate > . > > Any hints ? This is most commonly caused by differences in run-time parameters, typically PATH or similar. Where is the varnishstat binary located on your system (this can be defined by setting the 'varnishstat' variable) ? -- Kristian Lyngst?l Redpill Linpro AS Tlf: +47 21544179 Mob: +47 99014497 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From n.leutner at all2e.com Fri Jul 17 09:04:20 2009 From: n.leutner at all2e.com (Norman Leutner) Date: Fri, 17 Jul 2009 11:04:20 +0200 Subject: AW: Varnish munin plugin trouble In-Reply-To: <20090717054344.GA5586@kjeks.linpro.no> References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> Message-ID: <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> Hi Kristian, Your're the one who wrote that plugin... Thanks for that great price of code. Thanks for the hint, it's been the PATH... [varnish_*] env.varnishstat /usr/local/bin/varnishstat works fine now. I found a small mistake in your plugin, when using multiple varnish installations on a server: 645 $arg .= " -l $varnishname"; should be 645 $arg .= " -n $varnishname"; -l # Lists the available fields to use Have you ever thought of writing a plugin which catches the most frequent requests passed to the apache server? Norman Leutner all2e GmbH Enterprise Content Management http://www.all2e.com From kristian at redpill-linpro.com Fri Jul 17 09:44:05 2009 From: kristian at redpill-linpro.com (Kristian Lyngstol) Date: Fri, 17 Jul 2009 11:44:05 +0200 Subject: Varnish munin plugin trouble In-Reply-To: <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> Message-ID: <20090717094405.GA27748@kjeks.linpro.no> On Fri, Jul 17, 2009 at 11:04:20AM +0200, Norman Leutner wrote: > Thanks for the hint, it's been the PATH... > > [varnish_*] > env.varnishstat /usr/local/bin/varnishstat > > works fine now. 
Glad to hear it :) > I found a small mistake in your plugin, when using multiple varnish > installations on a server: > > 645 $arg .= " -l $varnishname"; > should be > 645 $arg .= " -n $varnishname"; > > -l # Lists the available fields to use Fixed, nice catch. > Have you ever thought of writing a plugin which catches the most > frequent requests passed to the apache server? You can achieve this with varnishtop: varnishtop -i TxURL It'll list all backend requests and sort by which is most frequently requested. Fairly useful. This is realtime, though, so it'll have to keep running. If we want a script to gather this data over time, it'll essentially be the same as varnishtop but a daemon of sorts. So far, I've settled with varnishtop. -- Kristian Lyngstøl Redpill Linpro AS Tlf: +47 21544179 Mob: +47 99014497 From des at des.no Fri Jul 17 11:03:07 2009 From: des at des.no (Dag-Erling Smørgrav) Date: Fri, 17 Jul 2009 13:03:07 +0200 Subject: AW: Varnish munin plugin trouble In-Reply-To: <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> (Norman Leutner's message of "Fri, 17 Jul 2009 11:04:20 +0200") References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> Message-ID: <86ab33h8r8.fsf@ds4.des.no> Norman Leutner writes: > [to Kristian Lyngstol] > Your're the one who wrote that plugin... I was about to say "no, that was me" - then I found out that my code had been replaced. Kristian, what was wrong with the existing plugin? More to the point, what was so wrong about it that you didn't even bother to discuss the matter with me before you blew it away?
DES -- Dag-Erling Smørgrav - des at des.no From n.leutner at all2e.com Fri Jul 17 11:18:19 2009 From: n.leutner at all2e.com (Norman Leutner) Date: Fri, 17 Jul 2009 13:18:19 +0200 Subject: AW: AW: Varnish munin plugin trouble In-Reply-To: <86ab33h8r8.fsf@ds4.des.no> References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> <86ab33h8r8.fsf@ds4.des.no> Message-ID: <242AED17A878084BAA524EBD35F3CEFF1A9CE4C7@winxbede03.exchange.xchg> Hi DES, thanks for your reply on the developer list. The issue was quite easy to fix. It was too late yesterday ;) Kristian has created a more abstract plugin which includes all needed functions in one plugin. Norman -----Original Message----- From: Dag-Erling Smørgrav [mailto:des at des.no] Sent: Friday, 17 July 2009 13:03 To: Norman Leutner Cc: varnish-misc at projects.linpro.no Subject: Re: AW: Varnish munin plugin trouble Norman Leutner writes: > [to Kristian Lyngstol] > Your're the one who wrote that plugin... I was about to say "no, that was me" - then I found out that my code had been replaced. Kristian, what was wrong with the existing plugin? More to the point, what was so wrong about it that you didn't even bother to discuss the matter with me before you blew it away?
DES -- Dag-Erling Smørgrav - des at des.no From des at des.no Fri Jul 17 11:37:10 2009 From: des at des.no (Dag-Erling Smørgrav) Date: Fri, 17 Jul 2009 13:37:10 +0200 Subject: AW: AW: Varnish munin plugin trouble In-Reply-To: <242AED17A878084BAA524EBD35F3CEFF1A9CE4C7@winxbede03.exchange.xchg> (Norman Leutner's message of "Fri, 17 Jul 2009 13:18:19 +0200") References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> <86ab33h8r8.fsf@ds4.des.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE4C7@winxbede03.exchange.xchg> Message-ID: <8663drh76h.fsf@ds4.des.no> Norman Leutner writes: > Kristian has created a more abstract plugin which includes all needed > functions in one plugin. And how is that different from the one I wrote? DES -- Dag-Erling Smørgrav - des at des.no From kristian at redpill-linpro.com Fri Jul 17 12:39:03 2009 From: kristian at redpill-linpro.com (Kristian Lyngstol) Date: Fri, 17 Jul 2009 14:39:03 +0200 Subject: New munin plugin (Was: Varnish munin plugin trouble) In-Reply-To: <86ab33h8r8.fsf@ds4.des.no> References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> <86ab33h8r8.fsf@ds4.des.no> Message-ID: <20090717123903.GB27748@kjeks.linpro.no> On Fri, Jul 17, 2009 at 01:03:07PM +0200, Dag-Erling Smørgrav wrote: > Norman Leutner writes: > > [to Kristian Lyngstol] > > Your're the one who wrote that plugin... > > I was about to say "no, that was me" - then I found out that my code had > been replaced. > > Kristian, what was wrong with the existing plugin? More to the point, > what was so wrong about it that you didn't even bother to discuss the > matter with me before you blew it away?
The problem with the old plugin was that it didn't let the user see the relationship between relevant statistics. With the growing number of statistics available, it simply generated too many graphs, which made it difficult to see the bigger picture. The current implementation combines related graphs and still includes all information if the user wants it (i.e. 9 or so aspects aren't linked by default). The previous plugin wasn't horribly big and it would've been more work to re-factor it the way we wanted it. It simply didn't strike me that it would be an issue replacing it. Or that I had to discuss it with you. -- Kristian Lyngstøl Redpill Linpro AS Tlf: +47 21544179 Mob: +47 99014497 From des at des.no Fri Jul 17 19:06:31 2009 From: des at des.no (Dag-Erling Smørgrav) Date: Fri, 17 Jul 2009 21:06:31 +0200 Subject: New munin plugin In-Reply-To: <20090717123903.GB27748@kjeks.linpro.no> (Kristian Lyngstol's message of "Fri, 17 Jul 2009 14:39:03 +0200") References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> <86ab33h8r8.fsf@ds4.des.no> <20090717123903.GB27748@kjeks.linpro.no> Message-ID: <867hy7uo20.fsf@ds4.des.no> Kristian Lyngstol writes: > The previous plugin wasn't horribly big and it would've been more work to > re-factor it the way we wanted it. It simply didn't strike me that it would > be an issue replacing it. Or that I had to discuss it with you. That seems to be the MO these days at RL. "To hell with the other developers, we'll just do what we feel like; what do they know, anyway?" You should look up "respect" and "courtesy" in a good dictionary.
DES -- Dag-Erling Smørgrav - des at des.no From n.leutner at all2e.com Sat Jul 18 14:46:48 2009 From: n.leutner at all2e.com (Norman Leutner) Date: Sat, 18 Jul 2009 16:46:48 +0200 Subject: AW: New munin plugin In-Reply-To: <867hy7uo20.fsf@ds4.des.no> References: <242AED17A878084BAA524EBD35F3CEFF1A9CE434@winxbede03.exchange.xchg> <20090717054344.GA5586@kjeks.linpro.no> <242AED17A878084BAA524EBD35F3CEFF1A9CE48B@winxbede03.exchange.xchg> <86ab33h8r8.fsf@ds4.des.no> <20090717123903.GB27748@kjeks.linpro.no> <867hy7uo20.fsf@ds4.des.no> Message-ID: <242AED17A878084BAA524EBD35F3CEFF1A9CE54C@winxbede03.exchange.xchg> >Kristian Lyngstol writes: >> The previous plugin wasn't horribly big and it would've been more work to >> re-factor it the way we wanted it. It simply didn't strike me that it would >> be an issue replacing it. Or that I had to discuss it with you. >That seems to be the MO these days at RL. "To hell with the other >developers, we'll just do what we feel like; what do they know, anyway?" >You should look up "respect" and "courtesy" in a good dictionary. Hi DES, I got your point. But I think Kristian is doing a good job here, even though he replaced your plugin without agreement. He also responded directly to my question here regarding the plugin and varnish. I remember your comment: >Adding new plugins to Munin is a nightmare... It might be a good idea to let Kristian support that plugin. Greetings Norman From ryanchan404 at gmail.com Sun Jul 26 04:10:45 2009 From: ryanchan404 at gmail.com (Ryan Chan) Date: Sun, 26 Jul 2009 12:10:45 +0800 Subject: 100% Transparent Reverse Proxy Message-ID: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> Hello, I have several web sites running on Apache/PHP, and I want to install a Transparent Reverse Proxy (e.g. squid, varnish) to cache the static stuff (by looking at the Expires or Last-Modified response header). However, one of my requirements is that neither the client (browser) nor the server (Apache/PHP) is aware of the existence of that proxy.
E.g. Client will not see header such as via, age etc. Server will not see header such as X-Forwarded-For I want to ask: Is it possible to do the above stuffs using varnish? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at dynamine.net Sun Jul 26 04:46:42 2009 From: michael at dynamine.net (Michael S. Fischer) Date: Sat, 25 Jul 2009 21:46:42 -0700 Subject: 100% Transparent Reverse Proxy In-Reply-To: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> References: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> Message-ID: <61DBAB59-4508-4489-8B30-0AF4025CBCE9@dynamine.net> What's the purpose of these requirements? Just curious. --Michael On Jul 25, 2009, at 9:10 PM, Ryan Chan wrote: > > Hello, > > I have serveral web sites running on Apache/PHP, I want to install a > Transparent Reverse Proxy (e.g. squid, varnish) to cache the static > stuff. (By looking at expire or LM resposne header) > > However, one of my requirements is that neither client (browser) or > server (Apache/PHP) is aware of existences of that proxy. > > E.g. > > Client will not see header such as via, age etc. > Server will not see header such as X-Forwarded-For > > I want to ask: Is it possible to do the above stuffs using varnish? > > Thanks. > > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc From ryanchan404 at gmail.com Sun Jul 26 07:16:38 2009 From: ryanchan404 at gmail.com (Ryan Chan) Date: Sun, 26 Jul 2009 15:16:38 +0800 Subject: 100% Transparent Reverse Proxy In-Reply-To: <61DBAB59-4508-4489-8B30-0AF4025CBCE9@dynamine.net> References: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> <61DBAB59-4508-4489-8B30-0AF4025CBCE9@dynamine.net> Message-ID: <45d40ce30907260016u19e0148fs2929c6197138f4a1@mail.gmail.com> Hi, On Sun, Jul 26, 2009 at 12:46 PM, Michael S. 
Fischer wrote: > What's the purpose of these requirements? Just curious. > > --Michael > > To the client side, it is a security requirement. To the server side, sometimes PHP need to get remote IP, obviously proxy ip is meaningless to the program. (But I don't want to change the program tp handle X-Forwarded-For) -------------- next part -------------- An HTML attachment was scrubbed... URL: From xi at borderworlds.dk Sun Jul 26 11:10:51 2009 From: xi at borderworlds.dk (Christian Laursen) Date: Sun, 26 Jul 2009 13:10:51 +0200 Subject: 100% Transparent Reverse Proxy In-Reply-To: <45d40ce30907260016u19e0148fs2929c6197138f4a1@mail.gmail.com> References: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> <61DBAB59-4508-4489-8B30-0AF4025CBCE9@dynamine.net> <45d40ce30907260016u19e0148fs2929c6197138f4a1@mail.gmail.com> Message-ID: <4A6C39BB.8050808@borderworlds.dk> Ryan Chan wrote: > Hi, > > On Sun, Jul 26, 2009 at 12:46 PM, Michael S. Fischer > > wrote: > > What's the purpose of these requirements? Just curious. > > --Michael > > > To the client side, it is a security requirement. > To the server side, sometimes PHP need to get remote IP, obviously proxy > ip is meaningless to the program. (But I don't want to change the > program tp handle X-Forwarded-For) Then use an apache module like rpaf (http://stderr.net/apache/rpaf/) which will remove the need to modify you PHP application. -- Christian Laursen From justin at redwiredesign.com Mon Jul 27 11:35:41 2009 From: justin at redwiredesign.com (Justin Finkelstein) Date: Mon, 27 Jul 2009 12:35:41 +0100 Subject: 100% Transparent Reverse Proxy In-Reply-To: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> References: <45d40ce30907252110y6b7ff2eu1425fa203c5e72e1@mail.gmail.com> Message-ID: <1248694541.6798.233.camel@justin.studio-12.net> Hi Ryan We use varnish for proxying one of our larger websites and we had a similar problem with IP addressing when we first installed it. 
To get the correct IP address, you need to add the following varnish configuration: sub vcl_recv { remove req.http.X-Forwarded-For; } It looks a bit nonsensical but it worked for us! All the best, Justin On Sun, 2009-07-26 at 12:10 +0800, Ryan Chan wrote: > > Hello, > > I have serveral web sites running on Apache/PHP, I want to install a > Transparent Reverse Proxy (e.g. squid, varnish) to cache the static > stuff. (By looking at expire or LM resposne header) > > However, one of my requirements is that neither client (browser) or > server (Apache/PHP) is aware of existences of that proxy. > > E.g. > > Client will not see header such as via, age etc. > Server will not see header such as X-Forwarded-For > > I want to ask: Is it possible to do the above stuffs using varnish? > > Thanks. > > > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc -- Redwire Design Limited 54 Maltings Place 169 Tower Bridge Road London SE1 3LJ www.redwiredesign.com [ 020 7403 1444 ] - voice [ 020 7378 8711 ] - fax From rtshilston at gmail.com Mon Jul 27 17:03:50 2009 From: rtshilston at gmail.com (Rob S) Date: Mon, 27 Jul 2009 18:03:50 +0100 Subject: Memory spreading, then stop responding Message-ID: <4A6DDDF6.4030603@gmail.com> Hi, Here's my setup: [root at varnish1 ~]# rpm -qa |grep varnish varnish-libs-2.0.4-1.el5 varnish-2.0.4-1.el5 [root at varnish1 ~]# uname -a Linux varnish1.example.com 2.6.18-128.el5 #1 SMP Wed Jan 21 10:41:14 EST 2009 x86_64 x86_64 x86_64 GNU/Linux [root at varnish1 ~]# ps aux |grep varnishd root 27993 0.0 0.0 106472 816 ? Ss 17:42 0:00 /usr/sbin/varnishd -P /var/run/varnish.pid -a 10.1.2.51:80 -T :6082 -f /etc/varnish/default.vcl -u varnish -g varnish -s file,/var/lib/varnish/varnish_storage.bin,1G varnish 28063 0.9 1.0 1474728 62860 ?
Sl 17:43 0:06 /usr/sbin/varnishd -P /var/run/varnish.pid -a 10.1.2.51:80 -T :6082 -f /etc/varnish/default.vcl -u varnish -g varnish -s file,/var/lib/varnish/varnish_storage.bin,1G root 28799 0.0 0.0 61192 732 pts/3 S+ 17:56 0:00 grep varnishd The problem that I've encountered twice now is the following: 1) Varnish spreads to use over 8GB of swap, despite appearing to be configured to only use 1GB of storage 2) Our automated monitoring indicates that we're running out of swap space. 3) Restart varnish 4) From this point, varnishlog and varnishncsa return no output. Can anyone suggest why varnish is using more memory than it's allocated, and why varnishlog would stop returning any output? Varnishlog was writing to disk, so I can probably extract the end of that, if it's of use. Very grateful to anyone who can point me in the right direction. Rob From darryl.dixon at winterhouseconsulting.com Mon Jul 27 22:04:14 2009 From: darryl.dixon at winterhouseconsulting.com (Darryl Dixon - Winterhouse Consulting) Date: Tue, 28 Jul 2009 10:04:14 +1200 (NZST) Subject: Memory spreading, then stop responding In-Reply-To: <4A6DDDF6.4030603@gmail.com> References: <4A6DDDF6.4030603@gmail.com> Message-ID: <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> > Hi, > > Here's my setup: > > [root at varnish1 ~]# rpm -qa |grep varnish > varnish-libs-2.0.4-1.el5 > varnish-2.0.4-1.el5 > [root at varnish1 ~]# uname -a > Linux varnish1.example.com 2.6.18-128.el5 #1 SMP Wed Jan 21 10:41:14 EST > 2009 x86_64 x86_64 x86_64 GNU/Linux > [root at varnish1 ~]# ps aux |grep varnishd > root 27993 0.0 0.0 106472 816 ? Ss 17:42 0:00 > /usr/sbin/varnishd -P /var/run/varnish.pid -a 10.1.2.51:80 -T :6082 -f > /etc/varnish/default.vcl -u varnish -g varnish -s > file,/var/lib/varnish/varnish_storage.bin,1G > varnish 28063 0.9 1.0 1474728 62860 ? 
Sl 17:43 0:06 > /usr/sbin/varnishd -P /var/run/varnish.pid -a 10.1.2.51:80 -T :6082 -f > /etc/varnish/default.vcl -u varnish -g varnish -s > file,/var/lib/varnish/varnish_storage.bin,1G > root 28799 0.0 0.0 61192 732 pts/3 S+ 17:56 0:00 grep > varnishd > > The problem that I've encountered twice now is the following: > > 1) Varnish spreads to use over 8GB of swap, despite appearing to be > configured to only use 1GB of storage > 2) Our automated monitoring indicates that we're running out of swap > space. > 3) Restart varnish > 4) From this point, varnishlog and varnishncsa return no output. > > Can anyone suggest why varnish is using more memory than it's allocated, > and why varnishlog would stop returning any output? Varnishlog was > writing to disk, so I can probably extract the end of that, if it's of > use. > Hi Rob, There have been a few threads about this now on this mailing list. Probably it relates to the use of purge_url in your VCL. Are you using this function at all? regards, Darryl Dixon Winterhouse Consulting Ltd http://www.winterhouseconsulting.com From sridhar at primesoftsolutionsinc.com Tue Jul 28 04:29:42 2009 From: sridhar at primesoftsolutionsinc.com (Sridhar) Date: Tue, 28 Jul 2009 09:59:42 +0530 Subject: varnish caching problem Message-ID: Hi Team, My Varnish is not showing cache hits while I run from firefox. I am able to see cache hits when I use internet explorer. My setup is VARNISH > POUND > Plone Is there any firefox is sending any extra content to varnish?? Due to which the varnish is not able to server from cache?? Please help me in solving this issue. Regards, Sridhar Raju PrimeSoft Solutions Inc Phone: 040-27762986/27762987 Skype ID: sridharsagi www.primesoftsolutionsinc.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.gif Type: image/gif Size: 2699 bytes Desc: not available URL: From phk at phk.freebsd.dk Tue Jul 28 06:42:04 2009 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Tue, 28 Jul 2009 06:42:04 +0000 Subject: varnish caching problem In-Reply-To: Your message of "Tue, 28 Jul 2009 09:59:42 +0530." Message-ID: <5094.1248763324@critter.freebsd.dk> In message , "Sridhar" writes: >My Varnish is not showing cache hits while I run from firefox. Check if you send cookies, by default cookies diables caching. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From jauderho at gmail.com Tue Jul 28 09:01:45 2009 From: jauderho at gmail.com (Jauder Ho) Date: Tue, 28 Jul 2009 02:01:45 -0700 Subject: varnish + if-none-match Message-ID: I just saw that If-None-Match support was added to trunk today. Does anyone have an example of how to enable/configure for this as I would be curious to test this out. Thanks. --Jauder -------------- next part -------------- An HTML attachment was scrubbed... URL: From hip.hydra at gmail.com Mon Jul 13 08:40:49 2009 From: hip.hydra at gmail.com (Hip Hydra) Date: Mon, 13 Jul 2009 08:40:49 -0000 Subject: Configuring varnish for many different domains with similar content Message-ID: <4A586F74.5040500@gmail.com> Hi, I'm running a network of web proxies, where the domain of each proxy is different, but any websites a user goes to will have the same content regardless of the proxy domain. For example: http://proxy1.com/visit.php?site=22abc862ce4 http://proxy2.com/visit.php?site=22abc862ce4 http://proxy3.com/visit.php?site=22abc862ce4 ... These URLs all go to the same site. However, proxy1.com, proxy2.com, etc. have different base files. Only requests that go through visit.php are the same if the site variable is the same. 
Is there any way I can optimize Varnish to cache the data files more effectively? Many thanks! John From sridhar at primesoftsolutionsinc.com Tue Jul 28 15:14:45 2009 From: sridhar at primesoftsolutionsinc.com (Sridhar) Date: Tue, 28 Jul 2009 20:44:45 +0530 Subject: varnish caching problem In-Reply-To: <5094.1248763324@critter.freebsd.dk> References: Your message of "Tue, 28 Jul 2009 09:59:42 +0530." <5094.1248763324@critter.freebsd.dk> Message-ID: Hi, If I disable cookies in Firefox, I am not able to log in to Plone. If I configure Varnish to do a cache lookup even when cookies are present, as defined below: ##################### sub vcl_recv { if (req.request == "GET" && req.http.cookie) { lookup; } lookup; } ######################### I am not able to log in to the Plone site using either Firefox or Internet Explorer. Please help me resolve the issue. Sridhar Raju PrimeSoft Solutions Inc Phone: 040-27762986/27762987 Skype ID: sridharsagi www.primesoftsolutionsinc.com -----Original Message----- From: phk at critter.freebsd.dk [mailto:phk at critter.freebsd.dk] On Behalf Of Poul-Henning Kamp Sent: Tuesday, July 28, 2009 12:12 PM To: Sridhar Cc: varnish-misc at projects.linpro.no Subject: Re: varnish caching problem In message , "Sridhar" writes: >My Varnish is not showing cache hits while I run from firefox. Check if you send cookies, by default cookies diables caching. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
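The VCL in the message above sends every request, including logged-in ones, to a cache lookup, which would explain why the Plone login stops working. A possible middle ground, given here only as an untested sketch in the same 2.0-style VCL, is to pass requests that carry the login cookie and strip cookies from everything else. The cookie name __ac is an assumption about a default Plone setup and should be checked against the actual site.

```vcl
sub vcl_recv {
    # Assumption: "__ac" is the Plone authentication cookie on this site.
    if (req.http.Cookie ~ "__ac=") {
        # Logged-in users always go straight to the backend.
        pass;
    }
    if (req.request == "GET" || req.request == "HEAD") {
        # Anonymous traffic: drop the cookies so the object can be cached.
        remove req.http.Cookie;
        lookup;
    }
    # Anything else (e.g. the login POST) falls through to the default logic.
}
```

With something like this, Firefox and Internet Explorer should behave the same for anonymous pages, since whatever cookies either browser sends are removed before the lookup.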
From rtshilston at gmail.com Tue Jul 28 19:22:23 2009 From: rtshilston at gmail.com (Rob S) Date: Tue, 28 Jul 2009 20:22:23 +0100 Subject: Memory spreading, then stop responding In-Reply-To: <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> Message-ID: <4A6F4FEF.1030506@gmail.com> Darryl Dixon - Winterhouse Consulting wrote: >> >> Can anyone suggest why varnish is using more memory than it's allocated, >> and why varnishlog would stop returning any output? Varnishlog was >> writing to disk, so I can probably extract the end of that, if it's of >> use. >> >> > > Hi Rob, > > There have been a few threads about this now on this mailing list. > Probably it relates to the use of purge_url in your VCL. Are you using > this function at all? > > regards, > Darryl Darryl, Thanks for your reply. Yes we are using purge_url, but I was under the impression that since http://varnish.projects.linpro.no/changeset/3329, there wasn't a problem. I've not succeeded in finding the threads you mentioned in your email. Can you either point me at them, or let me know their conclusion? Thanks, Rob From kb+varnish at slide.com Tue Jul 28 19:37:11 2009 From: kb+varnish at slide.com (Ken Brownfield) Date: Tue, 28 Jul 2009 12:37:11 -0700 Subject: Configuring varnish for many different domains with similar content In-Reply-To: <4A586F74.5040500@gmail.com> References: <4A586F74.5040500@gmail.com> Message-ID: <06D7F6EC-E2C3-4CAF-94CC-2B3EF2C3ADE2@slide.com> See the FAQ: http://varnish.projects.linpro.no/wiki/FAQ#IhaveasitewithmanyhostnameshowdoIkeepthemfrommultiplyingthecache If your backends need to see the original hostname, you can unrewrite it in vcl_miss(). 
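A rough sketch of what the FAQ approach might look like for the visit.php case, in 2.0-style VCL and untested. The canonical hostname "proxy.example" and the X-Orig-Host header are made-up names for illustration; only requests under /visit.php are normalised, because the base files differ per domain.

```vcl
sub vcl_recv {
    if (req.url ~ "^/visit\.php") {
        # Remember the real hostname (hypothetical helper header),
        # then hash on a single canonical name (hypothetical value).
        set req.http.X-Orig-Host = req.http.host;
        set req.http.host = "proxy.example";
    }
}

sub vcl_miss {
    if (req.http.X-Orig-Host) {
        # Hand the original hostname back to the backend on a cache miss.
        set bereq.http.host = req.http.X-Orig-Host;
    }
}
```

The same restore step would presumably be needed in vcl_pass if any of these requests end up being passed rather than cached.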
-- Ken On Jul 11, 2009, at 3:54 AM, Hip Hydra wrote: > Hi, I'm running a network of web proxies, where the domain of each > proxy > is different, but any websites a user goes to will have the same > content > regardless of the proxy domain. > > For example: > http://proxy1.com/visit.php?site=22abc862ce4 > http://proxy2.com/visit.php?site=22abc862ce4 > http://proxy3.com/visit.php?site=22abc862ce4 > ... > > These URLs all go to the same site. However, proxy1.com, proxy2.com, > etc. have different base files. Only requests that go through > visit.php > are the same if the site variable is the same. > > Is there any way I can optimize Varnish to cache the data files more > effectively? > > Many thanks! > John > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc From kova70 at gmail.com Tue Jul 28 20:50:49 2009 From: kova70 at gmail.com (Raymond Hall) Date: Tue, 28 Jul 2009 15:50:49 -0500 Subject: varnish akamai interaction Message-ID: <53dee190907281350x73921223r7071b7435bc9d57d@mail.gmail.com> Hi there, I'm wondering if anyone here has akamaized an already varinsh accelerated website, and if so, what experiences did you get? I'm especially interested in the Cache-Control, Expires, Last-Modified headers interaction. regards, Ray -- Knowingly entering a Ponzi scheme can be rational, in the economic sense, even at the last round of the scheme if a government will likely bail out those participating in the Ponzi scheme. 
The IPAB Posse From darryl.dixon at winterhouseconsulting.com Tue Jul 28 21:24:55 2009 From: darryl.dixon at winterhouseconsulting.com (Darryl Dixon - Winterhouse Consulting) Date: Wed, 29 Jul 2009 09:24:55 +1200 (NZST) Subject: Memory spreading, then stop responding In-Reply-To: <4A6F4FEF.1030506@gmail.com> References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> <4A6F4FEF.1030506@gmail.com> Message-ID: <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> > Darryl Dixon - Winterhouse Consulting wrote: >>> >>> Can anyone suggest why varnish is using more memory than it's >>> allocated, >>> and why varnishlog would stop returning any output? Varnishlog was >>> writing to disk, so I can probably extract the end of that, if it's of >>> use. >>> >>> >> >> Hi Rob, >> >> There have been a few threads about this now on this mailing list. >> Probably it relates to the use of purge_url in your VCL. Are you using >> this function at all? >> >> regards, >> Darryl > Darryl, > > Thanks for your reply. Yes we are using purge_url, but I was under the > impression that since http://varnish.projects.linpro.no/changeset/3329, > there wasn't a problem. I've not succeeded in finding the threads you > mentioned in your email. Can you either point me at them, or let me > know their conclusion? 
> Hi Rob, See the thread concluding here (the solution to swap purge_url for obj.ttl=0 is the correct one): http://projects.linpro.no/pipermail/varnish-misc/2009-April/002743.html And also the thread concluding here: http://projects.linpro.no/pipermail/varnish-misc/2009-June/002840.html regards, Darryl Dixon Winterhouse Consulting Ltd http://www.winterhouseconsulting.com From rtshilston at gmail.com Tue Jul 28 21:35:48 2009 From: rtshilston at gmail.com (Rob S) Date: Tue, 28 Jul 2009 22:35:48 +0100 Subject: Memory spreading, then stop responding In-Reply-To: <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> <4A6F4FEF.1030506@gmail.com> <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> Message-ID: <4A6F6F34.1060500@gmail.com> Darryl Dixon - Winterhouse Consulting wrote: >> Darryl Dixon - Winterhouse Consulting wrote: >> >>>> >>>> Can anyone suggest why varnish is using more memory than it's >>>> allocated, >>>> and why varnishlog would stop returning any output? Varnishlog was >>>> writing to disk, so I can probably extract the end of that, if it's of >>>> use. >>>> >>>> >>>> >>> Hi Rob, >>> >>> There have been a few threads about this now on this mailing list. >>> Probably it relates to the use of purge_url in your VCL. Are you using >>> this function at all? >>> >>> regards, >>> Darryl >>> >> Darryl, >> >> Thanks for your reply. Yes we are using purge_url, but I was under the >> impression that since http://varnish.projects.linpro.no/changeset/3329, >> there wasn't a problem. I've not succeeded in finding the threads you >> mentioned in your email. Can you either point me at them, or let me >> know their conclusion? 
>> >> > > Hi Rob, > > See the thread concluding here (the solution to swap purge_url for > obj.ttl=0 is the correct one): > http://projects.linpro.no/pipermail/varnish-misc/2009-April/002743.html > > And also the thread concluding here: > http://projects.linpro.no/pipermail/varnish-misc/2009-June/002840.html > > regards, > Darryl Dixon > Winterhouse Consulting Ltd > http://www.winterhouseconsulting.com > > > Thanks Darryl. However, I don't think this solution will work in our usage. We're running a blog. Administrators get un-cached access, straight through varnish. Then, when they publish, we issue a purge across the entire site. We need to do this as there's various bits of navigation that'd need to be updated. I can't see that we can do this if we set obj.ttl. Has anyone any recommendations as to how best to deal with purges like this? Rob From michael at dynamine.net Tue Jul 28 22:07:47 2009 From: michael at dynamine.net (Michael S. Fischer) Date: Tue, 28 Jul 2009 15:07:47 -0700 Subject: Memory spreading, then stop responding In-Reply-To: <4A6F6F34.1060500@gmail.com> References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> <4A6F4FEF.1030506@gmail.com> <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> <4A6F6F34.1060500@gmail.com> Message-ID: On Jul 28, 2009, at 2:35 PM, Rob S wrote: > Thanks Darryl. However, I don't think this solution will work in our > usage. We're running a blog. Administrators get un-cached access, > straight through varnish. Then, when they publish, we issue a purge > across the entire site. We need to do this as there's various bits of > navigation that'd need to be updated. I can't see that we can do this > if we set obj.ttl. > > Has anyone any recommendations as to how best to deal with purges > like this? If you're issuing a PURGE across the entire site, why not simply restart Varnish with an empty cache? 
--Michael From rtshilston at gmail.com Tue Jul 28 22:09:52 2009 From: rtshilston at gmail.com (Rob S) Date: Tue, 28 Jul 2009 23:09:52 +0100 Subject: Memory spreading, then stop responding In-Reply-To: References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> <4A6F4FEF.1030506@gmail.com> <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> <4A6F6F34.1060500@gmail.com> Message-ID: <4A6F7730.4060707@gmail.com> Michael S. Fischer wrote: > On Jul 28, 2009, at 2:35 PM, Rob S wrote: >> Thanks Darryl. However, I don't think this solution will work in our >> usage. We're running a blog. Administrators get un-cached access, >> straight through varnish. Then, when they publish, we issue a purge >> across the entire site. We need to do this as there's various bits of >> navigation that'd need to be updated. I can't see that we can do this >> if we set obj.ttl. >> >> Has anyone any recommendations as to how best to deal with purges >> like this? > > If you're issuing a PURGE across the entire site, why not simply > restart Varnish with an empty cache? > > --Michael > Because Varnish is also working for other hosts which don't need purging at the same time... Rob From michael at dynamine.net Wed Jul 29 01:04:52 2009 From: michael at dynamine.net (Michael S. Fischer) Date: Tue, 28 Jul 2009 18:04:52 -0700 Subject: Memory spreading, then stop responding In-Reply-To: <4A6F7730.4060707@gmail.com> References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> <4A6F4FEF.1030506@gmail.com> <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> <4A6F6F34.1060500@gmail.com> <4A6F7730.4060707@gmail.com> Message-ID: <7CE0A11D-5636-4A05-82B8-BA5CA8A17825@dynamine.net> On Jul 28, 2009, at 3:09 PM, Rob S wrote: > Michael S. Fischer wrote: >> On Jul 28, 2009, at 2:35 PM, Rob S wrote: >>> Thanks Darryl. However, I don't think this solution will work in >>> our >>> usage. 
We're running a blog. Administrators get un-cached access, >>> straight through varnish. Then, when they publish, we issue a purge >>> across the entire site. We need to do this as there's various >>> bits of >>> navigation that'd need to be updated. I can't see that we can do >>> this >>> if we set obj.ttl. >>> >>> Has anyone any recommendations as to how best to deal with purges >>> like this? >> >> If you're issuing a PURGE across the entire site, why not simply >> restart Varnish with an empty cache? >> >> --Michael >> > Because Varnish is also working for other hosts which don't need > purging at the same time... My company gets around this madness by versioning its URLs. It works pretty well. --Michael From kb+varnish at slide.com Wed Jul 29 01:24:18 2009 From: kb+varnish at slide.com (Ken Brownfield) Date: Tue, 28 Jul 2009 18:24:18 -0700 Subject: varnish akamai interaction In-Reply-To: <53dee190907281350x73921223r7071b7435bc9d57d@mail.gmail.com> References: <53dee190907281350x73921223r7071b7435bc9d57d@mail.gmail.com> Message-ID: <4F6AA6CE-5BDC-42EC-A4E6-FA372894BDB5@slide.com> Quite a coincidence... We've moved some traffic /off/ of Akamai, but only today did we start stacking. I don't expect there to be any problems *EXCEPT* in the 304 response case -- currently Varnish strips the Expires and Cache-Control (among other) headers from 304 Not Modified responses, which could be problematic for browsers and Akamai alike. See ticket #529. Rogério Schneider wrote a patch which Tollef Fog Heen checked into trunk. A version of the patch ported to 2.0.4 is attached to that ticket, and has been stable in our production 2.0.4 environment for the last day or two. Cache and 304 behavior seem correct. However, it may depend on your Akamai config. We emit Expires, which we have Akamai use to control caching. Our default is to not cache in the absence of Expires and not touch any headers. Depending on your Akamai config, YMMV.
Hope it helps, -- Ken On Jul 28, 2009, at 1:50 PM, Raymond Hall wrote: > Hi there, > > I'm wondering if anyone here has akamaized an already Varnish-accelerated > website, and if so, what was your experience? > I'm especially interested in how the Cache-Control, Expires and Last-Modified > headers interact. > > regards, > Ray > > -- > Knowingly entering a Ponzi scheme can be rational, in the economic > sense, even at the last round of the scheme if a government will > likely bail out those participating in the Ponzi scheme. The IPAB > Posse > _______________________________________________ > varnish-misc mailing list > varnish-misc at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-misc From sridhar at primesoftsolutionsinc.com Wed Jul 29 04:21:54 2009 From: sridhar at primesoftsolutionsinc.com (Sridhar) Date: Wed, 29 Jul 2009 09:51:54 +0530 Subject: varnish caching problem References: Your message of "Tue, 28 Jul 2009 09:59:42 +0530." <5094.1248763324@critter.freebsd.dk> Message-ID: Hi, If I disable cookies in Firefox, I am not able to log in to Plone. If I configure Varnish to ignore the cache as defined below: ##################### sub vcl_recv { if (req.request == "GET" && req.http.cookie) { lookup; } lookup; } ######################### I am not able to log in to the Plone site using either Firefox or Internet Explorer. Please help me resolve the issue. Regards, Sridhar Raju PrimeSoft Solutions Inc Phone: 040-27762986/27762987 Skype ID: sridharsagi www.primesoftsolutionsinc.com -----Original Message----- From: Sridhar [mailto:sridhar at primesoftsolutionsinc.com] Sent: Tuesday, July 28, 2009 8:45 PM To: 'Poul-Henning Kamp' Cc: 'varnish-misc at projects.linpro.no' Subject: RE: varnish caching problem Hi, If I disable cookies in Firefox, I am not able to log in to Plone. If I configure Varnish to ignore the cache as defined below:
##################### sub vcl_recv { if (req.request == "GET" && req.http.cookie) { lookup; } lookup; } ######################### I am not able to log in to the Plone site using either Firefox or Internet Explorer. Please help me resolve the issue. Sridhar Raju PrimeSoft Solutions Inc Phone: 040-27762986/27762987 Skype ID: sridharsagi www.primesoftsolutionsinc.com -----Original Message----- From: phk at critter.freebsd.dk [mailto:phk at critter.freebsd.dk] On Behalf Of Poul-Henning Kamp Sent: Tuesday, July 28, 2009 12:12 PM To: Sridhar Cc: varnish-misc at projects.linpro.no Subject: Re: varnish caching problem In message , "Sridhar" writes: >My Varnish is not showing cache hits while I run from firefox. Check if you send cookies; by default, cookies disable caching. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From rtshilston at gmail.com Wed Jul 29 09:04:55 2009 From: rtshilston at gmail.com (Rob S) Date: Wed, 29 Jul 2009 10:04:55 +0100 Subject: Memory spreading, then stop responding In-Reply-To: <7CE0A11D-5636-4A05-82B8-BA5CA8A17825@dynamine.net> References: <4A6DDDF6.4030603@gmail.com> <61724.58.28.124.90.1248732254.squirrel@services.directender.co.nz> <4A6F4FEF.1030506@gmail.com> <62132.58.28.124.90.1248816295.squirrel@services.directender.co.nz> <4A6F6F34.1060500@gmail.com> <4A6F7730.4060707@gmail.com> <7CE0A11D-5636-4A05-82B8-BA5CA8A17825@dynamine.net> Message-ID: <4A7010B7.7090403@gmail.com> >>>> Thanks Darryl. However, I don't think this solution will work in our >>>> usage. We're running a blog. Administrators get un-cached access, >>>> straight through varnish. Then, when they publish, we issue a purge >>>> across the entire site. We need to do this as there's various bits of >>>> navigation that'd need to be updated. I can't see that we can do this >>>> if we set obj.ttl.
>>>> >>>> Has anyone any recommendations as to how best to deal with purges >>>> like this? >>> >>> If you're issuing a PURGE across the entire site, why not simply >>> restart Varnish with an empty cache? >>> >>> --Michael >>> >> Because Varnish is also working for other hosts which don't need >> purging at the same time... > > My company gets around this madness by versioning its URLs. It works > pretty well. > > --Michael Thanks. Are there any varnish developers who can comment on this memory-usage-growth-when-purging? I can't see any open tickets for this, and I'm sure there are several mailing list members who might like to contribute a bounty for development of a fix to this. Rob From tfheen at redpill-linpro.com Wed Jul 29 11:21:04 2009 From: tfheen at redpill-linpro.com (Tollef Fog Heen) Date: Wed, 29 Jul 2009 13:21:04 +0200 Subject: Inline C and memory allocation In-Reply-To: (Laurence Rowe's message of "Tue, 7 Jul 2009 11:56:13 +0200") References: <3B689787-A7EB-4485-B013-8D9B5391BC6E@slide.com> Message-ID: <87iqhb7n1b.fsf@qurzaw.linpro.no> ]] Laurence Rowe | I'm not certain if I need to manage the memory of the string that I | set the header too. It looks like VRT_SetHdr copies the string into | it's own memory managed space though. You don't, it's part of the object's workspace. -- Tollef Fog Heen Redpill Linpro -- Changing the game! t: +47 21 54 41 73 From samcrawford at gmail.com Thu Jul 30 10:38:54 2009 From: samcrawford at gmail.com (Sam Crawford) Date: Thu, 30 Jul 2009 11:38:54 +0100 Subject: Intermittent 503 errors Message-ID: Morning all, Once every few days we get a report of a user with a nice big "503 Service Unavailable" on a webpage that refreshes itself every 30 seconds. The webserver that serves the content appears to be up continuously, although it could well be garbage collecting (it's GlassFish - I need to enable GC logs to capture this activity). 
Also, our backend connect timeout in our Varnish VCL is currently set to 1s, which could be problematic. In order to help confirm my suspicions would someone mind casting their eyes over the log below? I'm interested to know what caused Varnish to report the 503 - was it a plain connection refused? Did the connection to the backend timeout? I can't tell myself from this log. Thanks, Sam 14 SessionOpen c 10.99.1.15 4679 10.98.13.28:7070 14 ReqStart c 10.99.1.15 4679 1053467939 14 RxRequest c GET 14 RxURL c /some/uri/goes/here 14 RxProtocol c HTTP/1.1 14 RxHeader c Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, applicati 14 RxHeader c Accept-Language: en-gb 14 RxHeader c UA-CPU: x86 14 RxHeader c Accept-Encoding: gzip, deflate 14 RxHeader c User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0 .04506.648; .NET CLR 3.5.21022; InfoPath.2; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) 14 RxHeader c Host: intranet.company.com 14 RxHeader c Connection: Keep-Alive 14 RxHeader c Cookie: amlbcookie=01 14 VCL_call c recv 14 VCL_return c pass 14 VCL_call c pass 14 VCL_return c pass 14 Backend c 27 localhost localhost 27 TxRequest b GET 27 TxURL b /some/uri/goes/here 27 TxProtocol b HTTP/1.1 27 TxHeader b Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, applicati 27 TxHeader b Accept-Language: en-gb 27 TxHeader b UA-CPU: x86 27 TxHeader b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0 .04506.648; .NET CLR 3.5.21022; 
InfoPath.2; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) 27 TxHeader b Host: intranet.company.com 27 TxHeader b Cookie: amlbcookie=01 27 TxHeader b Accept-Encoding: gzip 27 TxHeader b X-Varnish: 1053467939 27 TxHeader b X-Forwarded-For: 10.99.1.15 27 BackendClose b localhost 14 VCL_call c error 14 VCL_return c deliver 14 Length c 466 14 VCL_call c deliver 14 VCL_return c deliver 14 TxProtocol c HTTP/1.1 14 TxStatus c 503 14 TxResponse c Service Unavailable 14 TxHeader c Server: Varnish 14 TxHeader c Retry-After: 0 14 TxHeader c Content-Type: text/html; charset=utf-8 14 TxHeader c Content-Length: 466 14 TxHeader c Date: Thu, 30 Jul 2009 09:57:46 GMT 14 TxHeader c X-Varnish: 1053467939 14 TxHeader c Age: 0 14 TxHeader c Via: 1.1 varnish 14 TxHeader c Connection: close 14 ReqEnd c 1053467939 1248947866.962788105 1248947866.963104010 0.000128031 0.000283957 0.000031948 14 SessionClose c error From samcrawford at gmail.com Thu Jul 30 10:53:30 2009 From: samcrawford at gmail.com (Sam Crawford) Date: Thu, 30 Jul 2009 11:53:30 +0100 Subject: Intermittent 503 errors In-Reply-To: References: Message-ID: I made a mistake in my original mail - there's a 5 second connect timeout, not 1 second. Also, I wonder if there may be some pooling/keep-alive mismatch between Varnish and the webserver, whereby Varnish holds a stale connection and tries to use it. Suggestions welcome! Thanks, Sam 2009/7/30 Sam Crawford : > Morning all, > > Once every few days we get a report of a user with a nice big "503 > Service Unavailable" on a webpage that refreshes itself every 30 > seconds. > > The webserver that serves the content appears to be up continuously, > although it could well be garbage collecting (it's GlassFish - I need > to enable GC logs to capture this activity). Also, our backend connect > timeout in our Varnish VCL is currently set to 1s, which could be > problematic. > > In order to help confirm my suspicions would someone mind casting > their eyes over the log below?
I'm interested to know what caused > Varnish to report the 503 - was it a plain connection refused? Did the > connection to the backend timeout? I can't tell myself from this log. > > Thanks, > > Sam > > > 14 SessionOpen c 10.99.1.15 4679 10.98.13.28:7070 > 14 ReqStart c 10.99.1.15 4679 1053467939 > 14 RxRequest c GET > 14 RxURL c /some/uri/goes/here > 14 RxProtocol c HTTP/1.1 > 14 RxHeader c Accept: image/gif, image/x-xbitmap, image/jpeg, > image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, > application/msword, > application/x-shockwave-flash, application/xaml+xml, > application/vnd.ms-xpsdocument, application/x-ms-xbap, applicati > 14 RxHeader c Accept-Language: en-gb > 14 RxHeader c UA-CPU: x86 > 14 RxHeader c Accept-Encoding: gzip, deflate > 14 RxHeader c User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; > Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR > 3.0.04506.30; .NET CLR 3.0 > .04506.648; .NET CLR 3.5.21022; InfoPath.2; .NET CLR 3.0.4506.2152; > .NET CLR 3.5.30729) > 14 RxHeader c Host: intranet.company.com > 14 RxHeader c Connection: Keep-Alive > 14 RxHeader c Cookie: amlbcookie=01 > 14 VCL_call c recv > 14 VCL_return c pass > 14 VCL_call c pass > 14 VCL_return c pass > 14 Backend c 27 localhost localhost > 27 TxRequest b GET > 27 TxURL b /some/uri/goes/here > 27 TxProtocol b HTTP/1.1 > 27 TxHeader b Accept: image/gif, image/x-xbitmap, image/jpeg, > image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, > application/msword, > application/x-shockwave-flash, application/xaml+xml, > application/vnd.ms-xpsdocument, application/x-ms-xbap, applicati > 27 TxHeader b Accept-Language: en-gb > 27 TxHeader b UA-CPU: x86 > 27 TxHeader
b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; > Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR > 3.0.04506.30; .NET CLR 3.0 > .04506.648; .NET CLR 3.5.21022; InfoPath.2; .NET CLR 3.0.4506.2152; > .NET CLR 3.5.30729) > 27 TxHeader b Host: intranet.company.com > 27 TxHeader b Cookie: amlbcookie=01 > 27 TxHeader b Accept-Encoding: gzip > 27 TxHeader b X-Varnish: 1053467939 > 27 TxHeader b X-Forwarded-For: 10.99.1.15 > 27 BackendClose b localhost > 14 VCL_call c error > 14 VCL_return c deliver > 14 Length c 466 > 14 VCL_call c deliver > 14 VCL_return c deliver > 14 TxProtocol c HTTP/1.1 > 14 TxStatus c 503 > 14 TxResponse c Service Unavailable > 14 TxHeader c Server: Varnish > 14 TxHeader c Retry-After: 0 > 14 TxHeader c Content-Type: text/html; charset=utf-8 > 14 TxHeader c Content-Length: 466 > 14 TxHeader c Date: Thu, 30 Jul 2009 09:57:46 GMT > 14 TxHeader c X-Varnish: 1053467939 > 14 TxHeader c Age: 0 > 14 TxHeader c Via: 1.1 varnish > 14 TxHeader c Connection: close > 14 ReqEnd c 1053467939 1248947866.962788105 > 1248947866.963104010 0.000128031 0.000283957 0.000031948 > 14 SessionClose c error >
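
As a rough aid for reading a log like the one quoted above, here is a minimal Python sketch that groups records by descriptor and, for each client request that ended in a 503, lists which tags were logged on the backend side. The helper names (`parse`, `backend_tags_for_503s`) and the regular expression are assumptions made up for illustration; only the record layout (descriptor, tag, `c` or `b`, payload) is taken from the excerpt. In that excerpt the backend side shows only Tx records followed by BackendClose, with no Rx records at all, which is the kind of pattern this is meant to surface. It does not by itself say why the backend connection was closed.

```python
import re
from collections import defaultdict

# One record as it appears in the excerpt: "<fd> <Tag> <c|b> <payload>".
# The payload may be empty. This pattern is an assumption based on that excerpt.
RECORD = re.compile(r"^\s*(\d+)\s+(\w+)\s+([cb])\s*(.*)$")


def parse(lines):
    """Yield (fd, tag, side, payload) for each line that looks like a record."""
    for line in lines:
        match = RECORD.match(line)
        if match:
            fd, tag, side, payload = match.groups()
            yield int(fd), tag, side, payload.strip()


def backend_tags_for_503s(lines):
    """Map each client fd that sent a 503 to the backend-side tags seen so far."""
    backend_of = {}                  # client fd -> backend fd it was handed
    backend_tags = defaultdict(list)  # backend fd -> tags in order of arrival
    result = {}
    for fd, tag, side, payload in parse(lines):
        if side == "c" and tag == "Backend":
            # Payload looks like "27 localhost localhost"; first field is the backend fd.
            backend_of[fd] = int(payload.split()[0])
        elif side == "b":
            backend_tags[fd].append(tag)
        elif side == "c" and tag == "TxStatus" and payload == "503":
            result[fd] = list(backend_tags.get(backend_of.get(fd), []))
    return result


sample = [
    "   14 VCL_return   c pass",
    "   14 Backend      c 27 localhost localhost",
    "   27 TxRequest    b GET",
    "   27 TxHeader     b Host: intranet.company.com",
    "   27 BackendClose b localhost",
    "   14 TxStatus     c 503",
]
print(backend_tags_for_503s(sample))
```

Descriptors are reused between requests, so on a long capture the per-descriptor state would need to be reset at each ReqStart or ReqEnd; the sketch skips that to stay short.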