From frank.gruellich at mapsolute.com Thu Apr 1 12:20:23 2010 From: frank.gruellich at mapsolute.com (Frank Gruellich) Date: Thu, 01 Apr 2010 13:20:23 +0100 Subject: Ignoring health checks in statistics In-Reply-To: <064FF286FD17EC418BFB74629578BED117DA7796@tmg-mail4.torstar.net> References: <4BA89FE0.8090406@mapsolute.com> <064FF286FD17EC418BFB74629578BED117DA7796@tmg-mail4.torstar.net> Message-ID: <4BB48F87.8060106@mapsolute.com> Hi, okay, I'll have to see if I can make our load balancer consider an HTTP 404 healthy... Or could I simply say error 200 "I am OK"? Can I make this dependent on the health of my backend? Would something like this

sub vcl_recv {
    set req.backend = my_backend;
    if (req.url == "/static.txt") {
        if (req.backend.healthy) {
            error 200 "I am OK";
        } else {
            error 404 "I am not OK";
        }
    }
}

work and still not have that counted in statistics? Kind regards, Frank. On 03/23/10 14:11, Caunter, Stefan wrote: > Just put this in sub vcl_recv > > else { > error 404 "I am OK"; > } > > > Stefan Caunter :: Senior Systems Administrator :: TOPS > e: scaunter at topscms.com :: m: (416) 561-4871 > www.thestar.com www.topscms.com > > -----Original Message----- > From: varnish-dev-bounces at varnish-cache.org > [mailto:varnish-dev-bounces at varnish-cache.org] On Behalf Of Frank > Gruellich > Sent: March-23-10 7:03 AM > To: varnish-dev at varnish-cache.org > Subject: Ignoring health checks in statistics > > Hi, > > we have two Varnish instances and a load balancer in front. The load > balancer periodically polls the varnishd's for a static file (lying also > at all the backend servers) to check if varnishd is still working. It > does that once per second. This completely messes up statistics like > hit ratio because that static file is served from cache all the time, so > currently our hit ratio is close to 99.9%. I would like to have counted > this specific file neither as miss nor hit, but just serve it and ignore > it otherwise.
It doesn't even need to be logged. > > haproxy has an option monitor-net where you can specify that it does not > need to log requests from that subnet. Does varnish offer a similar > option? Or is there some cool VCL snippet I could add? > > Thanks in advance. > > Kind regards, > -- > Navteq (DE) GmbH > Frank Gruellich > Map24 Systems and Networks > > Duesseldorfer Strasse 40a > 65760 Eschborn > Germany > > Phone: +49 6196 77756-414 > Fax: +49 6196 77756-100 > > HRB 46215, Local Court Frankfurt am Main Managing Directors: Thomas > Golob, Hans Pieter Gieszen, Martin Robert Stockman > USt-ID-No.: DE 197947163 > > > > Mit freundlichen Gruessen, Kind regards, -- Navteq (DE) GmbH Frank Gruellich Map24 Systems and Networks Duesseldorfer Strasse 40a 65760 Eschborn Germany Phone: +49 6196 77756-414 Fax: +49 6196 77756-100 HRB 46215, Local Court Frankfurt am Main Managing Directors: Thomas Golob, Hans Pieter Gieszen, Martin Robert Stockman USt-ID-No.: DE 197947163 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 262 bytes Desc: OpenPGP digital signature URL: From scaunter at topscms.com Thu Apr 1 13:36:07 2010 From: scaunter at topscms.com (Caunter, Stefan) Date: Thu, 1 Apr 2010 09:36:07 -0400 Subject: Ignoring health checks in statistics In-Reply-To: <4BB48F87.8060106@mapsolute.com> References: <4BA89FE0.8090406@mapsolute.com><064FF286FD17EC418BFB74629578BED117DA7796@tmg-mail4.torstar.net> <4BB48F87.8060106@mapsolute.com> Message-ID: <064FF286FD17EC418BFB74629578BED11814D179@tmg-mail4.torstar.net> The example satisfies a check for a text string that would keep varnish "in the pool". "Health" is different, and configurable through parameters agreed and defined in the load balancer based on expected backend behaviour. If varnish cannot send a 404, that would be a problem, I think.
Stefan Caunter :: Senior Systems Administrator :: TOPS e: scaunter at topscms.com :: m: (416) 561-4871 www.thestar.com www.topscms.com -----Original Message----- From: varnish-dev-bounces at varnish-cache.org [mailto:varnish-dev-bounces at varnish-cache.org] On Behalf Of Frank Gruellich Sent: April-01-10 8:20 AM To: varnish-dev at varnish-cache.org Subject: Re: Ignoring health checks in statistics Hi, okay, I'll have to see if I can make our load balancer consider an HTTP 404 healthy... Or could I simply say error 200 "I am OK"? Can I make this dependent on the health of my backend? Would something like this sub vcl_recv { set req.backend = my_backend; if (req.url == "/static.txt") { if (req.backend.healthy) { error 200 "I am OK"; } else { error 404 "I am not OK"; } } } work and still not have that counted in statistics? Kind regards, Frank. On 03/23/10 14:11, Caunter, Stefan wrote: > Just put this in sub vcl_recv > > else { > error 404 "I am OK"; > } > > > Stefan Caunter :: Senior Systems Administrator :: TOPS > e: scaunter at topscms.com :: m: (416) 561-4871 www.thestar.com > www.topscms.com > > -----Original Message----- > From: varnish-dev-bounces at varnish-cache.org > [mailto:varnish-dev-bounces at varnish-cache.org] On Behalf Of Frank > Gruellich > Sent: March-23-10 7:03 AM > To: varnish-dev at varnish-cache.org > Subject: Ignoring health checks in statistics > > Hi, > > we have two Varnish instances and a load balancer in front. The load > balancer periodically polls the varnishd's for a static file (lying > also at all the backend servers) to check if varnishd is still > working. It does that once per second. This completely messes up > statistics like hit ratio because that static file is served from > cache all the time, so currently our hit ratio is close to 99.9%. I > would like to have counted this specific file neither as miss nor hit, > but just serve it and ignore it otherwise. It doesn't even need to be logged.
> > haproxy has an option monitor-net where you can specify that it does > not need to log requests from that subnet. Does varnish offer a > similar option? Or is there some cool VCL snippet I could add? > > Thanks in advance. > > Kind regards, > -- > Navteq (DE) GmbH > Frank Gruellich > Map24 Systems and Networks > > Duesseldorfer Strasse 40a > 65760 Eschborn > Germany > > Phone: +49 6196 77756-414 > Fax: +49 6196 77756-100 > > HRB 46215, Local Court Frankfurt am Main Managing Directors: Thomas > Golob, Hans Pieter Gieszen, Martin Robert Stockman > USt-ID-No.: DE 197947163 > > > > Mit freundlichen Gruessen, Kind regards, -- Navteq (DE) GmbH Frank Gruellich Map24 Systems and Networks Duesseldorfer Strasse 40a 65760 Eschborn Germany Phone: +49 6196 77756-414 Fax: +49 6196 77756-100 HRB 46215, Local Court Frankfurt am Main Managing Directors: Thomas Golob, Hans Pieter Gieszen, Martin Robert Stockman USt-ID-No.: DE 197947163 From letzterfreiercoolername at googlemail.com Fri Apr 9 15:22:03 2010 From: letzterfreiercoolername at googlemail.com (Peter Fischer) Date: Fri, 9 Apr 2010 17:22:03 +0200 Subject: Error on http://varnish-cache.org/wiki/VCLExampleDisguiseServer Message-ID: Hi list! http://varnish-cache.org/wiki/VCLExampleDisguiseServer seems to be misleading... Simply copying, pasting & adjusting the example given on this TRAC page raises:

> Message from VCC-compiler:

Variable 'obj.http.Server' not accessible in method 'vcl_fetch'.

At: (input Line 213 Pos 11)
    unset obj.http.Server;
----------###############-

Running VCC-compiler failed, exit 1

'obj' above should probably be 'beresp', because

sub vcl_fetch {
    /* hide behind old Apache mask */
    unset beresp.http.Server;
    set beresp.http.Server = "Apache/1.3.8";
    [...]
}

works for me with the desired result! Varnish version is a

> ~# /opt/varnish/sbin/varnishd -V

varnishd (varnish-2.1 SVN 4640:4641)
Copyright (c) 2006-2009 Linpro AS / Verdens Gang AS

compiled from source on a

> ~# lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 8.04.1
Release: 8.04
Codename: hardy

installation. Cheers, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From thebog at gmail.com Fri Apr 9 15:36:59 2010 From: thebog at gmail.com (thebog) Date: Fri, 9 Apr 2010 17:36:59 +0200 Subject: Error on http://varnish-cache.org/wiki/VCLExampleDisguiseServer In-Reply-To: References: Message-ID: Thank you for noticing this, and letting us know! I have now quickly updated the page to reflect this. YS Anders Berg On Fri, Apr 9, 2010 at 5:22 PM, Peter Fischer wrote: > Hi list! > http://varnish-cache.org/wiki/VCLExampleDisguiseServer seems to be > misleading... > Simply copying, pasting & adjusting the example given on this TRAC page raises: >> >> Message from VCC-compiler: >> >> Variable 'obj.http.Server' not accessible in method 'vcl_fetch'. >> >> At: (input Line 213 Pos 11) >> >> unset obj.http.Server; >> >> ----------###############- >> >> Running VCC-compiler failed, exit 1 > > 'obj' above should probably be 'beresp', because >> >> sub vcl_fetch { >> >> /* hide behind old Apache mask */ >> >> unset beresp.http.Server; >> >> set beresp.http.Server = "Apache/1.3.8"; >> >> [...] >> >> } > > works for me with the desired result! > Varnish version is a >> >> ~# /opt/varnish/sbin/varnishd -V >> >> varnishd (varnish-2.1 SVN 4640:4641) >> >> Copyright (c) 2006-2009 Linpro AS / Verdens Gang AS > > compiled from source on a >> >> ~# lsb_release -a >> >> No LSB modules are available. >> >> Distributor ID: Ubuntu >> >> Description: Ubuntu 8.04.1 >> >> Release: 8.04 >> >> Codename: hardy > > installation. > Cheers, > Peter > > _______________________________________________ > varnish-dev mailing list > varnish-dev at varnish-cache.org > http://lists.varnish-cache.org/mailman/listinfo/varnish-dev > From yang at knownsec.com Sun Apr 11 03:43:35 2010 From: yang at knownsec.com (jilong yang) Date: Sun, 11 Apr 2010 11:43:35 +0800 Subject: I want to check POST data and deliver body data; do I need to modify the CNT_recv and CNT_deliver functions? Message-ID: I have been using a Lua program script in VCL via inline C. It can check the req.header data now, but req and obj can't give me the POST data and the delivered body data, so I should modify cache_center.c to get at the data. I can't be sure which function is the best starting point, though. Can anyone guide me? Thanks very much! -------------- next part -------------- An HTML attachment was scrubbed... URL: From phk at phk.freebsd.dk Sun Apr 11 20:33:51 2010 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Sun, 11 Apr 2010 20:33:51 +0000 Subject: production of Varnish Documentation Message-ID: <27051.1271018031@critter.freebsd.dk> I have spent some time since VUG2 checking out how various open source projects have dealt with documentation, and I have reached a conclusion which I will share with you in a moment. The first school of thought on documentation is the one we subscribe to in Varnish right now: "Documentation schmocumentation..." It does not work for anybody. The second school is the "Write a {La}TeX document" school, where the documentation is seen as a stand-alone product, which is produced independently. This works great for PDF output, and sucks royally for HTML and TXT output. The third school is the "Literate programming" school, which abandons readability of BOTH the program source code AND the documentation source, which seems to be one of the best access protections one can put on the source code of either.
The fourth school is the "DoxyGen" school, which lets a program collect a mindless list of hyperlinked variable, procedure, class and file names, and calls that "documentation". And the fifth school is anything that uses a file format that cannot be put into a version control system because it is binary and non-diff'able. It doesn't matter if it is OpenOffice, LyX or Word; a non-diffable doc source is a no-go with programmers. Quite frankly, none of these works very well in practice. One of the very central issues is that writing documentation must not become a big and clear context-switch from programming. That precludes special graphical editors, browser-based (wiki!) formats etc. Yes, if you write documentation for half your workday, that works, but if you write code most of your workday, it does not. Trust me on this, I have 25 years of experience avoiding such tools. I found one project which has thought radically about the problem, and their reasoning is interesting, and quite attractive to me:

1. .TXT files are the lingua franca of computers; even if you are logged in with TELNET using IP over Avian Carriers (which is more widespread in Norway than you would think) you can read documentation in a .TXT format.

2. .TXT is the most restrictive typographical format, so rather than trying to neuter a high-level format into .TXT, it is smarter to make the .TXT the source, and reinterpret it structurally into the more capable formats.

In other words: we are talking about the "ReStructuredText" of the Python project. Unless there is something I have totally failed to spot, that is going to be the new documentation platform in Varnish. Take a peek at the Python docs, and try pressing the "show source" link at the bottom of the left menu (link to a random python doc page): http://docs.python.org/py3k/reference/expressions.html Dependency-wise, that means you can edit docs with no special tools; you need python+docutils to format HTML and a LaTeX (pdflatex?)
to produce PDFs, something I only expect to happen on the project server on a regular basis. I can live with that; I might even rewrite the VCC scripts from Tcl to Python in that case. Comments, inputs... ? Poul-Henning -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From jack at facebook.com Sun Apr 11 23:35:13 2010 From: jack at facebook.com (Jack Lindamood) Date: Sun, 11 Apr 2010 16:35:13 -0700 Subject: [PATCH] Random director tries all backends before giving up Message-ID: <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2@SC-MBXC1.TheFacebook.com> The following is a patch I've made to varnish that I hope improves the random director, which anyone's welcome to use (even varnish trunk?). My motivation was to reduce the number of vcl_error calls when a director is mostly good. You can get the entire patch at this link. http://github.com/cep21/Varnish/commit/6f5e98143ac2636504d9febf574b14c3c1a072fc Here's the commit message: Random director tries all backends before giving up Summary: The current random director gives up when it can't get an FD to the backend it wants `retries` times in a row. Rather than give up and return NULL, which is guaranteed to cause a vcl_error, as a last-ditch effort we try all other healthy backends until we get one that works. This is mostly useful in the window after a backend server dies and before the health check fails enough times to mark the backend unhealthy. Backwards Compatibility = Not strictly backwards compatible. In cases where the old code would have fallen through to vcl_error, this will give a shot at getting a good result. Performance = In the worst case, this will add extra calls for getting an FD, but only in situations that would otherwise end in vcl_error. Test Plan: New varnish unit test. It fails with the old code and works with this new code.
-------------- next part -------------- An HTML attachment was scrubbed... URL: From aotto at mosso.com Mon Apr 12 02:51:27 2010 From: aotto at mosso.com (Adrian Otto) Date: Sun, 11 Apr 2010 19:51:27 -0700 Subject: [PATCH] Random director tries all backends before giving up In-Reply-To: <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2@SC-MBXC1.TheFacebook.com> References: <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2@SC-MBXC1.TheFacebook.com> Message-ID: Jack, This approach is probably not a good idea if (a) you have a large cluster, (b) a heavily loaded cluster, and/or (c) your backends are sensitive to overload. You are likely to trigger a cascading failure. It might be smarter to have a configurable number of backends to try... perhaps 2 or 3. Imagine if you have 50 backends. There is no point in trying 50 times to find a healthy backend. Chances are that if 25% of your backends are down, trying more is just going to exacerbate the problem. Adrian On Apr 11, 2010, at 4:35 PM, Jack Lindamood wrote: > The following is a patch I've made to varnish that I hope improves the random director, which anyone's welcome to use (even varnish trunk?). My motivation was to reduce the number of vcl_error calls when a director is mostly good. You can get the entire patch at this link. > > http://github.com/cep21/Varnish/commit/6f5e98143ac2636504d9febf574b14c3c1a072fc > > Here's the commit message: > > Random director tries all backends before giving up > > Summary: > The current random director gives up when it can't get an FD to the backend it wants `retries` times in a row. Rather than give up and return NULL, which is guaranteed to cause a vcl_error, as a last-ditch effort we try all other healthy backends until we get one that works. This is mostly useful in the window after a backend server dies and before the health check fails enough times to mark the backend unhealthy. > > Backwards Compatibility = Not strictly backwards compatible.
In cases where the old code would have fallen through to vcl_error, this will give a shot at getting a good result. > > Performance = In the worst case, this will add extra calls for getting an FD, but only in situations that would otherwise end in vcl_error. > > Test Plan: New varnish unit test. It fails with the old code and works with this new code. > > > _______________________________________________ > varnish-dev mailing list > varnish-dev at varnish-cache.org > http://lists.varnish-cache.org/mailman/listinfo/varnish-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From tfheen at varnish-software.com Mon Apr 12 07:13:41 2010 From: tfheen at varnish-software.com (Tollef Fog Heen) Date: Mon, 12 Apr 2010 09:13:41 +0200 Subject: production of Varnish Documentation In-Reply-To: <27051.1271018031@critter.freebsd.dk> (Poul-Henning Kamp's message of "Sun, 11 Apr 2010 20:33:51 +0000") References: <27051.1271018031@critter.freebsd.dk> Message-ID: <87tyrhcfdm.fsf@qurzaw.linpro.no> ]] Poul-Henning Kamp | 1. .TXT files are the lingua franca of computers, even if | you are logged with TELNET using IP over Avian Carriers | (Which is more widespread in Norway than you would think) | you can read documentation in a .TXT format. | | 2. .TXT is the most restrictive typographical format, so | rather than trying to neuter a high-level format into .TXT, | it is smarter to make the .TXT the source, and reinterpret | it structurally into the more capable formats. Those are good principles. | In other words: we are talking about the "ReStructuredText" of the | Python project. [...] | Comments, inputs... ? At the risk of bikeshedding here, have you looked at [markdown][1]? I've always found rst looking really funny when I've read it, to the extent that I've gone "what's wrong with this text file? It has odd backticks and trailing underlines". (As a bonus here, we can move the wiki towards using markdown or rst instead of trac's built-in syntax.
This can be done with either format, so it should not really be a factor in choosing between markup formats.) [1] http://daringfireball.net/projects/markdown/syntax -- Tollef Fog Heen Varnish Software t: +47 21 54 41 73 From phk at phk.freebsd.dk Mon Apr 12 07:23:29 2010 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Mon, 12 Apr 2010 07:23:29 +0000 Subject: production of Varnish Documentation In-Reply-To: Your message of "Mon, 12 Apr 2010 09:13:41 +0200." <87tyrhcfdm.fsf@qurzaw.linpro.no> Message-ID: <64092.1271057009@critter.freebsd.dk> In message <87tyrhcfdm.fsf at qurzaw.linpro.no>, Tollef Fog Heen writes: >]] Poul-Henning Kamp > >At the risk of bikeshedding here, have you looked at [markdown][1]? Yes, and I didn't like it; it is basically a simplified HTML markup and, for instance, does not have a sensible way to make tables that I can find. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From kristian at varnish-software.com Mon Apr 12 12:38:52 2010 From: kristian at varnish-software.com (Kristian Lyngstol) Date: Mon, 12 Apr 2010 14:38:52 +0200 Subject: [PATCH] Expose the backend of a simple director Message-ID: <20100412123849.GA22330@kjeks.varnish-software.com> This is needed if a director (like the dns director) needs a closer look at each backend to determine which to use, i.e. look at the IP or similar. PS: It seems a bit wrong to expose vdi_simple_SOMETHING directly, but it also seemed strange to do VBE_Something. More semantics than anything, though.
---
 varnish-cache/bin/varnishd/cache.h         |    1 +
 varnish-cache/bin/varnishd/cache_backend.c |   14 ++++++++++++++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/varnish-cache/bin/varnishd/cache.h b/varnish-cache/bin/varnishd/cache.h
index 2561a0d..35e286d 100644
--- a/varnish-cache/bin/varnishd/cache.h
+++ b/varnish-cache/bin/varnishd/cache.h
@@ -464,6 +464,7 @@
 void VBE_ClosedFd(struct sess *sp);
 void VBE_RecycleFd(struct sess *sp);
 void VBE_AddHostHeader(const struct sess *sp);
 void VBE_Poll(void);
+struct backend *vdi_simple_get_backend(const struct director *d);

 /* cache_backend_cfg.c */
 void VBE_Init(void);
diff --git a/varnish-cache/bin/varnishd/cache_backend.c b/varnish-cache/bin/varnishd/cache_backend.c
index ed35fec..a0c54c6 100644
--- a/varnish-cache/bin/varnishd/cache_backend.c
+++ b/varnish-cache/bin/varnishd/cache_backend.c
@@ -498,6 +498,20 @@
 vdi_simple_healthy(double now, const struct director *d, uintptr_t target)
 {
 	return (vbe_Healthy(now, target, vs->backend));
 }

+/* Reveals the real backend of a simple director if needed. */
+struct backend *
+vdi_simple_get_backend(const struct director *d)
+{
+	CHECK_OBJ_NOTNULL(d, DIRECTOR_MAGIC);
+	struct vdi_simple *vs, *vs2;
+
+	vs2 = d->priv;
+	if (vs2->magic != VDI_SIMPLE_MAGIC)
+		return NULL;
+	CAST_OBJ_NOTNULL(vs, d->priv, VDI_SIMPLE_MAGIC);
+	return vs->backend;
+}
+
 /*lint -e{818} not const-able */
 static void
 vdi_simple_fini(struct director *d)
--
1.5.4.3
-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From kristian at varnish-software.com Mon Apr 12 12:45:08 2010 From: kristian at varnish-software.com (Kristian Lyngstol) Date: Mon, 12 Apr 2010 14:45:08 +0200 Subject: Using TimeUnit() Message-ID: <20100412124506.GB22330@kjeks.varnish-software.com> In the DNS director I need to be able to set a TTL for caching DNS.
Since this is set when the director is parsed, it's not relative to anything. For this, I've used TimeUnit(), which does exactly what I need. However, as that's a static function I either need to wrap it, rename it or find some better alternative. Am I missing something obvious here? I'm assuming I am since I'm not very friendly with the VCC. I'm attaching the "patch" (if you can call it that) that I use in the dns director right now, which does the trick (I added the prototype in the relevant .c-files temporarily):

diff --git a/varnish-cache/lib/libvcl/vcc_parse.c b/varnish-cache/lib/libvcl/vcc_parse.c
index 36fec17..7e669b8 100644
--- a/varnish-cache/lib/libvcl/vcc_parse.c
+++ b/varnish-cache/lib/libvcl/vcc_parse.c
@@ -63,7 +63,7 @@ static void Cond_0(struct tokenlist *tl);
  * Recognize and convert units of time, return seconds.
  */
-static double
+double
 TimeUnit(struct tokenlist *tl)
 {
 	double sc = 1.0;

-- Kristian -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From phk at phk.freebsd.dk Mon Apr 12 12:58:28 2010 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Mon, 12 Apr 2010 12:58:28 +0000 Subject: Using TimeUnit() In-Reply-To: Your message of "Mon, 12 Apr 2010 14:45:08 +0200." <20100412124506.GB22330@kjeks.varnish-software.com> Message-ID: <79162.1271077108@critter.freebsd.dk> In message <20100412124506.GB22330 at kjeks.varnish-software.com>, Kristian Lyngstol writes: >Am I missing something obvious here? I'm assuming I am since I'm not very >friendly with the VCC. Shouldn't you be using vcc_TimeVal() to get the number _and_ unit ? -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
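[Editor's note: the choice Kristian describes above -- removing `static` from TimeUnit() versus wrapping it -- is a generic C trade-off. Below is a minimal sketch of the wrapper variant; the names and the simplified unit table are invented for illustration and are not the actual VCC code.]

```c
#include <string.h>

/* The helper keeps file-local linkage, as TimeUnit() does today.
 * The unit table is a made-up simplification for illustration. */
static double
time_unit(const char *tok)
{
	if (strcmp(tok, "s") == 0)
		return (1.0);
	if (strcmp(tok, "m") == 0)
		return (60.0);
	if (strcmp(tok, "h") == 0)
		return (3600.0);
	if (strcmp(tok, "d") == 0)
		return (86400.0);
	return (-1.0);	/* unknown unit */
}

/* Public wrapper: other files link against this symbol only, so the
 * helper itself never has to lose its `static`. */
double
vcc_time_unit(const char *tok)
{
	return (time_unit(tok));
}
```

The cost of the wrapper is one extra (trivially inlinable) call; the benefit is that the original function's definition and its internal callers stay untouched.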
From kristian at varnish-software.com Mon Apr 12 13:05:45 2010 From: kristian at varnish-software.com (Kristian Lyngstol) Date: Mon, 12 Apr 2010 15:05:45 +0200 Subject: Using TimeUnit() In-Reply-To: <79162.1271077108@critter.freebsd.dk> References: <20100412124506.GB22330@kjeks.varnish-software.com> <79162.1271077108@critter.freebsd.dk> Message-ID: <20100412130544.GC22330@kjeks.varnish-software.com> On Mon, Apr 12, 2010 at 12:58:28PM +0000, Poul-Henning Kamp wrote: > In message <20100412124506.GB22330 at kjeks.varnish-software.com>, Kristian Lyngstol writes: > > >Am I missing something obvious here? I'm assuming I am since I'm not very > >friendly with the VCC. > > Shouldn't you be using vcc_TimeVal() to get the number _and_ unit ? I probably should, but this is where VCC-fu comes into play: vcc_TimeVal returns void and prints what I want with Fb(), whereas TimeUnit and vcc_DoubleVal return data in a normal fashion. I don't really grasp how to fetch from whatever Fb() writes to - or why I would want to go down that road. - Kristian -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From jack at facebook.com Mon Apr 12 22:11:54 2010 From: jack at facebook.com (Jack Lindamood) Date: Mon, 12 Apr 2010 15:11:54 -0700 Subject: [PATCH] Random director tries all backends before giving up In-Reply-To: References: <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2@SC-MBXC1.TheFacebook.com> Message-ID: <5C8A0918F4B0CC41B41F7C2DFE855B6007204F9160@SC-MBXC1.TheFacebook.com> Thanks for the feedback. While not an issue in my case, a configuration parameter that limits the number of backends to try could be useful for others. I don't know how most people use varnish, but potentially triggering vcl_error when a single backend shuts down is probably undesirable behavior for most users.
From: varnish-dev-bounces at varnish-cache.org [mailto:varnish-dev-bounces at varnish-cache.org] On Behalf Of Adrian Otto Sent: Sunday, April 11, 2010 7:51 PM To: varnish-dev at varnish-cache.org Subject: Re: [PATCH] Random director tries all backends before giving up Jack, This approach is probably not a good idea if (a) you have a large cluster, (b) a heavily loaded cluster, and/or (c) your backends are sensitive to overload. You are likely to trigger a cascading failure. It might be smarter to have a configurable number of backends to try... perhaps 2 or 3. Imagine if you have 50 backends. There is no point in trying 50 times to find a healthy backend. Chances are that if 25% of your backends are down, trying more is just going to exacerbate the problem. Adrian On Apr 11, 2010, at 4:35 PM, Jack Lindamood wrote: The following is a patch I've made to varnish that I hope improves the random director, which anyone's welcome to use (even varnish trunk?). My motivation was to reduce the number of vcl_error calls when a director is mostly good. You can get the entire patch at this link. http://github.com/cep21/Varnish/commit/6f5e98143ac2636504d9febf574b14c3c1a072fc Here's the commit message: Random director tries all backends before giving up Summary: The current random director gives up when it can't get an FD to the backend it wants `retries` times in a row. Rather than give up and return NULL, which is guaranteed to cause a vcl_error, as a last-ditch effort we try all other healthy backends until we get one that works. This is mostly useful in the window after a backend server dies and before the health check fails enough times to mark the backend unhealthy. Backwards Compatibility = Not strictly backwards compatible. In cases where the old code would have fallen through to vcl_error, this will give a shot at getting a good result.
Performance = In the worst case, this will add extra calls for getting an FD, but only in situations that would otherwise end in vcl_error. Test Plan: New varnish unit test. It fails with the old code and works with this new code. _______________________________________________ varnish-dev mailing list varnish-dev at varnish-cache.org http://lists.varnish-cache.org/mailman/listinfo/varnish-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From phk at phk.freebsd.dk Tue Apr 13 18:53:45 2010 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Tue, 13 Apr 2010 18:53:45 +0000 Subject: [PATCH] Random director tries all backends before giving up In-Reply-To: Your message of "Sun, 11 Apr 2010 16:35:13 MST." <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2@SC-MBXC1.TheFacebook.com> Message-ID: <46033.1271184825@critter.freebsd.dk> In message <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2 at SC-MBXC1.TheFacebook.com >Summary: > >The current random director gives up when it can't get an FD to the backend >it wants `retries` times in a row. Rather than give up and return NULL, which >is guaranteed to cause a vcl_error, as a last-ditch effort we try all other >healthy backends until we get one that works. This is mostly useful in >the window after a backend server dies and before the health check >fails enough times to mark the backend unhealthy. I'm not sure this is a good idea, and my worry is that making it a good idea requires far too many config handles to make any sense. But I'm willing to be persuaded by good arguments. At one level, I would really love if we could move director policy into VCL, but my attempts to write even a mockup have failed to produce something sensible.
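[Editor's note: for readers following the thread, the policy in Jack's patch can be modelled outside Varnish. The sketch below is an illustrative stand-in, not Varnish source: `struct backend`, `pick_backend` and `can_connect` are invented names, with `can_connect` playing the role of VBE_GetFd() yielding a usable file descriptor.]

```c
#include <stdlib.h>

/* Illustrative model only -- not Varnish code. */
struct backend {
	int healthy;		/* what the health checks currently say */
	int can_connect;	/* stands in for VBE_GetFd() succeeding */
};

/* Patched policy: up to `retries` random picks; if none yields a
 * connection, scan every healthy backend once before giving up. */
int
pick_backend(const struct backend *b, int n, int retries)
{
	int i, j;

	for (i = 0; i < retries; i++) {
		j = rand() % n;
		if (b[j].healthy && b[j].can_connect)
			return (j);
	}
	/* The patch's last-ditch effort: try them all in order. */
	for (j = 0; j < n; j++)
		if (b[j].healthy && b[j].can_connect)
			return (j);
	return (-1);	/* old behaviour: caller falls through to vcl_error */
}
```

With, say, three backends of which only the last accepts connections, the fallback scan finds it even when every random pick is unlucky; Adrian's objection is that on a large, loaded cluster this extra scanning can pile connection attempts onto already struggling backends.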
From jack at facebook.com Tue Apr 13 20:02:49 2010 From: jack at facebook.com (Jack Lindamood) Date: Tue, 13 Apr 2010 13:02:49 -0700 Subject: [PATCH] Random director tries all backends before giving up In-Reply-To: <46033.1271184825@critter.freebsd.dk> References: Your message of "Sun, 11 Apr 2010 16:35:13 MST." <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2@SC-MBXC1.TheFacebook.com> <46033.1271184825@critter.freebsd.dk> Message-ID: <5C8A0918F4B0CC41B41F7C2DFE855B6007208D8DB7@SC-MBXC1.TheFacebook.com> Thanks for the comments. After thinking about it more, here's an alternative patch that should solve my need for the original [PATCH], resolve Adrian's issue by not increasing the calls to VBE_GetFd, and address an XXX inside the code. http://github.com/cep21/Varnish/commit/5723dfeb8754d371b52445ad457ff6b752efc875 -----Original Message----- From: phk at critter.freebsd.dk [mailto:phk at critter.freebsd.dk] On Behalf Of Poul-Henning Kamp Sent: Tuesday, April 13, 2010 11:54 AM To: Jack Lindamood Cc: varnish-dev at varnish-cache.org Subject: Re: [PATCH] Random director tries all backends before giving up In message <5C8A0918F4B0CC41B41F7C2DFE855B6007204F8FE2 at SC-MBXC1.TheFacebook.com >Summary: > >The current random director gives up when it can't get an FD to the backend >it wants `retries` times in a row. Rather than give up and return NULL, which >is guaranteed to cause a vcl_error, as a last-ditch effort we try all other >healthy backends until we get one that works. This is mostly useful in >the window after a backend server dies and before the health check >fails enough times to mark the backend unhealthy. I'm not sure this is a good idea, and my worry is that making it a good idea requires far too many config handles to make any sense. But I'm willing to be persuaded by good arguments. At one level, I would really love if we could move director policy into VCL, but my attempts to write even a mockup have failed to produce something sensible.
--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Sun Apr 18 15:40:39 2010
From: slink at schokola.de (Nils Goroll)
Date: Sun, 18 Apr 2010 17:40:39 +0200
Subject: Alignment of ws allocations
Message-ID: <4BCB27F7.9040304@schokola.de>

Hi Poul-Henning,

could you provide a brief explanation on your decision to make all WS allocations (void *) aligned? http://varnish-cache.org/ticket/665#comment:2

I had considered this solution, but I had thought aligning char * would just be a waste of ws space. Or is there anything I don't know?

Thank you, Nils

From slink at schokola.de Sun Apr 18 15:51:55 2010
From: slink at schokola.de (Nils Goroll)
Date: Sun, 18 Apr 2010 17:51:55 +0200
Subject: production of Varnish Documentation
In-Reply-To: <27051.1271018031@critter.freebsd.dk>
References: <27051.1271018031@critter.freebsd.dk>
Message-ID: <4BCB2A9B.5060900@schokola.de>

Hi,

> http://docs.python.org/py3k/reference/expressions.html

This looks pretty cool. First I had thought this wouldn't be very different from wiki markup or pod, but it's definitely much more legible in its source form.

Sounds like another very good VDD (Varnish Design Decision :) ).

Nils

From phk at phk.freebsd.dk Mon Apr 19 06:43:19 2010
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Mon, 19 Apr 2010 06:43:19 +0000
Subject: Alignment of ws allocations
In-Reply-To: Your message of "Sun, 18 Apr 2010 17:40:39 +0200." <4BCB27F7.9040304@schokola.de>
Message-ID: <5322.1271659399@critter.freebsd.dk>

In message <4BCB27F7.9040304 at schokola.de>, Nils Goroll writes:

>Hi Poul-Henning,
>
>could you provide a brief explanation on your decision to make all WS
>allocations (void *) aligned? http://varnish-cache.org/ticket/665#comment:2
>
>I had considered this solution, but I had thought aligning char * would just be
>a waste of ws space. Or is there anything I don't know?

There are two reasons:

1. It's easier to not have a split policy.
2. Most memcpy() implementations work best with aligned start addresses.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From sky at crucially.net Tue Apr 20 20:42:14 2010
From: sky at crucially.net (Artur Bergman)
Date: Tue, 20 Apr 2010 13:42:14 -0700
Subject: MTX_CONTESTS
Message-ID:

Hey,

I just upgraded to 2.1 and noticed the following: on two servers getting identical traffic, 2.1 was getting about 3.5x more MTX_CONTESTs than 2.0.6.

2.0.6:

time varnishlog | grep -i MTX_CONTEST | head -1000 | cut -c 22- | sort | uniq -c | sort -rn
    603 "MTX_CONTEST(smf_alloc,storage_file.c,633,&sc->mtx)"
    135 "MTX_CONTEST(smf_trim,storage_file.c,676,&sc->mtx)"
    125 "MTX_CONTEST(smf_free,storage_file.c,698,&sc->mtx)"
     69 "MTX_CONTEST(BAN_NewObj,cache_ban.c,444,&ban_mtx)"
     38 "MTX_CONTEST(EXP_Insert,cache_expire.c,188,&exp_mtx)"
     13 "MTX_CONTEST(SES_Charge,cache_session.c,242,&stat_mtx)"
      3 "MTX_CONTEST(exp_timer,cache_expire.c,343,&exp_mtx)"
      3 "MTX_CONTEST(BAN_DestroyObj,cache_ban.c,477,&ban_mtx)"
      2 "MTX_CONTEST(wrk_thread,cache_pool.c,314,&wq[u]->mtx)"
      2 "MTX_CONTEST(VBE_new_bereq,cache_backend.c,150,&VBE_mtx)"
      2 "MTX_CONTEST(VBE_free_bereq,cache_backend.c,185,&VBE_mtx)"
      2 "MTX_CONTEST(exp_timer,cache_expire.c,288,&exp_mtx)"
      1 "MTX_CONTEST(VBE_RecycleFd,cache_backend.c,387,&b->mtx)"
      1 "MTX_CONTEST(VBE_GetVbe,cache_backend.c,303,&b->mtx)"
      1 "MTX_CONTEST(EXP_Rearm,cache_expire.c,249,&exp_mtx)"

real 0m24.915s
user 0m5.723s
sys 0m0.051s

2.1.0:

root at varnish-s1:~# time varnishlog | grep -i MTX_CONTEST | head -1000 | cut -c 22- | sort | uniq -c | sort -rn
    418 "MTX_CONTEST(smf_alloc,storage_file.c,469,&sc->mtx)"
    207 "MTX_CONTEST(smf_free,storage_file.c,534,&sc->mtx)"
    170 "MTX_CONTEST(hcb_lookup,hash_critbit.c,454,&hcb_mtx)"
    158 "MTX_CONTEST(smf_trim,storage_file.c,512,&sc->mtx)"
     12 "MTX_CONTEST(BAN_NewObj,cache_ban.c,367,&ban_mtx)"
      8 "MTX_CONTEST(hcb_deref,hash_critbit.c,408,&hcb_mtx)"
      7 "MTX_CONTEST(EXP_Insert,cache_expire.c,138,&exp_mtx)"
      7 "MTX_CONTEST(BAN_DestroyObj,cache_ban.c,401,&ban_mtx)"
      2 "MTX_CONTEST(exp_timer,cache_expire.c,265,&exp_mtx)"
      1 "MTX_CONTEST(wrk_thread_real,cache_pool.c,197,&wq[u]->mtx)"
      1 "MTX_CONTEST(WRK_SumStat,cache_pool.c,112,&wstat_mtx)"
      1 "MTX_CONTEST(WRK_Queue,cache_pool.c,255,&wq[u]->mtx)"
      1 "MTX_CONTEST(VBE_ReleaseConn,cache_backend.c,87,&VBE_mtx)"
      1 "MTX_CONTEST(vbe_Healthy,cache_backend.c,279,&b->mtx)"
      1 "MTX_CONTEST(vbe_GetVbe,cache_backend.c,326,&b->mtx)"
      1 "MTX_CONTEST(VBE_DropRefConn,cache_backend_cfg.c,146,&b->mtx)"
      1 "MTX_CONTEST(HSH_Deref,cache_hash.c,639,&oh->mtx)"
      1 "MTX_CONTEST(hcb_lookup,hash_critbit.c,435,&oh->mtx)"
      1 "MTX_CONTEST(EXP_Rearm,cache_expire.c,224,&exp_mtx)"
      1 "MTX_CONTEST(EXP_NukeOne,cache_expire.c,339,&exp_mtx)"

real 0m7.537s
user 0m1.783s
sys 0m0.040s

Artur

From jbq at caraldi.com Tue Apr 20 22:43:08 2010
From: jbq at caraldi.com (Jean-Baptiste Quenot)
Date: Wed, 21 Apr 2010 00:43:08 +0200
Subject: production of Varnish Documentation
In-Reply-To: <27051.1271018031@critter.freebsd.dk>
References: <27051.1271018031@critter.freebsd.dk>
Message-ID:

2010/4/11 Poul-Henning Kamp :
>
> Yes, if you write documentation for half your workday, that works,
> but if you write code most of your workday, that does not work.
> Trust me on this, I have 25 years of experience avoiding using such
> tools.
>
> I found one project which has thought radically about the problem,
> and their reasoning is interesting, and quite attractive to me:
>
>        1. .TXT files are the lingua franca of computers, even if
>        you are logged with TELNET using IP over Avian Carriers
>        (Which is more widespread in Norway than you would think)
>        you can read documentation in a .TXT format.
>
>        2. .TXT is the most restrictive typographical format, so
>        rather than trying to neuter a high-level format into .TXT,
>        it is smarter to make the .TXT the source, and reinterpret
>        it structurally into the more capable formats.
>
> In other words: we are talking about the "ReStructuredText" of the
> Python project.

+1

I use RST+Docutils with Sphinx (a tool to build the documentation with just a simple 'make') for my project, and I'm quite happy with it.

--
Jean-Baptiste Quenot

From jfrias at gmail.com Thu Apr 22 06:38:52 2010
From: jfrias at gmail.com (Javier Frias)
Date: Thu, 22 Apr 2010 02:38:52 -0400
Subject: Backend Polling + connection timeouts = weird issue
Message-ID:

So I'm having a weird sporadic issue, and I'm wondering if varnish can mark a backend down on timed out connections that are not the backend probe. I ask since I'm not getting failed backend messages in the varnish log, yet I'm getting bursts of 503s.

My backend probe looks like this:

backend backend_XXX {
  .host = "10.100.118.238";
  .port = "80";
  .probe = {
    .url = "/robots.txt?__site=XXXXX";
    .timeout = 5000ms;
    .interval = 5s;
    .window = 10;
    .threshold = 4;
  }
}

After one of these timed out connections...
201 ReqStart     c 10.100.114.196 60540 1416455775
201 RxRequest    c GET
201 RxURL        c /XXXXsomeXXXXurl
201 RxProtocol   c HTTP/1.0
201 RxHeader     c User-Agent: Wget/1.10.2 (Red Hat modified)
201 RxHeader     c Accept: */*
201 RxHeader     c Host: xxxx.xxxxx.com
201 RxHeader     c Connection: Keep-Alive
201 VCL_call     c recv
201 VCL_return   c lookup
201 VCL_call     c hash
201 VCL_return   c hash
201 VCL_call     c miss
201 VCL_return   c fetch
----> 201 FetchError c no backend connection <---------
201 VCL_call     c error
201 VCL_return   c deliver
201 Length       c 839
201 VCL_call     c deliver
201 VCL_return   c deliver
201 TxProtocol   c HTTP/1.1
201 TxStatus     c 503
201 TxResponse   c Service Unavailable
201 TxHeader     c Server: Varnish
201 TxHeader     c Retry-After: 0
201 TxHeader     c Content-Length: 839
201 TxHeader     c Date: Thu, 22 Apr 2010 02:03:46 GMT
201 TxHeader     c X-Varnish: 1416455775
201 TxHeader     c Age: 0
201 TxHeader     c Via: 1.1 varnish
201 TxHeader     c Connection: close
201 ReqEnd       c 1416455775 1271901825.787379980 1271901826.187721968 0.000025034 0.400313139 0.000028849
201 SessionClose c error
201 StatSess     c 10.100.114.196 60540 0 1 1 0 0 0 195 839

A typical timed out connection, which I'm verifying on my backend as having taken close to 80 seconds, looks like this:

825637170 WorkThread - 924 1271901801.804220915 0.000025988 1.712481976 0.000046015
64 SessionOpen   c 10.100.114.196 60537 0.0.0.0:80
64 ReqStart      c 10.100.114.196 60537 1416454378
64 RxRequest     c GET
64 RxURL         c /XXXsomeXXXsimilarXXXurl
64 RxProtocol    c HTTP/1.0
64 RxHeader      c User-Agent: Wget/1.10.2 (Red Hat modified)
64 RxHeader      c Accept: */*
64 RxHeader      c Host: sec.floridatoday.com
64 RxHeader      c Connection: Keep-Alive
64 VCL_call      c recv
64 VCL_return    c lookup
64 VCL_call      c hash
64 VCL_return    c hash
64 VCL_call      c miss
64 VCL_return    c fetch
211 BackendOpen  b backend_XXX 10.100.118.149 12967 10.100.118.238 80
64 Backend       c 211 backend_XXX backend_XXX
211 TxRequest    b GET
211 TxURL        b /XXXsomeXXXsimilarXXXurl
211 TxProtocol   b HTTP/1.1
211 TxHeader     b User-Agent: Wget/1.10.2 (Red Hat modified)
211 TxHeader     b Accept: */*
211 TxHeader     b Host: xxxx.xxxxxx.com
211 TxHeader     b X-Forwarded-For: 10.100.114.196
211 TxHeader     b X-Varnish: 1416454378
117 Debug        c "herding"
202 Debug        c "herding"

.... never comes back... (again, going back to the backend shows this request as having taken 80 seconds, so I'm assuming varnish gave up after X seconds)

Has anyone seen something like this? It does recover... but it seems like a timed-out connection is causing the whole backend to be marked as failed, which sucks big time for me since my backends are behind a load balancer.

Should I try connection restarts? Would they even help in this case?
-Javier

PS: Unrelated, but I tried registering on trac to submit my varnishncsa extended logging patch, and it seems the registration is not working on varnish-cache.org.

From sebastiaan.jansen at kpn.com Thu Apr 22 06:54:46 2010
From: sebastiaan.jansen at kpn.com (sebastiaan.jansen at kpn.com)
Date: Thu, 22 Apr 2010 08:54:46 +0200
Subject: Backend Polling + connection timeouts = weird issue
In-Reply-To:
References:
Message-ID:

Hi,

The bursts of 503's caught my attention: we had a similar problem on a pool of servers that had trouble getting their content from NFS/NAS. After placing the content on another NFS server, the problem got resolved. The error description is not exactly the same as yours (our conditions are probably different), but as a side note: when you don't know your web servers have a problem with NFS, it's a bit hard to spot.

Good luck with finding the problem.

With kind regards,
Sebastiaan

So I'm having a weird sporadic issue, and I'm wondering if varnish can mark a backend down on timed out connections that are not the backend probe. I ask since I'm not getting failed backend messages in the varnish log, yet I'm getting bursts of 503s.

My backend probe looks like this:

backend backend_XXX {
  .host = "10.100.118.238";
  .port = "80";
  .probe = {
    .url = "/robots.txt?__site=XXXXX";
    .timeout = 5000ms;
    .interval = 5s;
    .window = 10;
    .threshold = 4;
  }
}

After one of these timed out connections...
-Javier

PS: Unrelated, but I tried registering on trac to submit my varnishncsa extended logging patch, and it seems the registration is not working on varnish-cache.org.

_______________________________________________
varnish-dev mailing list
varnish-dev at varnish-cache.org
http://lists.varnish-cache.org/mailman/listinfo/varnish-dev

From jim.salem at acquia.com Tue Apr 27 14:47:56 2010
From: jim.salem at acquia.com (Jim Salem)
Date: Tue, 27 Apr 2010 10:47:56 -0400
Subject: Missing log entries
Message-ID:

We're finding a number of cases where the number of requests shown in the Varnish logs is LOWER than what the logs of the backend Apache web servers show. We've tried adjusting the shared memory used for logging, but this hasn't seemed to make a difference.

Is this a bug others have seen? We're running v2.0.

Jim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kristian at varnish-software.com Tue Apr 27 14:53:15 2010
From: kristian at varnish-software.com (Kristian Lyngstøl)
Date: Tue, 27 Apr 2010 16:53:15 +0200
Subject: Missing log entries
In-Reply-To:
References:
Message-ID:

On Tue, Apr 27, 2010 at 4:47 PM, Jim Salem wrote:
> We're finding a number of cases where the number of requests shown in the
> Varnish logs is LOWER than what the logs of the backend Apache web servers
> show. We've tried adjusting the shared memory used for logging but this
> hasn't seemed to make a difference.
>
> Is this a bug others have seen? We're running v2.0.

Sounds like you are using return(pipe); without Connection: close, which would not be able to show all connections in the log, since it only shuffles bytes. That would be my first guess, at least.
- Kristian

From eran at sandler.co.il Tue Apr 13 08:58:13 2010
From: eran at sandler.co.il (Eran Sandler)
Date: Tue, 13 Apr 2010 08:58:13 -0000
Subject: setting obj.prefetch
Message-ID:

Hello all,

Prior to Varnish 2.1 I was able to set obj.prefetch in the vcl_fetch function, so that different items being fetched got different prefetch times. I'm not sure if that was the correct place for Varnish prior to 2.1, but it worked.

In Varnish 2.1, obj in vcl_fetch is read-only. What is the best place to set the prefetch value? vcl_recv?

Thanks,
Eran
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From augusto at jadedpixel.com Mon Apr 19 14:21:05 2010
From: augusto at jadedpixel.com (Augusto Becciu)
Date: Mon, 19 Apr 2010 14:21:05 -0000
Subject: Varnish leaking memory?
Message-ID:

Hey guys,

I'm running Varnish 2.0.6 on a production Linux server with the following parameters:

varnishd -P /var/run/varnishd.pid -a 0.0.0.0:2000 -T 127.0.0.1:6082 -w 200,2000 -s malloc,14G -p lru_interval=5 -f /etc/varnish/varnish.vcl

The server has a total of 17G of RAM and no swap at all. When the cache reached its limit of 14G, varnish started nuking objects and everything looked good. However, it still continues to allocate more and more memory without releasing it. Memory usage growth is very slow, though. Also, the committed memory is around 24.2G now and keeps growing steadily. Is that normal?

Here's the output of varnishstat -1:

uptime                    416905          .   Child uptime
client_conn              1445639       3.47   Client connections accepted
client_drop                    0       0.00   Connection dropped, no sess
client_req              10977250      26.33   Client requests received
cache_hit                8316582      19.95   Cache hits
cache_hitpass                  2       0.00   Cache hits for pass
cache_miss               2660450       6.38   Cache misses
backend_conn               15761       0.04   Backend conn. success
backend_unhealthy              0       0.00   Backend conn. not attempted
backend_busy                   0       0.00   Backend conn. too many
backend_fail             3990792       9.57   Backend conn. failures
backend_reuse            2644716       6.34   Backend conn. reuses
backend_toolate            15693       0.04   Backend conn. was closed
backend_recycle          2660401       6.38   Backend conn. recycles
backend_unused                 0       0.00   Backend conn. unused
fetch_head                     0       0.00   Fetch head
fetch_length             2659445       6.38   Fetch with Length
fetch_chunked                  0       0.00   Fetch chunked
fetch_eof                      0       0.00   Fetch EOF
fetch_bad                      0       0.00   Fetch had bad headers
fetch_close                    0       0.00   Fetch wanted close
fetch_oldhttp                  0       0.00   Fetch pre HTTP/1.1 closed
fetch_zero                     0       0.00   Fetch zero len
fetch_failed                   0       0.00   Fetch failed
n_srcaddr                      0          .   N struct srcaddr
n_srcaddr_act                  0          .   N active struct srcaddr
n_sess_mem                    98          .   N struct sess_mem
n_sess                         9          .   N struct sess
n_object                  547004          .   N struct object
n_objecthead              547002          .   N struct objecthead
n_smf                          0          .   N struct smf
n_smf_frag                     0          .   N small free smf
n_smf_large                    0          .   N large free smf
n_vbe_conn  18446744073709551613          .   N struct vbe_conn
n_bereq                       73          .   N struct bereq
n_wrk                        400          .   N worker threads
n_wrk_create                 400       0.00   N worker threads created
n_wrk_failed                   0       0.00   N worker threads not created
n_wrk_max                      0       0.00   N worker threads limited
n_wrk_queue                    0       0.00   N queued work requests
n_wrk_overflow                 0       0.00   N overflowed work requests
n_wrk_drop                     0       0.00   N dropped work requests
n_backend                      5          .   N backends
n_expired                 213948          .   N expired objects
n_lru_nuked              1899467          .   N LRU nuked objects
n_lru_saved                    0          .   N LRU saved objects
n_lru_moved              8001207          .   N LRU moved objects
n_deathrow                     0          .   N objects on deathrow
losthdr                      155       0.00   HTTP header overflows
n_objsendfile                  0       0.00   Objects sent with sendfile
n_objwrite               9855464      23.64   Objects sent with write
n_objoverflow                  0       0.00   Objects overflowing workspace
s_sess                   1445638       3.47   Total Sessions
s_req                   10977251      26.33   Total Requests
s_pipe                         2       0.00   Total pipe
s_pass                        27       0.00   Total pass
s_fetch                  2660410       6.38   Total fetch
s_hdrbytes            4952053163   11878.13   Total header bytes
s_bodybytes         185370067623  444633.83   Total body bytes
sess_closed              1144421       2.75   Session Closed
sess_pipeline                  1       0.00   Session Pipeline
sess_readahead                 0       0.00   Session Read Ahead
sess_linger             10660820      25.57   Session Linger
sess_herd                3188841       7.65   Session herd
shm_records            608766401    1460.20   SHM records
shm_writes              22498443      53.97   SHM writes
shm_flushes                  487       0.00   SHM flushes due to overflow
shm_cont                    9651       0.02   SHM MTX contention
shm_cycles                   244       0.00   SHM cycles through buffer
sm_nreq                        0       0.00   allocator requests
sm_nobj                        0          .   outstanding allocations
sm_balloc                      0          .   bytes allocated
sm_bfree                       0          .   bytes free
sma_nreq                 7220096      17.32   SMA allocator requests
sma_nobj                 1093874          .   SMA outstanding allocations
sma_nbytes           15032380353          .   SMA outstanding bytes
sma_balloc           68029864865          .   SMA bytes allocated
sma_bfree            52997484512          .   SMA bytes free
sms_nreq                     213       0.00   SMS allocator requests
sms_nobj                       0          .   SMS outstanding allocations
sms_nbytes                     0          .   SMS outstanding bytes
sms_balloc                 99681          .   SMS bytes allocated
sms_bfree                  99681          .   SMS bytes freed
backend_req              2660474       6.38   Backend requests made
n_vcl                          1       0.00   N vcl total
n_vcl_avail                    1       0.00   N vcl available
n_vcl_discard                  0       0.00   N vcl discarded
n_purge                        1          .   N total active purges
n_purge_add                    1       0.00   N new purges added
n_purge_retire                 0       0.00   N old purges deleted
n_purge_obj_test               0       0.00   N objects tested
n_purge_re_test                0       0.00   N regexps tested against
n_purge_dups                   0       0.00   N duplicate purges removed
hcb_nolock                     0       0.00   HCB Lookups without lock
hcb_lock                       0       0.00   HCB Lookups with lock
hcb_insert                     0       0.00   HCB Inserts
esi_parse                      0       0.00   Objects ESI parsed (unlock)
esi_errors                     0       0.00   ESI parse errors (unlock)

Thanks in advance.

Augusto
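[Editorial note: a quick consistency check on the SMA counters in the output above. Outstanding bytes should equal bytes allocated minus bytes freed, and here they sit within a few KB of the configured -s malloc,14G ceiling, which suggests the growth in committed memory happens outside what the sma_* counters account for (per-allocation overhead in malloc itself is a common suspect, though that is speculation). The arithmetic, with the numbers copied verbatim from the output:]

```python
# Counters copied verbatim from the varnishstat -1 output above.
sma_balloc = 68029864865   # SMA bytes allocated (cumulative)
sma_bfree  = 52997484512   # SMA bytes free (cumulative)
sma_nbytes = 15032380353   # SMA outstanding bytes

# Outstanding = allocated - freed: the counters agree with each other.
assert sma_balloc - sma_bfree == sma_nbytes

# The outstanding bytes are within a few KB of 14 GiB (14 * 2**30),
# i.e. right at the -s malloc,14G limit.
assert abs(sma_nbytes - 14 * 2**30) < 10000
```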