From slink at schokola.de Fri Sep 2 18:10:50 2016
From: slink at schokola.de (Nils Goroll)
Date: Fri, 2 Sep 2016 20:10:50 +0200
Subject: hit-for-pass vs. hit-for-miss
Message-ID: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>

(quick brain dump before I need to rush out)

Geoff discovered this interesting consequence of a recent important change of phk's, and we just spent an hour discussing it:

Before commit 9f272127c6fba76e6758d7ab7ba6527d9aad98b0, a hit-for-pass object led to a pass; now it's a miss. IIUC the discussions we had on a trip to Amsterdam, phk's main motivation was to eliminate the potentially deadly effect unintentionally created hfps had on cache efficiency: no matter what, for the lifetime of the hfp, all requests hitting that object became passes.

So, in short:

- previously: an uncacheable response wins and sticks for its ttl
- now: a cacheable response wins and sticks for its ttl

Or even shorter:

- previously: hit-for-pass
- now: hit-for-miss

From the perspective of a cache, the "now" case seems clearly favorable, but now Geoff has discovered that the reverse is true for a case which is important to one of our projects:

- varnish is running in "do what the backend says" mode
- backend devs know when to make responses uncacheable
- a huge (600MB) backend response is uncacheable, but client-validatable

So this is the case for the previous semantics:

- 1st request creates the hfp
- 2nd request from client carries INM
- gets passed with INM
- 304 from backend goes to client

What we have now is:

- 1st request creates the hfm (hit-for-miss)
- 2nd request is a miss
- INM gets removed
- backend sends 600MB unnecessarily

We've thought about a couple of options which I want to write down before they expire from my cache:

* decide in vcl_hit

sub vcl_hit {
    if (obj.uncacheable) {
        if (obj.http.Criterium) {
            return (miss);
        } else {
            return (pass);
        }
    }
}

* Do not strip INM/IMS for miss, and have a bereq property telling if it was a misspass

- core code keeps INM/IMS
- builtin.vcl strips them in vcl_miss
- can check for hitpass in vcl_miss
- any 304 backend response is forced uncacheable
- interesting detail: can it still create a hfp object?

BUT: how would we know in vcl_miss whether we see *client* inm/ims or varnish-generated inm/ims?

So at this point I only see the YAS option.

Nils

From geoff at uplex.de Fri Sep 2 19:21:56 2016
From: geoff at uplex.de (Geoff Simmons)
Date: Fri, 2 Sep 2016 21:21:56 +0200
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
Message-ID: <053d7161-6baa-8963-5bfb-4de66f757575@uplex.de>

On 09/02/2016 08:10 PM, Nils Goroll wrote:
>
> - varnish is running in "do what the backend says" mode
> - backend devs know when to make responses uncacheable

Just so that you know what that's all about, to understand the use case:

We have a project in which TTLs for caching are determined almost exclusively from Cache-Control (there are some exceptions, but that is far and away the most common case). This means that we run Varnish with -t 0, and set beresp.ttl=0s if neither Cache-Control nor Expires is present in a response. So setting a TTL for caching is entirely the responsibility of the devs.

That means in turn that we do not have long sequences of regexen in vcl_recv to decide: lookup for these and pass for those. I'm sure everyone knows how unmaintainable that sort of thing can become, and for us it's absolutely out of the question. There are far too many dev teams, choosing and changing their URL patterns all the time; we could never keep the patterns up to date, nor do we want Varnish to do all that regexing on every request.

If you think about it, this is the way the world should be -- devs are fully responsible for setting TTLs with Cache-Control.
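(A minimal sketch of the setup just described, assuming Varnish 4.x VCL syntax; the subroutine body is illustrative, not the project's actual VCL.)

```vcl
vcl 4.0;

# Illustrative sketch only: with varnishd started with -t 0 there is
# no implicit default TTL, and responses that carry neither
# Cache-Control nor Expires are given ttl=0 and marked uncacheable.
sub vcl_backend_response {
    if (!beresp.http.Cache-Control && !beresp.http.Expires) {
        set beresp.ttl = 0s;
        set beresp.uncacheable = true;
    }
}
```

Whether beresp.uncacheable should also be set here determines whether a hit-for-pass/hit-for-miss marker object gets created, which is exactly the behavior under discussion in this thread.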
But what that means is that we have no way of knowing in advance which requests will be passed -- meaning, we don't know it in vcl_recv solely on the basis of the client request. I'd have to check, but it wouldn't surprise me if "return(pass)" does not appear anywhere in our VCL. Setting beresp.uncacheable is the only way we can determine that subsequent requests for the same URL can be passed.

That leads us to the problem that Nils described -- prior to the change that he mentioned, requests with bereq.uncacheable were pass, so req headers were filtered into the bereq as in the pass case. In particular, conditional headers (IMS/INM) were not filtered out. "If the client asks for validation, and we're passing, then we ask the backend for validation, and if the backend says 304, pass it along." After the change, requests with bereq.uncacheable are misses, so the conditional headers are filtered out.

So our situation now is:

* Setting beresp.uncacheable is the only way we can know that a request can be passed.
* But now it has become impossible to pass along conditional requests under these circumstances, even if both the client and backend are able to do everything right with IMS/INM and Last-Modified/ETag.

I'll let Nils' mail describe the rest. What we're missing now is: set up the bereq for pass, in particular allowing the conditional headers, not because we've set it to pass in vcl_recv (which we can't do), but because we've set beresp.uncacheable for a previous beresp.
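(The "strip in builtin.vcl instead of core" option from Nils' first mail could be sketched roughly as below. This is hypothetical: it assumes core code kept the conditional headers on the bereq and exposed the hit-for-pass origin of the miss to VCL, neither of which is the case today.)

```vcl
vcl 4.0;

# Hypothetical sketch of the "strip in builtin.vcl" option; assumes
# core code no longer strips IMS/INM itself, and that something like
# bereq.uncacheable is readable in vcl_miss (it is not today).
sub vcl_miss {
    if (!bereq.uncacheable) {
        # a plain miss: ask the backend for a full response
        # suitable for insertion into the cache
        unset bereq.http.If-Modified-Since;
        unset bereq.http.If-None-Match;
    }
    # otherwise keep the client's conditional headers, so that a
    # backend 304 can be passed along as in the old hit-for-pass
    # behavior
}
```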
Thanks,
Geoff

-- 
** * * UPLEX - Nils Goroll Systemoptimierung
Scheffelstraße 32
22301 Hamburg
Tel +49 40 2880 5731
Mob +49 176 636 90917
Fax +49 40 42949753
http://uplex.de

From slink at schokola.de Wed Sep 7 17:01:53 2016
From: slink at schokola.de (Nils Goroll)
Date: Wed, 7 Sep 2016 19:01:53 +0200
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
Message-ID: <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>

Hi,

TL;DR: please shout if you think you need the choice between hit-for-pass and hit-for-miss.
On 02/09/16 20:10, Nils Goroll wrote:
> - previously: hit-for-pass
> - now: hit-for-miss

On IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)

I would like to understand whether bringing back hit-for-pass is a specific requirement we have (in which case a vmod producing quite some overhead would be the right thing to do) or whether others have more cases which would justify a generic solution in varnish core like this one:

> sub vcl_hit {
>     if (obj.uncacheable) {
>         if (obj.http.Criterium) {
>             return (miss);
>         } else {
>             return (pass);
>         }
>     }
> }

Thank you, Nils

*) using a secondary cache index (maybe as in the xkey vmod), mark objects we want to pass for in backend_response, check in recv if the object is marked, and return(pass) if so.

From geoff at uplex.de Wed Sep 7 19:41:30 2016
From: geoff at uplex.de (Geoffrey Simmons)
Date: Wed, 7 Sep 2016 15:41:30 -0400
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
Message-ID: <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>

This is worth responding to from vacation.

The "specific requirement we have" is a consequence of applying the HTTP protocol the way it was meant to be used -- responses specify their cacheability, without VCL having to intervene to classify which requests go to lookup or pass, typically with a sequence of regexes matched against URL patterns in vcl_recv.

That may indeed be unusual, but I see that as a sad commentary on the state of web developers' knowledge about caching and HTTP, not as somebody's peculiar requirement. It would strike me as rather odd if a caching proxy has to treat it as a special case when backends actually do something the right way (like always set Cache-Control to determine TTLs).
Geoff

Sent from my iPhone

> On Sep 7, 2016, at 1:01 PM, Nils Goroll wrote:
>
> Hi,
>
> TL;DR please shout if you think you need the choice between hit-for-pass and
> hit-for-miss.
>
>> On 02/09/16 20:10, Nils Goroll wrote:
>> - previously: hit-for-pass
>> - now: hit-for-miss
>
> on IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)
>
> I would like to understand if bringing back hit-for-pass is a specific
> requirement we have (in which case a vmod producing quite some overhead would be
> the right thing to do) or if others have more cases which would justify a
> generic solution in varnish core like this one:
>
>> sub vcl_hit {
>>     if (obj.uncacheable) {
>>         if (obj.http.Criterium) {
>>             return (miss);
>>         } else {
>>             return (pass);
>>         }
>>     }
>> }
>
> Thank you, Nils
>
> *) using a secondary cache index (maybe as in the xkey vmod), mark objects we
> want to pass for in backend_response, check in recv if the object is marked and
> return(pass) if so.
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev

From phk at phk.freebsd.dk Wed Sep 7 20:00:17 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 07 Sep 2016 20:00:17 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>
Message-ID: <59498.1473278417@critter.freebsd.dk>

--------
In message <4E80A314-241C-4566-B6BA-61B72DC2CAA6 at uplex.de>, Geoffrey Simmons writes:

>That may indeed be unusual, but I see that as a sad commentary on the
>state of web developers' knowledge about caching and HTTP. Not as
>somebody's peculiar requirement.

Oh, man, if you think that is a sad commentary, wait till I get started...
>It would strike me as rather odd if a caching proxy has to treat
>it as a special case when backends actually do something the right
>way (like always set Cache-Control to determine TTLs).

IMO the unusual detail is that it takes several minutes to fetch the object from the backend, and that people are trying to find a way to mitigate/work around that special case.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Wed Sep 7 20:02:07 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 07 Sep 2016 20:02:07 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
Message-ID: <59952.1473278527@critter.freebsd.dk>

--------
In message <98489638-edc5-617d-a467-eae7dddd8d9e at schokola.de>, Nils Goroll writes:

>on IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)

Only as a workaround for this corner case where the backend takes minutes to reply. It is not my impression that there is an issue with "normal" backend response times, but correct me if I'm wrong?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Wed Sep 7 20:10:08 2016
From: slink at schokola.de (Nils Goroll)
Date: Wed, 7 Sep 2016 22:10:08 +0200
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: <59952.1473278527@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <59952.1473278527@critter.freebsd.dk>
Message-ID: <30d24015-cd7f-709f-0160-566d9a6609bf@schokola.de>

On 07/09/16 22:02, Poul-Henning Kamp wrote:
> It is not my impression that there is an issue with "normal" backend
> response times, but correct me if I'm wrong ?

The general issue is that for a miss we should remove IMS/INM, but not for a pass. So any backend function where validation is efficient while delivery is not will be hit.

Nils

From geoff at uplex.de Wed Sep 7 20:32:29 2016
From: geoff at uplex.de (Geoffrey Simmons)
Date: Wed, 7 Sep 2016 16:32:29 -0400
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <59498.1473278417@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk>
Message-ID: 

For one thing, I wouldn't call it a workaround to use Last-Modified/ETag and IMS/INM to avoid sending a very large response unless it's necessary. Validation is meant to solve just that sort of problem.

But that's not the main thing I'm talking about. We have required devs to always decide about TTLs for caching themselves, including TTL=0, by setting Cache-Control. I know that's unusual, but it's the way things really ought to be. However, that eliminates the possibility of knowing at recv time that a request can be passed.

Again, I know it's common to have something like lists of regexen in vcl_recv to decide what goes to lookup and what goes to pass. But it shouldn't *have* to be that way -- ideally, it shouldn't be necessary at all.

Without HFP, we're left with no way at all of knowing which requests can be passed. Which is bad enough, but it also in turn eliminates the possibility of using IMS/INM for uncacheable responses.
Altogether, it means that we ask devs to use HTTP for caching as the protocol intends, but then it's a problem case for a caching proxy for HTTP (and Nils has indicated that a VMOD might have performance problems). That IMO gets us into Alice in Wonderland territory.

Sent from my iPhone

> On Sep 7, 2016, at 4:00 PM, Poul-Henning Kamp wrote:
>
> --------
> In message <4E80A314-241C-4566-B6BA-61B72DC2CAA6 at uplex.de>, Geoffrey Simmons writes:
>
>> That may indeed be unusual, but I see that as a sad commentary on the
>> state of web developers' knowledge about caching and HTTP. Not as
>> somebody's peculiar requirement.
>
> Oh, man, if you think that is a sad commentary, wait till I get started...
>
>> It would strike me as rather odd if a caching proxy has to treat
>> it as a special case when backends actually do something the right
>> way (like always set Cache-Control to determine TTLs).
>
> IMO the unusual detail is that it takes several minutes to fetch
> the object from the backend, and that people are trying to find
> a way to mitigate/work around that special case.
>
> -- 
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk at FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Wed Sep 7 21:20:02 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 07 Sep 2016 21:20:02 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk>
Message-ID: <69098.1473283202@critter.freebsd.dk>

--------
In message , Geoffrey Simmons writes:

The problem with HFP as opposed to HFM is that it is waaaay outside the specs, our trouble assigning TTLs being a very blunt hint about that.
I'm all for improving stuff in general, and any ideas/patches are welcome, just don't try to claim that HFP was more RFC- or for that matter POLA-compliant than HFM...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Thu Sep 8 08:26:28 2016
From: slink at schokola.de (Nils Goroll)
Date: Thu, 8 Sep 2016 10:26:28 +0200
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <69098.1473283202@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk>
Message-ID: 

On 07/09/16 23:20, Poul-Henning Kamp wrote:
> I'm all for improving stuff in general, and any ideas/patches are
> welcome

What's your opinion about the suggestion to add obj.uncacheable + return miss/pass in vcl_hit (see the vcl mock in my initial email)?

The hard part about this is that we currently have an unsolved issue with miss from hit anyway, which we last discussed on Monday. A transcript and summary of my understanding is here:

https://github.com/varnishcache/varnish-cache/issues/1799

At this point I think that solving this will also be key to getting vcl control over hfp/hfm.

Nils

From phk at phk.freebsd.dk Thu Sep 8 08:33:39 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 08:33:39 +0000
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk>
Message-ID: <97505.1473323619@critter.freebsd.dk>

--------
In message , Nils Goroll writes:

>On 07/09/16 23:20, Poul-Henning Kamp wrote:
>> I'm all for improving stuff in general, and any ideas/patches are
>> welcome
>
>what's your opinion about the suggestion to add obj.uncacheable + return
>miss/pass in vcl_hit (see vcl mock in my initial email).

It doesn't "feel" quite right, and it certainly does not seem like something which is so obviously correct that I feel comfortable slamming it in a week before a major release...

>The hard part about this is that we currently have an unsolved issue with miss
>from hit anyway, [...]

Yes, that is the tricky one, and like the other one, I don't think we have anything which is good enough to stuff it in a week before the release.

The good news is that there are only six months until the next release...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Thu Sep 8 08:47:32 2016
From: slink at schokola.de (Nils Goroll)
Date: Thu, 8 Sep 2016 10:47:32 +0200
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: <97505.1473323619@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk>
Message-ID: 

>> On 07/09/16 23:20, Poul-Henning Kamp wrote:
>>> I'm all for improving stuff in general, and any ideas/patches are
>>> welcome
>>
>> what's your opinion about the suggestion to add obj.uncacheable + return
>> miss/pass in vcl_hit (see vcl mock in my initial email).
>
> It doesn't "feel" quite right

What exactly doesn't?

> it certainly does not seem like
> something which is so obviously correct that I feel comfortable
> slamming it in a week before a major release...

This is nothing personal, but the change to hfm with no fallback to the previous logic doesn't seem so obviously correct that I feel comfortable having a major release with it.

Nils

From phk at phk.freebsd.dk Thu Sep 8 08:50:45 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 08:50:45 +0000
Subject: 5.0 release one week away
Message-ID: <2075.1473324645@critter.freebsd.dk>

We are one week from the 5.0 release, and that means that we should have been in a "code-slush" for some time and in a hard "code-freeze" for a week. Life didn't happen that way, and 5.0 isn't going to be anybody's favourite release, but we're releasing it next Thursday anyway.

Focus for the next seven days should be on bugs, docs and relnotes.

I'll update to trunk on v-c.o in a few moments; if anybody else has sites they could stick -trunk on, please do, and let us find most of the silly bugs before the release.

H2 is not even close to production ready, but at least one can play with it a bit.

I'm not sure yet whether the March '17 release should be 5.1 or 6.0, but I suggest we try to decide it before November 1st.
The panic related to my house-building project has resolved itself, and since the concrete elements won't arrive until late February, I'll have a chance to concentrate more on the next release.

Which reminds me: if anybody has ideas/leads about who I should hit up for some VML money, please drop me an email; I'm running about 3000 EUR/month below the usual activity level, and that isn't helping.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Thu Sep 8 08:53:30 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 08:53:30 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk>
Message-ID: <2874.1473324810@critter.freebsd.dk>

--------
In message , Nils Goroll writes:

>This is nothing personal, but the change to hfm with no fallback
>to the previous logic doesn't seem so obviously correct that I
>feel comfortable having a major release with it.

As I just said: 5.0 isn't going to be anybody's favourite release, but it is happening anyway, because there are people out there waiting for the part of it that actually works.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Thu Sep 8 09:01:23 2016
From: slink at schokola.de (Nils Goroll)
Date: Thu, 8 Sep 2016 11:01:23 +0200
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: <2874.1473324810@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk> <2874.1473324810@critter.freebsd.dk>
Message-ID: 

Let's get this back on track:

Personally I don't care whether or not a solution to the hfp vs. hfm problem is going to be in 5.0. For those people for whom this matters I have raised concerns, so at least the dev community should be aware of it.

As many people know, I love running master code and I care about having good code in master, no matter what magic date we have for a certain git branch command. For the specific project we talked about, we have the option to stick to 4.1.

So I'm fine with postponing this for a bit, but I want to see #1799 and this one solved, and I hope to have made sound suggestions to get us there. As always, if there are better suggestions, I'm all ears.

Nils

From phk at phk.freebsd.dk Thu Sep 8 09:03:13 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 09:03:13 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk> <2874.1473324810@critter.freebsd.dk>
Message-ID: <5517.1473325393@critter.freebsd.dk>

--------
In message , Nils Goroll writes:

>So I'm fine with postponing this for a bit, but I want to see #1799 and this one
>solved and I'd hope to have made sound suggestions to get us there. As always,
>if there are better suggestions, I'm all ears.

Agreed.
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From dridi at varni.sh Thu Sep 8 11:27:29 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 8 Sep 2016 13:27:29 +0200 Subject: [master] 169e162 Enforce that VCL names must be C-language identifiers ([A-Za-z][A-Za-z0-9_]*) In-Reply-To: References: Message-ID: On Thu, Sep 8, 2016 at 1:21 PM, Poul-Henning Kamp wrote: > > commit 169e162c7506614090a33faef7a0df38c6ffac7a > Author: Poul-Henning Kamp > Date: Thu Sep 8 11:19:39 2016 +0000 > > Enforce that VCL names must be C-language identifiers ([A-Za-z][A-Za-z0-9_]*) I think this commit breaks our packaging and should be solved before the release: https://github.com/varnishcache/pkg-varnish-cache/blob/master/redhat/varnish_reload_vcl#L79 It looks trivial enough. Best, Dridi From dridi at varni.sh Tue Sep 13 21:27:39 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 13 Sep 2016 23:27:39 +0200 Subject: New autoconf macros for VMOD maintainers Message-ID: Hello everyone, I just pushed 4 commits [1] to master for inclusion in Varnish 5.0 with what I hope is proper documentation [2,3]. I immediately back-ported the commits to the 4.1 branch, here is why: New macros are documented as "since 4.1.4", but future hypothetical changes to the macros will not be back-ported to 4.1 to ensure a clean upgrade path. For instance, something that would be in 4.1.5 and 5.0.1 but not in 5.0.0. Not getting it in the next 4.1 would be a loss IMO since it is still supported. Regarding the macros themselves, they only introduce new facilities for out-of-tree VMOD maintainers, and although it's been tested on many platforms, I hope maintainers will let us know if they break in unexpected ways. Expected failures are basically old versions of autotools installations. The "us" in "let us know" can safely be turned into me. 
From now on I volunteer for autotools maintenance. This libvmod-painkiller journey taught me a lot about this stuff, and although I don't quite like it, I can now appreciate it better. Once 5.0 is branched, I'll start working on cleaning master up so that only Varnish developers/contributors should be affected, in which case they can throw their crap at me (subtle joke hidden inside that last sentence).

If anyone has objections, let me know. It's not too late to revert the commits too.

Best,
Dridi

[1] https://github.com/varnishcache/varnish-cache/compare/edd4ebe...cd29eda
[2] https://github.com/varnishcache/varnish-cache/blob/cd29eda/varnish.m4#L160-L237
[3] https://github.com/varnishcache/varnish-cache/blob/cd29eda/varnish.m4#L244-L253

From slink at schokola.de Sat Sep 24 13:33:38 2016
From: slink at schokola.de (Nils Goroll)
Date: Sat, 24 Sep 2016 15:33:38 +0200
Subject: vcl_init, VMODs & workspaces
In-Reply-To: <8F64A004-1B72-4808-A402-887645B6D151@gmail.com>
References: <8F64A004-1B72-4808-A402-887645B6D151@gmail.com>
Message-ID: 

Just to follow up: we've got both workspace and PRIV_TASK for vcl events (including vcl_init) now in 5.0/master.

Nils

From slink at schokola.de Sat Sep 24 14:41:02 2016
From: slink at schokola.de (Nils Goroll)
Date: Sat, 24 Sep 2016 16:41:02 +0200
Subject: blob iterators, body iterators in particular
Message-ID: 

I've been pondering a bit whether we could have a generalized vmod interface to iterate over blob lists, and bodies in particular.

Ideally, I'd like to have a single interface for all of the following pseudo-vcl examples. It's not that I'd personally need all of these now; the hash(req.body) case is really the only one I need to get working ASAP (and the plan is to fix the bodyaccess vmod). But while at it, I couldn't avoid reflecting on how this could be solved for the general case.
So here's a vcl mock:

vcl_init {
    # vmod-re exists
    new re_evil = re.regex("SQL.*INJECTION");
}

vcl_recv {
    cache_req_body(1MB);
    # .match(STRING) exists, .matchb(BODY) doesn't
    if (re_evil.match(req.url) || re_evil.matchb(req.body)) {
        return (synth(400, "you're evil"));
    }
}

vcl_hash {
    if (req.method != "GET" && req.method != "HEAD") {
        # not possible ATM
        hash_blob(req.body);
    }
}

vcl_backend_response {
    # may be a stupid example, could not come up with anything better
    if (beresp.http.Content-Type == "image/png") {
        image.recompress(beresp.body);
    }
    # blobcode/blobdigest exist, but hashing a body is not possible
    set beresp.http.Etag =
        blobcode.encode(BASE64, blobdigest.hashb(MD5, beresp.body));
}

vcl_deliver {
    if (req.http.Cookie ~ "loggedin=true") {
        if (resp.http.Content-Type == "audio/mp3") {
            # another stupid example
            mp3.watermark(resp.body, req.http.UserId);
        }
    }
}

So the VCC declarations for all of the vmod methods/functions could use a common BLOB_LIST type:

# libvmod-re
$Method BOOL .matchb(BLOB_LIST)

# libvmod-image
$Function VOID .recompress(BLOB_LIST)

# libvmod-blobdigest
$Function BLOB hashb(ENUM {MD5, ...}, BLOB_LIST)

# libvmod-mp3
$Function VOID .watermark(BLOB_LIST, STRING)

Only one BLOB_LIST argument would be allowed per Function/Method.

The VCL/VMOD interface should have an init call, an iterator and a fini call. The thing passed when iterating could be the existing vmod_priv:

struct vmod_priv_iter;

typedef struct vmod_priv *vmod_priv_iter_f(const struct vmod_priv_iter *,
    const struct vmod_priv *);

enum vmod_priv_iter_state_e {
    VI_INIT,
    VI_ITER,
    VI_FINI
};

struct vmod_priv_iter {
    void                            *priv;
    enum vmod_priv_iter_state_e     state;
    vmod_priv_iter_f                *func;
};

The C type of BLOB_LIST would be struct vmod_priv_iter *.

The compiled VCL would then:

- alloc the vmod_priv_iter (on the stack?)
- zero it and set state=VI_INIT
- call the vmod function once, ignoring the return value
- the vmod function would alloc/init its priv data and fill in the priv and func members of the struct vmod_priv_iter
- compiled VCL would set VI_ITER and loop through the object, calling the vmod_priv_iter_f
  - a NULL return from the iterator means "have not changed"
  - otherwise the iterator function MAY modify the object (if writable from the context) by referencing or copying the returned vmod_priv, or copying/freeing it, as applicable
- compiled VCL would set state=VI_FINI and call the vmod function a last time, using the return value unless VOID

Regarding the interfaces with varnish core, we need to differentiate the use cases:

* vcl_recv {} / vcl_hash {} req.body access

We've got this as a storage object, so the iterator would wrap the vmod iterator in an objiterate_f -> _should_ be easy I think.

* vcl_backend_response {}

Trouble here is that we do not have the body yet, so in principle I see a couple of options, and I am having a hard time making up my mind which would be best:

- early fetch of the body, wrap the vmod iterator in a vfp (but where in the vfp stack would we put it?)
- early fetch of the body, use objiterate_f when done

Both would disable streaming for anything but a VOID return of the vmod iterator; the vfp option would allow streaming for a VOID return.

* vcl_deliver {}

Here we could use the objiterate_f again, but we would need to create some dummy OC_F_PRIVATE object, filling in the modified bits.

Nils

From slink at schokola.de Fri Sep 30 12:52:51 2016
From: slink at schokola.de (Nils Goroll)
Date: Fri, 30 Sep 2016 14:52:51 +0200
Subject: suggesting bugwash topic: hit-for-pass vs.
hit-for-miss
In-Reply-To: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
Message-ID: 

Hi,

Geoff and myself would like to get ahead with the design aspects of this issue, so we'd appreciate it if we could discuss it during the next bugwash.

It would be great if anyone interested could review the background info - my initial email, quoted in full below, has the top-level overview. For the option of a vcl_hit based decision, https://github.com/varnishcache/varnish-cache/issues/1799 is also relevant.

Thank you, Nils

On 02/09/16 20:10, Nils Goroll wrote:
> (quick brain dump before I need to rush out)
>
> Geoff discovered this interesting consequence of a recent important change of
> phk's and we just spent an hour discussing it:
>
> before commit 9f272127c6fba76e6758d7ab7ba6527d9aad98b0, a hit-for-pass object
> led to a pass; now it's a miss. IIUC the discussions we had on a trip to
> Amsterdam, phk's main motivation was to eliminate the potentially deadly effect
> unintentionally created hfps had on cache efficiency: No matter what, for the
> lifetime of the hfp, all requests hitting that object became passes.
>
> so, in short:
>
> - previously: an uncacheable response wins and sticks for its ttl
> - now: a cacheable response wins and sticks for its ttl
>
> or even shorter:
>
> - previously: hit-for-pass
> - now: hit-for-miss
>
> From the perspective of a cache, the "now" case seems clearly favorable, but now
> Geoff has discovered that the reverse is true for a case which is important to
> one of our projects:
>
> - varnish is running in "do what the backend says" mode
> - backend devs know when to make responses uncacheable
> - a huge (600MB) backend response is uncacheable, but client-validatable
>
> so this is the case for the previous semantics:
>
> - 1st request creates the hfp
> - 2nd request from client carries INM
> - gets passed with INM
> - 304 from backend goes to client
>
> What we have now is:
>
> - 1st request creates the hfm (hit-for-miss)
> - 2nd request is a miss
> - INM gets removed
> - backend sends 600MB unnecessarily
>
> We've thought about a couple of options which I want to write down before they
> expire from my cache:
>
> * decide in vcl_hit
>
> sub vcl_hit {
>     if (obj.uncacheable) {
>         if (obj.http.Criterium) {
>             return (miss);
>         } else {
>             return (pass);
>         }
>     }
> }
>
> * Do not strip INM/IMS for miss and have a bereq property telling if it was a misspass
>
> - core code keeps INM/IMS
> - builtin.vcl strips them in vcl_miss
> - can check for hitpass in vcl_miss
> - any 304 backend response is forced uncacheable
> - interesting detail: can it still create a hfp object?
>
> BUT: how would we know in vcl_miss whether we see
> *client* inm/ims or varnish-generated inm/ims?
>
> So at this point I only see the YAS option.
>
> Nils
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
>