From slink at schokola.de Fri Sep 2 18:10:50 2016
From: slink at schokola.de (Nils Goroll)
Date: Fri, 2 Sep 2016 20:10:50 +0200
Subject: hit-for-pass vs. hit-for-miss
Message-ID: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>

(quick brain dump before I need to rush out)

Geoff discovered this interesting consequence of a recent important change of phk's, and we just spent an hour discussing it:

Before commit 9f272127c6fba76e6758d7ab7ba6527d9aad98b0, a hit-for-pass object led to a pass; now it's a miss. IIUC the discussions we had on a trip to Amsterdam, phk's main motivation was to eliminate the potentially deadly effect unintentionally created hfps had on cache efficiency: no matter what, for the lifetime of the hfp, all requests hitting that object became passes.

So, in short:

- previously: an uncacheable response wins and sticks for its ttl
- now: a cacheable response wins and sticks for its ttl

Or even shorter:

- previously: hit-for-pass
- now: hit-for-miss

From the perspective of a cache, the "now" case seems clearly favorable, but now Geoff has discovered that the reverse is true for a case which is important to one of our projects:

- varnish is running in "do what the backend says" mode
- backend devs know when to make responses uncacheable
- a huge (600MB) backend response is uncacheable, but client-validatable

So this is the case for the previous semantics:

- 1st request creates the hfp
- 2nd request from client carries INM
- gets passed with INM
- 304 from backend goes to client

What we have now is:

- 1st request creates the hfm (hit-for-miss)
- 2nd request is a miss
- INM gets removed
- backend sends 600MB unnecessarily

We've thought about a couple of options which I want to write down before they expire from my cache:

* decide in vcl_hit

sub vcl_hit {
    if (obj.uncacheable) {
        if (obj.http.Criterium) {
            return (miss);
        } else {
            return (pass);
        }
    }
}

* Do not strip INM/IMS for miss, and have a bereq property telling if it was a misspass

- core code keeps INM/IMS
- builtin.vcl strips them in vcl_miss
- can check for hitpass in vcl_miss
- any 304 backend response is forced uncacheable
- interesting detail: can it still create a hfp object?

BUT: how would we know in vcl_miss whether we see *client* inm/ims or varnish-generated inm/ims?

So at this point I only see the YAS option.

Nils

From geoff at uplex.de Fri Sep 2 19:21:56 2016
From: geoff at uplex.de (Geoff Simmons)
Date: Fri, 2 Sep 2016 21:21:56 +0200
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
Message-ID: <053d7161-6baa-8963-5bfb-4de66f757575@uplex.de>

On 09/02/2016 08:10 PM, Nils Goroll wrote:
>
> - varnish is running in "do what the backend says" mode
> - backend devs know when to make responses uncacheable

Just so that you know what that's all about, to understand the use case:

We have a project in which TTLs for caching are determined almost exclusively from Cache-Control (there are some exceptions, but that is far and away the most common case). This means that we run Varnish with -t 0, and set beresp.ttl=0s if neither Cache-Control nor Expires is present in a response. So setting a TTL for caching is entirely the responsibility of the devs.

That means in turn that we do not have long sequences of regexen in vcl_recv to decide: lookup for these and pass for those. I'm sure everyone knows how unmaintainable that sort of thing can become, and for us it's absolutely out of the question. There are far too many dev teams, choosing and changing their URL patterns all the time; we could never keep the patterns up to date, nor do we want Varnish to do all that regexing on every request.

If you think about it, this is the way the world should be -- devs are fully responsible for setting TTLs with Cache-Control.
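(A minimal sketch of the setup just described, assuming Varnish 4.x VCL syntax; the subroutine body is illustrative, not the project's actual VCL.)

```vcl
vcl 4.0;

# Illustrative sketch only: with varnishd started with -t 0 there is
# no implicit default TTL, and responses that carry neither
# Cache-Control nor Expires are given ttl=0 and marked uncacheable.
sub vcl_backend_response {
    if (!beresp.http.Cache-Control && !beresp.http.Expires) {
        set beresp.ttl = 0s;
        set beresp.uncacheable = true;
    }
}
```

Whether beresp.uncacheable should also be set here determines whether a hit-for-pass/hit-for-miss marker object gets created, which is exactly the behavior under discussion in this thread.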
But what that means is that we have no way of knowing in advance which requests will be passed -- meaning, we don't know it in vcl_recv solely on the basis of the client request. I'd have to check, but it wouldn't surprise me if "return(pass)" does not appear anywhere in our VCL. Setting beresp.uncacheable is the only way we can determine that subsequent requests for the same URL can be passed.

That leads us to the problem that Nils described -- prior to the change that he mentioned, requests with bereq.uncacheable were pass, so req headers were filtered into the bereq as in the pass case. In particular, conditional headers (IMS/INM) were not filtered out. "If the client asks for validation, and we're passing, then we ask the backend for validation, and if the backend says 304, pass it along." After the change, requests with bereq.uncacheable are misses, so the conditional headers are filtered out.

So our situation now is:

* Setting beresp.uncacheable is the only way we can know that a request can be passed.
* But now it has become impossible to pass along conditional requests under these circumstances, even if both the client and backend are able to do everything right with IMS/INM and Last-Modified/ETag.

I'll let Nils' mail describe the rest. What we're missing now is: set up the bereq for pass, in particular allowing the conditional headers, not because we've set it to pass in vcl_recv (which we can't do), but because we've set beresp.uncacheable for a previous beresp.
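(The "strip in builtin.vcl instead of core" option from Nils' first mail could be sketched roughly as below. This is hypothetical: it assumes core code kept the conditional headers on the bereq and exposed the hit-for-pass origin of the miss to VCL, neither of which is the case today.)

```vcl
vcl 4.0;

# Hypothetical sketch of the "strip in builtin.vcl" option; assumes
# core code no longer strips IMS/INM itself, and that something like
# bereq.uncacheable is readable in vcl_miss (it is not today).
sub vcl_miss {
    if (!bereq.uncacheable) {
        # a plain miss: ask the backend for a full response
        # suitable for insertion into the cache
        unset bereq.http.If-Modified-Since;
        unset bereq.http.If-None-Match;
    }
    # otherwise keep the client's conditional headers, so that a
    # backend 304 can be passed along as in the old hit-for-pass
    # behavior
}
```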
Thanks,
Geoff

-- 
** * * UPLEX - Nils Goroll Systemoptimierung
Scheffelstraße 32
22301 Hamburg
Tel +49 40 2880 5731
Mob +49 176 636 90917
Fax +49 40 42949753
http://uplex.de

From slink at schokola.de Wed Sep 7 17:01:53 2016
From: slink at schokola.de (Nils Goroll)
Date: Wed, 7 Sep 2016 19:01:53 +0200
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
Message-ID: <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>

Hi,

TL;DR: please shout if you think you need the choice between hit-for-pass and hit-for-miss.
On 02/09/16 20:10, Nils Goroll wrote:
> - previously: hit-for-pass
> - now: hit-for-miss

On IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)

I would like to understand whether bringing back hit-for-pass is a specific requirement we have (in which case a vmod producing quite some overhead would be the right thing to do) or whether others have more cases which would justify a generic solution in varnish core like this one:

> sub vcl_hit {
>     if (obj.uncacheable) {
>         if (obj.http.Criterium) {
>             return (miss);
>         } else {
>             return (pass);
>         }
>     }
> }

Thank you, Nils

*) using a secondary cache index (maybe as in the xkey vmod), mark objects we want to pass for in backend_response, check in recv if the object is marked, and return(pass) if so.

From geoff at uplex.de Wed Sep 7 19:41:30 2016
From: geoff at uplex.de (Geoffrey Simmons)
Date: Wed, 7 Sep 2016 15:41:30 -0400
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
Message-ID: <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>

This is worth responding to from vacation.

The "specific requirement we have" is a consequence of applying the HTTP protocol the way it was meant to be used -- responses specify their cacheability, without VCL having to intervene to classify which requests go to lookup or pass, typically with a sequence of regexes matched against URL patterns in vcl_recv.

That may indeed be unusual, but I see that as a sad commentary on the state of web developers' knowledge about caching and HTTP, not as somebody's peculiar requirement. It would strike me as rather odd if a caching proxy has to treat it as a special case when backends actually do something the right way (like always set Cache-Control to determine TTLs).
Geoff

Sent from my iPhone

> On Sep 7, 2016, at 1:01 PM, Nils Goroll wrote:
>
> Hi,
>
> TL;DR please shout if you think you need the choice between hit-for-pass and
> hit-for-miss.
>
>> On 02/09/16 20:10, Nils Goroll wrote:
>> - previously: hit-for-pass
>> - now: hit-for-miss
>
> on IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)
>
> I would like to understand if bringing back hit-for-pass is a specific
> requirement we have (in which case a vmod producing quite some overhead would be
> the right thing to do) or if others have more cases which would justify a
> generic solution in varnish core like this one:
>
>> sub vcl_hit {
>>     if (obj.uncacheable) {
>>         if (obj.http.Criterium) {
>>             return (miss);
>>         } else {
>>             return (pass);
>>         }
>>     }
>> }
>
> Thank you, Nils
>
> *) using a secondary cache index (maybe as in the xkey vmod), mark objects we
> want to pass for in backend_response, check in recv if the object is marked and
> return(pass) if so.
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev

From phk at phk.freebsd.dk Wed Sep 7 20:00:17 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 07 Sep 2016 20:00:17 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de>
Message-ID: <59498.1473278417@critter.freebsd.dk>

--------
In message <4E80A314-241C-4566-B6BA-61B72DC2CAA6 at uplex.de>, Geoffrey Simmons writes:

>That may indeed be unusual, but I see that as a sad commentary on the
>state of web developers' knowledge about caching and HTTP. Not as
>somebody's peculiar requirement.

Oh, man, if you think that is a sad commentary, wait till I get started...
>It would strike me as rather odd if a caching proxy has to treat
>it as a special case when backends actually do something the right
>way (like always set Cache-Control to determine TTLs).

IMO the unusual detail is that it takes several minutes to fetch the object from the backend, and that people are trying to find a way to mitigate/work around that special case.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Wed Sep 7 20:02:07 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 07 Sep 2016 20:02:07 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de>
Message-ID: <59952.1473278527@critter.freebsd.dk>

--------
In message <98489638-edc5-617d-a467-eae7dddd8d9e at schokola.de>, Nils Goroll writes:

>on IRC, phk has suggested that we could bring back hit-for-pass in a vmod *)

Only as a workaround for this corner case where the backend takes minutes to reply. It is not my impression that there is an issue with "normal" backend response times, but correct me if I'm wrong?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Wed Sep 7 20:10:08 2016
From: slink at schokola.de (Nils Goroll)
Date: Wed, 7 Sep 2016 22:10:08 +0200
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: <59952.1473278527@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <59952.1473278527@critter.freebsd.dk>
Message-ID: <30d24015-cd7f-709f-0160-566d9a6609bf@schokola.de>

On 07/09/16 22:02, Poul-Henning Kamp wrote:
> It is not my impression that there is an issue with "normal" backend
> response times, but correct me if I'm wrong ?

The general issue is that for a miss we should remove IMS/INM, but not for a pass. So any backend function where validation is efficient while delivery is not will be hit.

Nils

From geoff at uplex.de Wed Sep 7 20:32:29 2016
From: geoff at uplex.de (Geoffrey Simmons)
Date: Wed, 7 Sep 2016 16:32:29 -0400
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <59498.1473278417@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk>
Message-ID: 

For one thing, I wouldn't call it a workaround to use Last-Modified/ETag and IMS/INM to avoid sending a very large response unless it's necessary. Validation is meant to solve just that sort of problem.

But that's not the main thing I'm talking about. We have required devs to always decide about TTLs for caching themselves, including TTL=0, by setting Cache-Control. I know that's unusual, but it's the way things really ought to be. However, that eliminates the possibility of knowing at recv time that a request can be passed.

Again, I know it's common to have something like lists of regexen in vcl_recv to decide what goes to lookup and what goes to pass. But it shouldn't *have* to be that way -- ideally, it shouldn't be necessary at all.

Without HFP, we're left with no way at all of knowing which requests can be passed. Which is bad enough, but it also in turn eliminates the possibility of using IMS/INM for uncacheable responses.
Altogether, it means that we ask devs to use HTTP for caching as the protocol intends, but then it's a problem case for a caching proxy for HTTP (and Nils has indicated that a VMOD might have performance problems). That IMO gets us into Alice in Wonderland territory.

Sent from my iPhone

> On Sep 7, 2016, at 4:00 PM, Poul-Henning Kamp wrote:
>
> --------
> In message <4E80A314-241C-4566-B6BA-61B72DC2CAA6 at uplex.de>, Geoffrey Simmons writes:
>
>> That may indeed be unusual, but I see that as a sad commentary on the
>> state of web developers' knowledge about caching and HTTP. Not as
>> somebody's peculiar requirement.
>
> Oh, man, if you think that is a sad commentary, wait till I get started...
>
>> It would strike me as rather odd if a caching proxy has to treat
>> it as a special case when backends actually do something the right
>> way (like always set Cache-Control to determine TTLs).
>
> IMO the unusual detail is that it takes several minutes to fetch
> the object from the backend, and that people are trying to find
> a way to mitigate/work around that special case.
>
> -- 
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk at FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Wed Sep 7 21:20:02 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Wed, 07 Sep 2016 21:20:02 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk>
Message-ID: <69098.1473283202@critter.freebsd.dk>

--------
In message , Geoffrey Simmons writes:

The problem with HFP as opposed to HFM is that it is waaaay outside the specs, our trouble assigning TTLs being a very blunt hint about that.
I'm all for improving stuff in general, and any ideas/patches are welcome, just don't try to claim that HFP was more RFC- or for that matter POLA-compliant than HFM...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Thu Sep 8 08:26:28 2016
From: slink at schokola.de (Nils Goroll)
Date: Thu, 8 Sep 2016 10:26:28 +0200
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: <69098.1473283202@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk>
Message-ID: 

On 07/09/16 23:20, Poul-Henning Kamp wrote:
> I'm all for improving stuff in general, and any ideas/patches are
> welcome

What's your opinion about the suggestion to add obj.uncacheable + return miss/pass in vcl_hit (see the vcl mock in my initial email)?

The hard part about this is that we currently have an unsolved issue with miss from hit anyway, which we last discussed on Monday. A transcript and summary of my understanding is here:

https://github.com/varnishcache/varnish-cache/issues/1799

At this point I think that solving this will also be key to getting vcl control over hfp/hfm.

Nils

From phk at phk.freebsd.dk Thu Sep 8 08:33:39 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 08:33:39 +0000
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk>
Message-ID: <97505.1473323619@critter.freebsd.dk>

--------
In message , Nils Goroll writes:

>On 07/09/16 23:20, Poul-Henning Kamp wrote:
>> I'm all for improving stuff in general, and any ideas/patches are
>> welcome
>
>what's your opinion about the suggestion to add obj.uncacheable + return
>miss/pass in vcl_hit (see vcl mock in my initial email).

It doesn't "feel" quite right, and it certainly does not seem like something which is so obviously correct that I feel comfortable slamming it in a week before a major release...

>The hard part about this is that we currently have an unsolved issue with miss
>from hit anyway, [...]

Yes, that is the tricky one, and like the other one, I don't think we have anything which is good enough to stuff it in a week before the release.

The good news is that there are only six months until the next release...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Thu Sep 8 08:47:32 2016
From: slink at schokola.de (Nils Goroll)
Date: Thu, 8 Sep 2016 10:47:32 +0200
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: <97505.1473323619@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk>
Message-ID: 

>> On 07/09/16 23:20, Poul-Henning Kamp wrote:
>>> I'm all for improving stuff in general, and any ideas/patches are
>>> welcome
>>
>> what's your opinion about the suggestion to add obj.uncacheable + return
>> miss/pass in vcl_hit (see vcl mock in my initial email).
>
> It doesn't "feel" quite right

What exactly doesn't?

> it certainly does not seem like
> something which is so obviously correct that I feel comfortable
> slamming it in a week before a major release...

This is nothing personal, but the change to hfm with no fallback to the previous logic doesn't seem so obviously correct that I feel comfortable having a major release with it.

Nils

From phk at phk.freebsd.dk Thu Sep 8 08:50:45 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 08:50:45 +0000
Subject: 5.0 release one week away
Message-ID: <2075.1473324645@critter.freebsd.dk>

We are one week from the 5.0 release, and that means that we should have been in a "code-slush" for some time and in a hard "code-freeze" for a week. Life didn't happen that way, and 5.0 isn't going to be anybody's favourite release, but we're releasing it next Thursday anyway.

Focus for the next seven days should be on bugs, docs and relnotes.

I'll update to trunk on v-c.o in a few moments; if anybody else has sites they could stick -trunk on, please do, and let us find most of the silly bugs before the release.

H2 is not even close to production ready, but at least one can play with it a bit.

I'm not sure yet whether the March '17 release should be 5.1 or 6.0, but I suggest we try to decide it before November 1st.
The panic related to my house-building project has resolved itself, and since the concrete elements won't arrive until late February, I'll have a chance to concentrate more on the next release.

Which reminds me: if anybody has ideas/leads about who I should hit up for some VML money, please drop me an email; I'm running about 3000 EUR/month below the usual activity level, and that isn't helping.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Thu Sep 8 08:53:30 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 08:53:30 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk>
Message-ID: <2874.1473324810@critter.freebsd.dk>

--------
In message , Nils Goroll writes:

>This is nothing personal, but the change to hfm with no fallback
>to the previous logic doesn't seem so obviously correct that I
>feel comfortable having a major release with it.

As I just said: 5.0 isn't going to be anybody's favourite release, but it is happening anyway, because there are people out there waiting for the part of it that actually works.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Thu Sep 8 09:01:23 2016
From: slink at schokola.de (Nils Goroll)
Date: Thu, 8 Sep 2016 11:01:23 +0200
Subject: hit-for-pass vs.
hit-for-miss
In-Reply-To: <2874.1473324810@critter.freebsd.dk>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk> <2874.1473324810@critter.freebsd.dk>
Message-ID: 

Let's get this back on track:

Personally I don't care whether or not a solution to the hfp vs. hfm problem is going to be in 5.0. For those people for whom this matters I have raised concerns, so at least the dev community should be aware of it.

As many people know, I love running master code and I care about having good code in master, no matter what magic date we have for a certain git branch command. For the specific project we talked about, we have the option to stick to 4.1.

So I'm fine with postponing this for a bit, but I want to see #1799 and this one solved, and I hope to have made sound suggestions to get us there. As always, if there are better suggestions, I'm all ears.

Nils

From phk at phk.freebsd.dk Thu Sep 8 09:03:13 2016
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 08 Sep 2016 09:03:13 +0000
Subject: hit-for-pass vs. hit-for-miss
In-Reply-To: 
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de> <98489638-edc5-617d-a467-eae7dddd8d9e@schokola.de> <4E80A314-241C-4566-B6BA-61B72DC2CAA6@uplex.de> <59498.1473278417@critter.freebsd.dk> <69098.1473283202@critter.freebsd.dk> <97505.1473323619@critter.freebsd.dk> <2874.1473324810@critter.freebsd.dk>
Message-ID: <5517.1473325393@critter.freebsd.dk>

--------
In message , Nils Goroll writes:

>So I'm fine with postponing this for a bit, but I want to see #1799 and this one
>solved and I'd hope to have made sound suggestions to get us there. As always,
>if there are better suggestions, I'm all ears.

Agreed.
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From dridi at varni.sh Thu Sep 8 11:27:29 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 8 Sep 2016 13:27:29 +0200 Subject: [master] 169e162 Enforce that VCL names must be C-language identifiers ([A-Za-z][A-Za-z0-9_]*) In-Reply-To: References: Message-ID: On Thu, Sep 8, 2016 at 1:21 PM, Poul-Henning Kamp wrote: > > commit 169e162c7506614090a33faef7a0df38c6ffac7a > Author: Poul-Henning Kamp > Date: Thu Sep 8 11:19:39 2016 +0000 > > Enforce that VCL names must be C-language identifiers ([A-Za-z][A-Za-z0-9_]*) I think this commit breaks our packaging and should be solved before the release: https://github.com/varnishcache/pkg-varnish-cache/blob/master/redhat/varnish_reload_vcl#L79 It looks trivial enough. Best, Dridi From dridi at varni.sh Tue Sep 13 21:27:39 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 13 Sep 2016 23:27:39 +0200 Subject: New autoconf macros for VMOD maintainers Message-ID: Hello everyone, I just pushed 4 commits [1] to master for inclusion in Varnish 5.0 with what I hope is proper documentation [2,3]. I immediately back-ported the commits to the 4.1 branch, here is why: New macros are documented as "since 4.1.4", but future hypothetical changes to the macros will not be back-ported to 4.1 to ensure a clean upgrade path. For instance, something that would be in 4.1.5 and 5.0.1 but not in 5.0.0. Not getting it in the next 4.1 would be a loss IMO since it is still supported. Regarding the macros themselves, they only introduce new facilities for out-of-tree VMOD maintainers, and although it's been tested on many platforms, I hope maintainers will let us know if they break in unexpected ways. Expected failures are basically old versions of autotools installations. The "us" in "let us know" can safely be turned into me. 
From now on I volunteer for autotools maintenance. This libvmod-painkiller journey taught me a lot about this stuff, and although I don't quite like it, I can now appreciate it better. Once 5.0 is branched, I'll start working on cleaning master up so that only Varnish developers/contributors should be affected, in which case they can throw their crap at me (subtle joke hidden inside that last sentence).

If anyone has objections, let me know. It's not too late to revert the commits too.

Best,
Dridi

[1] https://github.com/varnishcache/varnish-cache/compare/edd4ebe...cd29eda
[2] https://github.com/varnishcache/varnish-cache/blob/cd29eda/varnish.m4#L160-L237
[3] https://github.com/varnishcache/varnish-cache/blob/cd29eda/varnish.m4#L244-L253

From slink at schokola.de Sat Sep 24 13:33:38 2016
From: slink at schokola.de (Nils Goroll)
Date: Sat, 24 Sep 2016 15:33:38 +0200
Subject: vcl_init, VMODs & workspaces
In-Reply-To: <8F64A004-1B72-4808-A402-887645B6D151@gmail.com>
References: <8F64A004-1B72-4808-A402-887645B6D151@gmail.com>
Message-ID: 

Just to follow up: we've got both workspace and PRIV_TASK for vcl events (including vcl_init) now in 5.0/master.

Nils

From slink at schokola.de Sat Sep 24 14:41:02 2016
From: slink at schokola.de (Nils Goroll)
Date: Sat, 24 Sep 2016 16:41:02 +0200
Subject: blob iterators, body iterators in particular
Message-ID: 

I've been pondering a bit whether we could have a generalized vmod interface to iterate over blob lists, and bodies in particular.

Ideally, I'd like to have a single interface for all of the following pseudo-vcl examples. It's not that I'd personally need all of these now; the hash(req.body) case is really the only one I need to get working ASAP (and the plan is to fix the bodyaccess vmod). But while at it, I couldn't avoid reflecting on how this could be solved for the general case.
So here's a vcl mock:

vcl_init {
    # vmod-re exists
    new re_evil = re.regex("SQL.*INJECTION");
}

vcl_recv {
    cache_req_body(1MB);
    # .match(STRING) exists, .matchb(BODY) doesn't
    if (re_evil.match(req.url) || re_evil.matchb(req.body)) {
        return (synth(400, "you're evil"));
    }
}

vcl_hash {
    if (req.method != "GET" && req.method != "HEAD") {
        # not possible ATM
        hash_blob(req.body);
    }
}

vcl_backend_response {
    # may be a stupid example, could not come up with anything better
    if (beresp.http.Content-Type == "image/png") {
        image.recompress(beresp.body);
    }
    # blobcode/blobdigest exist, but hashing a body is not possible
    set beresp.http.Etag =
        blobcode.encode(BASE64, blobdigest.hashb(MD5, beresp.body));
}

vcl_deliver {
    if (req.http.Cookie ~ "loggedin=true") {
        if (resp.http.Content-Type == "audio/mp3") {
            # another stupid example
            mp3.watermark(resp.body, req.http.UserId);
        }
    }
}

So the VCC declarations for all of the vmod methods/functions could use a common BLOB_LIST type:

# libvmod-re
$Method BOOL .matchb(BLOB_LIST)

# libvmod-image
$Function VOID .recompress(BLOB_LIST)

# libvmod-blobdigest
$Function BLOB hashb(ENUM {MD5, ...}, BLOB_LIST)

# libvmod-mp3
$Function VOID .watermark(BLOB_LIST, STRING)

Only one BLOB_LIST argument would be allowed per Function/Method.

The VCL/VMOD interface should have an init call, an iterator and a fini call. The thing passed when iterating could be the existing vmod_priv:

struct vmod_priv_iter;

typedef struct vmod_priv *vmod_priv_iter_f(const struct vmod_priv_iter *,
    const struct vmod_priv *);

enum vmod_priv_iter_state_e {
    VI_INIT,
    VI_ITER,
    VI_FINI
};

struct vmod_priv_iter {
    void                            *priv;
    enum vmod_priv_iter_state_e     state;
    vmod_priv_iter_f                *func;
};

The C type of BLOB_LIST would be struct vmod_priv_iter *.

The compiled VCL would then:

- alloc the vmod_priv_iter (on the stack?)
- zero it and set state=VI_INIT
- call the vmod function once, ignoring the return value
- the vmod function would alloc/init its priv data and fill in the priv and func members of the struct vmod_priv_iter
- compiled VCL would set VI_ITER and loop through the object, calling the vmod_priv_iter_f
  - a NULL return from the iterator means "have not changed"
  - otherwise the iterator function MAY modify the object (if writable from the context) by referencing or copying the returned vmod_priv, or copying/freeing it, as applicable
- compiled VCL would set state=VI_FINI and call the vmod function a last time, using the return value unless VOID

Regarding the interfaces with varnish core, we need to differentiate the use cases:

* vcl_recv {} / vcl_hash {} req.body access

We've got this as a storage object, so the iterator would wrap the vmod iterator in an objiterate_f -> _should_ be easy I think.

* vcl_backend_response {}

Trouble here is that we do not have the body yet, so in principle I see a couple of options, and I am having a hard time making up my mind which would be best:

- early fetch of the body, wrap the vmod iterator in a vfp (but where in the vfp stack would we put it?)
- early fetch of the body, use objiterate_f when done

Both would disable streaming for anything but a VOID return of the vmod iterator; the vfp option would allow streaming for a VOID return.

* vcl_deliver {}

Here we could use the objiterate_f again, but we would need to create some dummy OC_F_PRIVATE object, filling in the modified bits.

Nils

From slink at schokola.de Fri Sep 30 12:52:51 2016
From: slink at schokola.de (Nils Goroll)
Date: Fri, 30 Sep 2016 14:52:51 +0200
Subject: suggesting bugwash topic: hit-for-pass vs.
hit-for-miss
In-Reply-To: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
References: <9417fd0c-b569-76fe-0103-d4ba67027ce8@schokola.de>
Message-ID: 

Hi,

Geoff and myself would like to get ahead with the design aspects of this issue, so we'd appreciate it if we could discuss it during the next bugwash.

It would be great if anyone interested could review the background info - my initial email, quoted in full below, has the top-level overview. For the option of a vcl_hit based decision, https://github.com/varnishcache/varnish-cache/issues/1799 is also relevant.

Thank you, Nils

On 02/09/16 20:10, Nils Goroll wrote:
> (quick brain dump before I need to rush out)
>
> Geoff discovered this interesting consequence of a recent important change of
> phk's and we just spent an hour discussing it:
>
> before commit 9f272127c6fba76e6758d7ab7ba6527d9aad98b0, a hit-for-pass object
> led to a pass; now it's a miss. IIUC the discussions we had on a trip to
> Amsterdam, phk's main motivation was to eliminate the potentially deadly effect
> unintentionally created hfps had on cache efficiency: No matter what, for the
> lifetime of the hfp, all requests hitting that object became passes.
>
> so, in short:
>
> - previously: an uncacheable response wins and sticks for its ttl
> - now: a cacheable response wins and sticks for its ttl
>
> or even shorter:
>
> - previously: hit-for-pass
> - now: hit-for-miss
>
> From the perspective of a cache, the "now" case seems clearly favorable, but now
> Geoff has discovered that the reverse is true for a case which is important to
> one of our projects:
>
> - varnish is running in "do what the backend says" mode
> - backend devs know when to make responses uncacheable
> - a huge (600MB) backend response is uncacheable, but client-validatable
>
> so this is the case for the previous semantics:
>
> - 1st request creates the hfp
> - 2nd request from client carries INM
> - gets passed with INM
> - 304 from backend goes to client
>
> What we have now is:
>
> - 1st request creates the hfm (hit-for-miss)
> - 2nd request is a miss
> - INM gets removed
> - backend sends 600MB unnecessarily
>
> We've thought about a couple of options which I want to write down before they
> expire from my cache:
>
> * decide in vcl_hit
>
> sub vcl_hit {
>     if (obj.uncacheable) {
>         if (obj.http.Criterium) {
>             return (miss);
>         } else {
>             return (pass);
>         }
>     }
> }
>
> * Do not strip INM/IMS for miss and have a bereq property telling if it was a misspass
>
> - core code keeps INM/IMS
> - builtin.vcl strips them in vcl_miss
> - can check for hitpass in vcl_miss
> - any 304 backend response is forced uncacheable
> - interesting detail: can it still create a hfp object?
>
> BUT: how would we know in vcl_miss whether we see
> *client* inm/ims or varnish-generated inm/ims?
>
> So at this point I only see the YAS option.
>
> Nils
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
>