From fgtham at gmail.com Thu Dec 1 09:47:01 2016 From: fgtham at gmail.com (Florian Tham) Date: Thu, 1 Dec 2016 10:47:01 +0100 Subject: Varnish5 PROXY protocol and health probes Message-ID: Hi, I'm using varnish-5.0.0 with PROXY protocol enabled on the backends thanks to the new .proxy_header attribute. This works great as long as the backends don't have health probes defined. When a backend has both a .proxy_header attribute and a probe defined, varnish fails to properly detect the backend health state and marks the backend as sick. The problem seems to be that varnish does not use the PROXY protocol when sending health probes. If the backend correctly implements the PROXY protocol, it is required to drop the connection, as it cannot identify it as valid PROXY protocol v1 nor v2. Thus, from varnish's pov the backend is unreachable. I think it would be pretty useful to have health probes handle the PROXY protocol as well. What do you think? Are there any plans to enhance varnish accordingly? Best regards, Florian From phk at phk.freebsd.dk Thu Dec 1 09:55:08 2016 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Thu, 01 Dec 2016 09:55:08 +0000 Subject: Varnish5 PROXY protocol and health probes In-Reply-To: References: Message-ID: <22342.1480586108@critter.freebsd.dk> -------- In message , Florian Tham writes: That's a bug, please open a ticket: https://github.com/varnishcache/varnish-cache -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
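[Archive note] A minimal backend definition that exhibits the problem described above might look like the sketch below. The host, port, and probe URL are made-up illustration values; `.proxy_header` is the Varnish 5.0 backend attribute mentioned in the message. Regular fetches get the PROXY preamble, but the health probe is sent as plain HTTP, so a backend that strictly requires PROXY drops the connection and the probe fails.

```
# Hypothetical example only: backend with both .proxy_header and a probe.
# Fetches are prefixed with a PROXY v2 header, but the probe below is
# sent without one, so a strict PROXY-only backend closes the connection
# and Varnish marks the backend sick.
backend web1 {
    .host = "192.0.2.10";        # example address
    .port = "8443";
    .proxy_header = 2;           # send PROXY protocol v2 on fetches
    .probe = {
        .url = "/healthz";       # hypothetical health endpoint
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}
```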
From fgtham at gmail.com Thu Dec 1 10:15:25 2016 From: fgtham at gmail.com (Florian Tham) Date: Thu, 1 Dec 2016 11:15:25 +0100 Subject: Varnish5 PROXY protocol and health probes In-Reply-To: <22342.1480586108@critter.freebsd.dk> References: <22342.1480586108@critter.freebsd.dk> Message-ID: For reference, ticket is here: https://github.com/varnishcache/varnish-cache/issues/2151 Best regards, Florian On Thu, Dec 1, 2016 at 10:55 AM, Poul-Henning Kamp wrote: > -------- > In message > , Florian Tham writes: > > That's a bug, please open a ticket: > > https://github.com/varnishcache/varnish-cache > > > -- > Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 > phk at FreeBSD.ORG | TCP/IP since RFC 956 > FreeBSD committer | BSD since 4.3-tahoe > Never attribute to malice what can adequately be explained by incompetence. From stardothosting at gmail.com Thu Dec 1 22:23:19 2016 From: stardothosting at gmail.com (Star Dot) Date: Thu, 1 Dec 2016 17:23:19 -0500 Subject: Is Varnish 4 able to follow a redirect and cache the destination? Message-ID: Hello! I am dealing with Varnish 4 piping requests to a backend server. The backend server processes the request and redirects to Amazon S3 to serve the actual file. So in short: Request -> Varnish -> Nginx 302 Redirect -> Amazon S3 file Varnish is happily caching the 302 response only, but I'm curious if I can somehow follow the redirect and cache the destination file completely? This will alleviate load off the nginx server obviously. I've seen several topics dance around this issue but I'm curious if someone can point me in the right direction! Thanks -- StackStar Managed Hosting Services : https://www.stackstar.com Shift8 Web Design in Toronto : https://www.shift8web.ca -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guillaume at varnish-software.com Thu Dec 1 22:57:13 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Thu, 1 Dec 2016 23:57:13 +0100 Subject: Is Varnish 4 able to follow a redirect and cache the destination? In-Reply-To: References: Message-ID: It can, you will have to play a bit with the retry/restart mechanics after changing and cleaning bereq.url. On Dec 1, 2016 23:49, "Star Dot" wrote: > Hello! > > I am dealing with Varnish 4 piping requests to a backend server. The > backend server processes the request and redirects to Amazon S3 to serve > the actual file. > > So in short: > > Request -> Varnish -> Nginx 302 Redirect -> Amazon S3 file > > Varnish is happily caching the 302 response only, but I'm curious if I can > somehow follow the redirect and cache the destination file completely? This > will alleviate load off the nginx server obviously. > > I've seen several topics dance around this issue but I'm curious if > someone can point me in the right direction! > > Thanks > -- > > > StackStar Managed Hosting Services : https://www.stackstar.com > Shift8 Web Design in Toronto : https://www.shift8web.ca > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peterloron at gmail.com Fri Dec 2 08:11:37 2016 From: peterloron at gmail.com (Peter Loron) Date: Fri, 2 Dec 2016 00:11:37 -0800 Subject: Grace mode with 404 responses? Message-ID: <57811D62-4862-4953-A5D1-A4B4D12BF631@gmail.com> Hello, folks. I'm testing a Varnish 4.1.1 setup. Specifically I'm working with shielding a client service from an unreliable remote service by proxying and caching responses from the remote system. Grace mode is working fine for situations where the remote server is unresponsive (e.g. 
apache turned off), but does not seem to work when a 404 is returned for a resource. I immediately get the 404 on the client when the TTL (not the TTL + grace) is expired. I'm assuming this is something I'm not doing right in my VCL file, but I can't figure out what. Is there a canonical example of how to handle this properly? Thanks. -Pete From apj at mutt.dk Fri Dec 2 08:51:38 2016 From: apj at mutt.dk (Andreas Plesner) Date: Fri, 2 Dec 2016 09:51:38 +0100 Subject: Is Varnish 4 able to follow a redirect and cache the destination? In-Reply-To: References: Message-ID: <20161202085138.GG22017@nerd.dk> On Thu, Dec 01, 2016 at 05:23:19PM -0500, Star Dot wrote: > Varnish is happily caching the 302 response only, but I'm curious if I can > somehow follow the redirect and cache the destination file completely? This > will alleviate load off the nginx server obviously. Yes. This is what we do: sub vcl_deliver { if ((resp.status == 301) || (resp.status == 302)) { set req.url = regsub(resp.http.Location,"^http://[^/]+(.*)","\1"); return(restart); } } Handling of the host header (and all the possible backends) is left as an exercise for the reader. What this does is that it caches the redirect as well as the destination, thus sort-of normalizing the request, while still caching the ultimate destination only once. -- Andreas From stardothosting at gmail.com Fri Dec 2 13:36:04 2016 From: stardothosting at gmail.com (Star Dot) Date: Fri, 2 Dec 2016 08:36:04 -0500 Subject: Is Varnish 4 able to follow a redirect and cache the destination? In-Reply-To: <20161202085138.GG22017@nerd.dk> References: <20161202085138.GG22017@nerd.dk> Message-ID: On Fri, Dec 2, 2016 at 3:51 AM, Andreas Plesner wrote: > > Yes. 
This is what we do: > > sub vcl_deliver { > if ((resp.status == 301) || (resp.status == 302)) { > set req.url = regsub(resp.http.Location,"^http://[^/]+(.*)","\1"); > return(restart); > } > } > > Handling of the host header (and all the possible backends) is left as an > exercise for the reader. > > What this does is that it caches the redirect as well as the destination, > thus > sort-of normalizing the request, while still caching the ultimate > destination > only once. > > Tentatively, your rules work for me. Is there any way for varnish to cache the 302 redirect if it's on a completely different host? In my example we are hitting media.domain.com which hits an nginx server that redirects it to a completely different URL (S3) that varnish is not caching. After testing your rule, I see that it's definitely skipping a step and providing the destination URL (S3) to the browser. But that destination is not being cached. Is there a way to hide that hostname/url change completely so it's transparent to the end user? -------------- next part -------------- An HTML attachment was scrubbed... URL: From cservin-varnish at cromagnon.com Fri Dec 2 15:25:12 2016 From: cservin-varnish at cromagnon.com (Craig Servin) Date: Fri, 02 Dec 2016 09:25:12 -0600 Subject: n_lru_nuked per stevedore Message-ID: <2c6c71543b0a5413699b06d13f723bc7@www.cromagnon.com> Hi All, Is there a way to get at the number of lru evictions per stevedore? Thanks, Craig From dridi at varni.sh Fri Dec 2 16:09:05 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Fri, 2 Dec 2016 17:09:05 +0100 Subject: n_lru_nuked per stevedore In-Reply-To: <2c6c71543b0a5413699b06d13f723bc7@www.cromagnon.com> References: <2c6c71543b0a5413699b06d13f723bc7@www.cromagnon.com> Message-ID: On Fri, Dec 2, 2016 at 4:25 PM, Craig Servin wrote: > Hi All, > > Is there a way to get at the number of lru evictions per stevedore? Hi, Unfortunately no, although it has been discussed at least once by the dev team. 
There's nothing against having them but it's currently not on the agenda. You can sort-of get this information from varnishlog, but to get a result similar to varnishstat you'd need to write something on top to crunch the data. Cheers From admin at beckspaced.com Fri Dec 2 18:30:01 2016 From: admin at beckspaced.com (Admin Beckspaced) Date: Fri, 2 Dec 2016 19:30:01 +0100 Subject: varnishncsa logs split per domain In-Reply-To: <912a8f40-e02b-d578-585c-e3d9b8fdd0a2@beckspaced.com> References: <976bff90-7c3e-e9c2-3517-6bc30a937de5@beckspaced.com> <4aea1d32-f9b8-ada4-fb9d-532f605ae5da@beckspaced.com> <5aa0e368-0121-8542-70bc-27aeae96e4e9@beckspaced.com> <912a8f40-e02b-d578-585c-e3d9b8fdd0a2@beckspaced.com> Message-ID: <39257342-24c8-5dd8-9158-6b150de8893a@beckspaced.com> On 15.11.2016 17:57, Admin Beckspaced wrote: > > Am 15.11.2016 um 16:51 schrieb Dridi Boukelmoune: >>> a bit more hints and info would be nice ;) >> man vmod_std >> man varnishncsa >> >> That's how much "nice" I'm willing to do :p >> >> Dridi >> >> > ok. first I want to say thanks for being nice and pointing me to the > man pages. > > after a bit of reading I finally found the parts I was looking for: > > import std; > > sub vcl_recv { > > std.log("myhost:" + regsub(req.http.Host, "^www\.", "") ); > > } > > in varnishncsa: > > # varnishncsa -F '%h %l %u %t "%r" %s %b "%{Referer}i" > "%{User-agent}i" %{VCL_Log:myhost}x' > > > not yet tested but I think this is what Dridi was pointing to? > Hello again, sorry, it has been a while but I just thought to finish the thread I started and point to the solution I decided to go with at last. The question in the beginning was: How can I split the varnishncsa logs per domain. My first thinking was to use the query -q option, e.g. varnishncsa -q "ReqHeader ~ '^Host: .*example.com'" But this approach would end up in a lot of varnishncsa instances, as pointed out by Andrei, and also the problem with not being able to normalize the Request Host header. 
Then Dridi pointed me to man vmod_std and using std.log in VCL, which was the final bit needed ;) So here's my current solution: I run a single instance of varnishncsa with the following params: VARNISHLOG_PARAMS="-f /etc/varnish/varnishncsa-log-format-string -a -w /var/log/varnish/varnish.log" the varnishncsa-log-format-string is as follows: %{VCL_Log:myhost}x %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" at the beginning VCL_Log:key The value set by std.log("key:value") in VCL, more on that later My varnish sits in front of Apache with 30 something different domains. Currently I don't use varnish for all domains, but have set up varnish VCL in such a way that I can filter on the domains and decide to cache with varnish or just skip caching and pass to the apache backend. My varnish VCL is based on the varnish boilerplate which I found on the net: http://verticalprogramming.com/2013/09/15/varnish-virtual-host-boilerplate So if I decide to cache a particular domain with varnish I can normalize the Request Host header in VCL sub vcl_recv { if (req.http.host ~ "somedomain\.com") { # tilde ~ uses regex if (req.http.host ~ "^(www.)?somedomain\.com$") { //normalize the req.http.host set req.http.host = regsub(req.http.Host, "^www\.", ""); std.log("myhost:" + req.http.Host ); ... so this std.log(myhost:somedomain.com) gets picked up by varnishncsa and the custom format string, see above. which then produces a nice & steady varnish.log file with a normalized host at the beginning for domains I want to cache and an empty space if I don't want to. then there's the split-logfile from apache: https://httpd.apache.org/docs/2.4/programs/split-logfile.html which was exactly made for a setup with the host names at the very beginning of the log file. The only thing someone needs to take care of is that the logfiles will get created in the directory where the script is run. 
So I created a small bash script wrapper split-logfile.sh which first changes to the right working directory: #!/bin/bash cd /var/log/varnish /usr/bin/split-logfile < varnish.log and on the daily logrotate on the /var/log/varnish/varnish.log I added the following: /var/log/varnish/varnish.log { ... prerotate /var/log/varnish/split-logfile.sh endscript ... } so before the varnish.log gets rotated split-logfile.sh gets called and creates the different log files per normalized host and an access.log for all the requests without a hostname at the beginning. After a few logrotate runs the /var/log/varnish/ could look like this: mydomain.com.log myotherdomain.com.log access.log varnish.log varnish.log-20161130 varnish.log-20161201 varnish.log-20161202 which finally gives me exactly what I wanted! A single instance of varnishncsa producing a varnish.log for all domains, and a per-domain split via split-logfile.sh on each logrotate run, resulting in log files per domain ready to use with webalizer, which also gets called on prerotate of the logfile: /var/log/varnish/mydomain.com.log { ... prerotate /usr/bin/webalizer -qc /etc/webalizer/mydomain.conf endscript ... } perhaps this might help someone here looking for something similar? thanks & greetings becki From guillaume at varnish-software.com Fri Dec 2 19:49:35 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Fri, 2 Dec 2016 20:49:35 +0100 Subject: varnishncsa logs split per domain In-Reply-To: <39257342-24c8-5dd8-9158-6b150de8893a@beckspaced.com> References: <976bff90-7c3e-e9c2-3517-6bc30a937de5@beckspaced.com> <4aea1d32-f9b8-ada4-fb9d-532f605ae5da@beckspaced.com> <5aa0e368-0121-8542-70bc-27aeae96e4e9@beckspaced.com> <912a8f40-e02b-d578-585c-e3d9b8fdd0a2@beckspaced.com> <39257342-24c8-5dd8-9158-6b150de8893a@beckspaced.com> Message-ID: Thanks for reporting back, great job! 
On Dec 2, 2016 19:57, "Admin Beckspaced" wrote: > [...] > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc -------------- next part -------------- An HTML attachment was scrubbed... URL: From admin at beckspaced.com Mon Dec 5 17:36:59 2016 From: admin at beckspaced.com (Admin Beckspaced) Date: Mon, 5 Dec 2016 18:36:59 +0100 Subject: Varnishncsa log file size Message-ID: <509ce6aa-cfa0-3b41-e6af-f758b6ec04f1@beckspaced.com> Hello Varnish-Users, recently I have setup varnish logging via varnishncsa. If I look at the varnish log file size it's currently around 50 MB per day which gets rotate daily via logrotate. How big can the log file size grow without any performance issues? Is 100 MB still a valid log file size? even 500 MB perhaps? The more domains I add the bigger the log file will grow per day. 
Any hints are welcome ;) Thanks & greetings Becki From A.Hongens at netmatch.nl Mon Dec 5 18:27:11 2016 From: A.Hongens at netmatch.nl (=?iso-8859-1?Q?Angelo_H=F6ngens?=) Date: Mon, 5 Dec 2016 18:27:11 +0000 Subject: Varnishncsa log file size In-Reply-To: <509ce6aa-cfa0-3b41-e6af-f758b6ec04f1@beckspaced.com> References: <509ce6aa-cfa0-3b41-e6af-f758b6ec04f1@beckspaced.com> Message-ID: <6A7ABA19243F1E4EADD8BB1563CDDCCB8AD80D54@TIL-EXCH-05.netmatch.local> I sometimes have a few gigabytes of logfile per day. I don't notice any performance issues at all, so don't worry about that. -- With kind regards, Angelo Höngens Systems Administrator ------------------------------------------ NetMatch travel technology solutions Professor Donderstraat 46 5017 HL Tilburg T: +31 (0)13 5811088 F: +31 (0)13 5821239 mailto:A.Hongens at netmatch.nl http://www.netmatch.nl ------------------------------------------ Disclaimer Deze e-mail is vertrouwelijk en uitsluitend bedoeld voor geadresseerde(n) en de organisatie van geadresseerde(n) en mag niet openbaar worden gemaakt aan derde partijen This e-mail is confidential and may not be disclosed to third parties since this e-mail is only intended for the addressee and the organization the addressee represents. -----Original Message----- From: varnish-misc-bounces+a.hongens=netmatch.nl at varnish-cache.org [mailto:varnish-misc-bounces+a.hongens=netmatch.nl at varnish-cache.org] On Behalf Of Admin Beckspaced Sent: Monday, 5 December, 2016 18:37 To: varnish-misc at varnish-cache.org Subject: Varnishncsa log file size Hello Varnish-Users, recently I have setup varnish logging via varnishncsa. If I look at the varnish log file size it's currently around 50 MB per day which gets rotate daily via logrotate. How big can the log file size grow without any performance issues? Is 100 MB still a valid log file size? even 500 MB perhaps? The more domains I add the bigger the log file will grow per day. 
Any hints are welcome ;) Thanks & greetings Becki _______________________________________________ varnish-misc mailing list varnish-misc at varnish-cache.org https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From cservin-varnish at cromagnon.com Mon Dec 5 19:48:03 2016 From: cservin-varnish at cromagnon.com (Craig Servin) Date: Mon, 05 Dec 2016 13:48:03 -0600 Subject: Varnishncsa log file size In-Reply-To: <509ce6aa-cfa0-3b41-e6af-f758b6ec04f1@beckspaced.com> References: <509ce6aa-cfa0-3b41-e6af-f758b6ec04f1@beckspaced.com> Message-ID: Just make sure you don't have logrotate doing a copytruncate on the varnishlog. Let it rename and then send varnishncsa a HUP and you will have no performance issues. Cheers, Craig On 2016-12-05 11:36, Admin Beckspaced wrote: > Hello Varnish-Users, > > recently I have setup varnish logging via varnishncsa. > If I look at the varnish log file size it's currently around 50 MB per > day which gets rotate daily via logrotate. > > How big can the log file size grow without any performance issues? > Is 100 MB still a valid log file size? even 500 MB perhaps? > > The more domains I add the bigger the log file will grow per day. Any > hints are welcome ;) > > Thanks & greetings > Becki > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From t.cramer at beslist.nl Wed Dec 7 08:29:40 2016 From: t.cramer at beslist.nl (Thijs Cramer) Date: Wed, 7 Dec 2016 09:29:40 +0100 Subject: Serving static html when all backends are sick Message-ID: <1481099380.8276.9.camel@beslist.nl> Given the following code: ========================================= sub vcl_synth { ? if (resp.status == 999) { ????set resp.status = 200; ????set resp.http.Content-Type = "text/html; charset=utf-8"; ????synthetic( {"Static HTML"} ); ? ? return(deliver); ? 
} } # Check if backend is healthy, otherwise, redirect if (!std.healthy(req.backend_hint)) { return (synth(999, "All backends down.")); } ========================================= Will the static page be served if ALL backends are down, or just when the specific backend it got back from the backend_hint is down? We want to be able to serve static html as a fallback if all backends are down. Is this the proper way to do this? Thanks in advance! - Thijs From justinl at arena.net Wed Dec 7 21:22:05 2016 From: justinl at arena.net (Justin Lloyd) Date: Wed, 7 Dec 2016 21:22:05 +0000 Subject: Hit ratio dropped significantly after recent upgrades Message-ID: Hello, I use Varnish on four load-balanced web servers running Apache with several virtual hosts providing MediaWiki wikis. Last week I upgraded the web servers to support upgrading MediaWiki from 1.26 to 1.27 due to PHP requirements. This meant upgrading (reinstalling) Ubuntu 12.04 to 16.04, meaning I went from PHP 5.3 to 7, Apache 2.4 + mod_php to Apache 2.4 + PHP-FPM, and Varnish 3 to 4. For the upgrade from Varnish 3 to 4, I adapted my VCL code accordingly, based on Mediawiki Varnish sample code, but for reasons I don't understand, my Varnish hit ratio dropped from around 86% to ~20-37%. This picture shows a graph of the last 10 days' hit ratio calculations for the four web servers. The graph is calculated using metrics from the Collectd Varnish plugin as (conns - bereqs) / conns, i.e.: asPercent( diffSeries( linux.hostname.varnish-default-connections.connections-received, linux.hostname.varnish-default-backend.http_requests-requests ), linux.hostname.varnish-default-connections.connections-received ) I've also tried the Varnish hit and miss metrics but those give similar results. So I'm not sure what changed to give such a poor hit ratio, despite otherwise good performance on large, busy wikis. 
It's possible it could be something in MediaWiki's cookie and/or session handling and I've been investigating that possibility as well, but I figured this would be a good place to ask, too. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From japrice at gmail.com Thu Dec 8 13:35:04 2016 From: japrice at gmail.com (Jason Price) Date: Thu, 8 Dec 2016 08:35:04 -0500 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: I think we're going to need something a little more specific to go on. That is a mile of changes all at once. Finding a single request that should be cached, but isn't and producing the varnish log for that request will probably help illuminate what's going on. On Wed, Dec 7, 2016 at 4:22 PM, Justin Lloyd wrote: > Hello, > > > > I use Varnish on four load-balanced web servers running Apache with > several virtual hosts providing MediaWiki wikis. Last week I upgraded the > web servers to support upgrading MediaWiki from 1.26 to 1.27 due to PHP > requirements. This meant upgrading (reinstalling) Ubuntu 12.04 to 16.04, > meaning I went from PHP 5.3 to 7, Apache 2.4 + mod_php to Apache 2.4 + > PHP-FPM, and Varnish 3 to 4. > > > > For the upgrade from Varnish 3 to 4, I adapted my VCL code > > accordingly, based on Mediawiki Varnish sample code > , but for reasons > I don't understand, my Varnish hit ratio dropped from around 86% to > ~20-37%. This picture shows a graph of the > last 10 days hit ratio calculations for the four web servers. 
The graph is > calculated using metrics from the Collectd Varnish plugin as (conns - > bereqs) / conns, i.e.: > > > > asPercent( > > diffSeries( > > linux.*hostname*.varnish-default-connections.connections-received, > > linux.*hostname*.varnish-default-backend.http_requests-requests > > ), > > linux.*hostname*.varnish-default-connections.connections-received > > ) > > > > I've also tried the Varnish hit and miss metrics but those give similar > results. So I'm not sure what changed to give such a poor hit ratio, > despite otherwise good performance on large, busy wikis. It's possible it > could be something in MediaWiki's cookie and/or session handling and I've > been investigating that possibility as well, but I figured this would be a > good place to ask, too. > > > > Thanks, > > Justin > > > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Thu Dec 8 14:36:35 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 8 Dec 2016 15:36:35 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: > results. So I'm not sure what changed to give such a poor hit ratio, despite > otherwise good performance on large, busy wikis. It's possible it could be Hello Justin, I'm afraid too many things changed at once, with so many moving parts it will be hard to find the cause. I suggest you start with transaction logs, looking essentially at what the backend is responding with and how your VCL is handling it. 
Dridi From dridi at varni.sh Thu Dec 8 14:48:39 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 8 Dec 2016 15:48:39 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: On Thu, Dec 8, 2016 at 2:35 PM, Jason Price wrote: > I think we're going to need something a little more specific to go on. That > is a mile of changes all at once. Yes: varnishlog, coffee, and a lot of patience. > Finding a single request that should be cached, but isn't and producing the > varnish log for that request will probably help illuminate what's going on. There's currently no way to query the transaction log of a specific request: https://github.com/varnishcache/varnish-cache/issues/2154 I'm just saying... Dridi From guillaume at varnish-software.com Thu Dec 8 16:08:40 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Thu, 8 Dec 2016 17:08:40 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: On Thu, Dec 8, 2016 at 3:48 PM, Dridi Boukelmoune wrote: > > There's currently no way to query the transaction log of a specific > request: > https://github.com/varnishcache/varnish-cache/issues/2154 awk is love, awk is life: varnishlog -d -v -g raw | awk '$1 == XXXXXX' still better than nothing. -- Guillaume Quintard -------------- next part -------------- An HTML attachment was scrubbed... URL: From justinl at arena.net Thu Dec 8 20:26:37 2016 From: justinl at arena.net (Justin Lloyd) Date: Thu, 8 Dec 2016 20:26:37 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: I have been doing a lot of digging with varnishtop and varnishlog, and our VCL really didn't change from this upgrade except as needed to migrate from Varnish 3 to 4. 
As I mentioned, our web app is MediaWiki so we don't control its caching requirements and recommendations, so what I'm trying to understand is whether the drop in the hit rate is due to some change(s) in MediaWiki's cookie and/or cache handling (e.g. via Cache-Control and Set-Cookie headers) or if something in Varnish changed that affects how it determines things. For example, a while back I had been using the Varnish hit and miss metrics in Collectd to calculate the ratio but apparently how those values are calculated with respect to purges changed so the hit ratio dropped, causing me to change the ratio calculation to use incoming connections and backend requests instead. That said, based on my varnishlog and varnishtop testing, I have a strong feeling that the biggest part of the problem is thumbnail images. If you look again at my VCL code (https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20), you can see I strip cookies from thumbnails so they should get cached, but I seem to get a lot more misses than hits when watching for thumbnail URL requests through varnishtop. I give 8 GB to Varnish and its process is typically only around 1 to 2 GB, whereas previously it would be at 8 GB with frequent nukes and the occasional spike of expires that would temporarily eliminate nukes while memory filled up again. For what it's worth, I added the thumbnail stripping a couple of years ago due to a performance issue and it helped tremendously, so I don't know why it would become problematic with these latest upgrades. Justin -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Thursday, December 8, 2016 6:49 AM To: Jason Price Cc: Justin Lloyd ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades On Thu, Dec 8, 2016 at 2:35 PM, Jason Price wrote: > I think we're going to need something a little more specific to go on. > That is a mile of changes all at once. 
Yes: varnishlog, coffee, and a lot of patience. > Finding a single request that should be cached, but isn't and > producing the varnish log for that request will probably help illuminate what's going on. There's currently no way to query the transaction log of a specific request: https://github.com/varnishcache/varnish-cache/issues/2154 I'm just saying... Dridi From daghf at varnish-software.com Fri Dec 9 12:46:48 2016 From: daghf at varnish-software.com (Dag Haavi Finstad) Date: Fri, 9 Dec 2016 13:46:48 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: Hi Is this Varnish 4.1 ? We have an unsolved bug open describing something very similar, https://github.com/varnishcache/varnish-cache/issues/1859 On Thu, Dec 8, 2016 at 9:26 PM, Justin Lloyd wrote: > I have been doing a lot of digging with varnishtop and varnishlog, and our VCL really didn?t change from this upgrade except as needed to migrate from Varnish 3 to 4. As I mentioned, our web app is MediaWiki so we don't control its caching requirements and recommendations, so what I'm trying to understand is whether the drop in the hit rate is due to some change(s) in MediaWiki's cookie and/or cache handling (e.g. via Cache-Control and Set-Cookie headers) or if something in Varnish changed that affects how it determines things. For example, a while back I had been using the Varnish hit and miss metrics in Collectd to calculate the ratio but apparently how those values are calculated with respect to purges changed so the hit ratio dropped, causing me to change the ratio calculation to use incoming connections and backend requests instead. > > That said, based on my varnishlog and varnishtop testing, I have a strong feeling that the biggest part of the problem is thumbnail images. 
If you look again at my VCL code (https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20), you can see I strip cookies from thumbnails so they should get cached, but I seem to get a lot more misses than hits when watching for thumbnail URL requests through varnishtop. I give 8 GB to Varnish and its process is typically only around 1 to 2 GB when previous it would be at 8 GB with frequent nukes and the occasional spike of expires that would temporarily eliminate nukes while memory filled up again. For what it's worth, I added the thumbnail stripping a couple of years ago due to a performance issue and it helped tremendously, so I don't know why it would become problematic with these latest upgrades. > > Justin > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Thursday, December 8, 2016 6:49 AM > To: Jason Price > Cc: Justin Lloyd ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > On Thu, Dec 8, 2016 at 2:35 PM, Jason Price wrote: >> I think we're going to need something a little more specific to go on. >> That is a mile of changes all at once. > > Yes: varnishlog, coffee, and a lot of patience. > >> Finding a single request that should be cached, but isn't and >> producing the varnish log for that request will probably help illuminate what's going on. > > There's currently no way to query the transaction log of a specific request: > https://github.com/varnishcache/varnish-cache/issues/2154 > > I'm just saying... > > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc -- Dag Haavi Finstad Software Developer | Varnish Software Mobile: +47 476 64 134 We Make Websites Fly! 
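Guillaume's awk one-liner earlier in the thread (`varnishlog -d -v -g raw | awk '$1 == XXXXXX'`) works because each line of raw-grouped varnishlog output begins with the transaction ID. As a minimal sketch of the same filter, assuming only that the VXID is the first whitespace-separated field (the function name and sample lines below are illustrative, not real log output):

```python
def filter_vxid(lines, vxid):
    """Keep only raw varnishlog lines whose first field matches the given
    transaction ID, mirroring: varnishlog -d -v -g raw | awk '$1 == VXID'."""
    wanted = str(vxid)
    out = []
    for line in lines:
        fields = line.split()
        # Skip blank lines; compare the leading VXID field only.
        if fields and fields[0] == wanted:
            out.append(line)
    return out


if __name__ == "__main__":
    sample = [
        "32770 ReqURL       c /images/thumb/a/ab/Example.jpg",
        "32771 ReqURL       c /index.php",
        "32770 VCL_call     c MISS",
    ]
    for line in filter_vxid(sample, 32770):
        print(line)
```

Like the awk version, this is a blunt instrument compared to a proper transaction query, but it lets you pull all records for one VXID out of a saved log.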
From justinl at arena.net Fri Dec 9 13:43:38 2016 From: justinl at arena.net (Justin Lloyd) Date: Fri, 9 Dec 2016 13:43:38 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: Hello! Yes, it is Varnish 4.1.1-1 from the Ubuntu 16.04 repo. I'll look at the issue you've linked and see if I can match it to our situation. Thanks! Justin -----Original Message----- From: Dag Haavi Finstad [mailto:daghf at varnish-software.com] Sent: Friday, December 9, 2016 4:47 AM To: Justin Lloyd Cc: Dridi Boukelmoune ; Jason Price ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades Hi Is this Varnish 4.1 ? We have an unsolved bug open describing something very similar, https://github.com/varnishcache/varnish-cache/issues/1859 On Thu, Dec 8, 2016 at 9:26 PM, Justin Lloyd wrote: > I have been doing a lot of digging with varnishtop and varnishlog, and our VCL really didn?t change from this upgrade except as needed to migrate from Varnish 3 to 4. As I mentioned, our web app is MediaWiki so we don't control its caching requirements and recommendations, so what I'm trying to understand is whether the drop in the hit rate is due to some change(s) in MediaWiki's cookie and/or cache handling (e.g. via Cache-Control and Set-Cookie headers) or if something in Varnish changed that affects how it determines things. For example, a while back I had been using the Varnish hit and miss metrics in Collectd to calculate the ratio but apparently how those values are calculated with respect to purges changed so the hit ratio dropped, causing me to change the ratio calculation to use incoming connections and backend requests instead. > > That said, based on my varnishlog and varnishtop testing, I have a strong feeling that the biggest part of the problem is thumbnail images. 
If you look again at my VCL code (https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20), you can see I strip cookies from thumbnails so they should get cached, but I seem to get a lot more misses than hits when watching for thumbnail URL requests through varnishtop. I give 8 GB to Varnish and its process is typically only around 1 to 2 GB when previous it would be at 8 GB with frequent nukes and the occasional spike of expires that would temporarily eliminate nukes while memory filled up again. For what it's worth, I added the thumbnail stripping a couple of years ago due to a performance issue and it helped tremendously, so I don't know why it would become problematic with these latest upgrades. > > Justin > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Thursday, December 8, 2016 6:49 AM > To: Jason Price > Cc: Justin Lloyd ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > On Thu, Dec 8, 2016 at 2:35 PM, Jason Price wrote: >> I think we're going to need something a little more specific to go on. >> That is a mile of changes all at once. > > Yes: varnishlog, coffee, and a lot of patience. > >> Finding a single request that should be cached, but isn't and >> producing the varnish log for that request will probably help illuminate what's going on. > > There's currently no way to query the transaction log of a specific request: > https://github.com/varnishcache/varnish-cache/issues/2154 > > I'm just saying... > > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc -- Dag Haavi Finstad Software Developer | Varnish Software Mobile: +47 476 64 134 We Make Websites Fly! 
From justinl at arena.net Fri Dec 9 17:08:27 2016 From: justinl at arena.net (Justin Lloyd) Date: Fri, 9 Dec 2016 17:08:27 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: We may be seeing this issue though I can't confirm since all of my servers are running 4.1.1 now, but I calculate my hit ratio based on the total number of Varnish connections and how many are sent to Apache, per server. Again, for reference from my first email, here's my Graphite dashboard function for generating my hit ratio graph:

asPercent(
  diffSeries(
    linux.hostname.varnish-default-connections.connections-received,
    linux.hostname.varnish-default-backend.http_requests-requests
  ),
  linux.hostname.varnish-default-connections.connections-received
)

I also look at varnishstat to confirm; here's about a 10 minute sample:

NAME                CURRENT    CHANGE  AVERAGE  AVG_10  AVG_100  AVG_1000
MAIN.client_req     7937622      0.00    47.00    5.27    83.94     60.92
MAIN.backend_req    6341302     51.91    37.00   38.97    53.59     43.80
MAIN.cache_hit      2267216      0.00    13.00    1.08    25.91     18.19
MAIN.cache_miss     4906402      0.00    29.00    1.50    43.93     37.26

So I don't think our problem is a cache_hit/miss value calculation issue, since the client and backend request counts are close together, which itself shows how badly we're underperforming. To reiterate on a point in another of my responses in this thread, I think it may be something about MediaWiki thumbnail images not being cached properly despite our current VCL in that regard not having changed; prior to the upgrade we were seeing a very high (86%-ish) hit ratio from the same formula. -----Original Message----- From: varnish-misc-bounces+justinl=arena.net at varnish-cache.org [mailto:varnish-misc-bounces+justinl=arena.net at varnish-cache.org] On Behalf Of Justin Lloyd Sent: Friday, December 9, 2016 5:44 AM To: Dag Haavi Finstad Cc: varnish-misc at varnish-cache.org Subject: RE: Hit ratio dropped significantly after recent upgrades Hello!
Yes, it is Varnish 4.1.1-1 from the Ubuntu 16.04 repo. I'll look at the issue you've linked and see if I can match it to our situation. Thanks! Justin -----Original Message----- From: Dag Haavi Finstad [mailto:daghf at varnish-software.com] Sent: Friday, December 9, 2016 4:47 AM To: Justin Lloyd Cc: Dridi Boukelmoune ; Jason Price ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades Hi Is this Varnish 4.1 ? We have an unsolved bug open describing something very similar, https://github.com/varnishcache/varnish-cache/issues/1859 On Thu, Dec 8, 2016 at 9:26 PM, Justin Lloyd wrote: > I have been doing a lot of digging with varnishtop and varnishlog, and our VCL really didn?t change from this upgrade except as needed to migrate from Varnish 3 to 4. As I mentioned, our web app is MediaWiki so we don't control its caching requirements and recommendations, so what I'm trying to understand is whether the drop in the hit rate is due to some change(s) in MediaWiki's cookie and/or cache handling (e.g. via Cache-Control and Set-Cookie headers) or if something in Varnish changed that affects how it determines things. For example, a while back I had been using the Varnish hit and miss metrics in Collectd to calculate the ratio but apparently how those values are calculated with respect to purges changed so the hit ratio dropped, causing me to change the ratio calculation to use incoming connections and backend requests instead. > > That said, based on my varnishlog and varnishtop testing, I have a strong feeling that the biggest part of the problem is thumbnail images. If you look again at my VCL code (https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20), you can see I strip cookies from thumbnails so they should get cached, but I seem to get a lot more misses than hits when watching for thumbnail URL requests through varnishtop. 
I give 8 GB to Varnish and its process is typically only around 1 to 2 GB when previous it would be at 8 GB with frequent nukes and the occasional spike of expires that would temporarily eliminate nukes while memory filled up again. For what it's worth, I added the thumbnail stripping a couple of years ago due to a performance issue and it helped tremendously, so I don't know why it would become problematic with these latest upgrades. > > Justin > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Thursday, December 8, 2016 6:49 AM > To: Jason Price > Cc: Justin Lloyd ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > On Thu, Dec 8, 2016 at 2:35 PM, Jason Price wrote: >> I think we're going to need something a little more specific to go on. >> That is a mile of changes all at once. > > Yes: varnishlog, coffee, and a lot of patience. > >> Finding a single request that should be cached, but isn't and >> producing the varnish log for that request will probably help illuminate what's going on. > > There's currently no way to query the transaction log of a specific request: > https://github.com/varnishcache/varnish-cache/issues/2154 > > I'm just saying... > > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc -- Dag Haavi Finstad Software Developer | Varnish Software Mobile: +47 476 64 134 We Make Websites Fly! 
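The ratio formula described above (incoming client requests minus backend requests, as a percentage of client requests) can be restated as a tiny helper. This is a sketch for illustration only — the function name is mine, and the numbers plugged in are the MAIN.client_req and MAIN.backend_req values from the varnishstat sample earlier in the thread:

```python
def hit_ratio(client_req, backend_req):
    """Approximate hit ratio: the percentage of client requests that did not
    trigger a backend fetch, matching the Graphite formula
    asPercent(diffSeries(connections, backend_requests), connections)."""
    if client_req == 0:
        return 0.0
    return 100.0 * (client_req - backend_req) / client_req


# Using the counters from the 10-minute varnishstat sample in the thread:
ratio = hit_ratio(7937622, 6341302)
print(round(ratio, 1))  # prints 20.1
```

A ratio around 20% from these counters is consistent with the complaint: it is far below the ~86% the same formula reported before the upgrade.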
_______________________________________________ varnish-misc mailing list varnish-misc at varnish-cache.org https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From dridi at varni.sh Fri Dec 9 18:10:42 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Fri, 9 Dec 2016 19:10:42 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: > To reiterate on a point in another of my responses in this thread, I think it may be something about MediaWiki thumbnail images not being cached properly despite our current VCL in that regard not having changed from how it worked prior to the upgrade during which time we were seeing a very high (86%-ish) hit ratio from the same formula. To reiterate on a point I made on a couple occasions, it's time to give varnishlog a spin. Too much focus on VCL, and not enough on what's happening. Dridi From justinl at arena.net Fri Dec 9 19:19:19 2016 From: justinl at arena.net (Justin Lloyd) Date: Fri, 9 Dec 2016 19:19:19 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: I really am looking at what's happening as well. I have been looking at both varnishlog and varnishtop and I see a lot of thumbnail image requests being sent to the backend when there is still plenty of room for them in the cache, so even though there are a lot of thumbnail images, I shouldn't see so many backend requests for them. As I previously mentioned, I give Varnish 8 GB and it used to stay full (based on RSS usage and looking at nukes vs. expires) but now it hovers around only about 2 GB used. A related statistic is that there used to be 600-700k objects in Varnish (based on our graphs of MAIN.n_object via Collectd's varnish-default-struct.objects-object metric) but now there are only roughly 40-70k objects in Varnish at any given time.
So it's definitely caching a lot fewer things than it was before the upgrade, and most of the requested URLs for requests that have cookies are for a lot of images and thumbnails. Images shouldn't be cached due to size and overall volume but thumbnails should, which is why I strip cookies from the thumbnails. These varnishtop commands break out /images and /images/thumb client requests, showing IMHO too many regular images being cached and nowhere near enough thumbnails:

# varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ "/images/thumb"'

   349.47 VCL_call HASH
   349.47 VCL_call RECV
   349.47 VCL_call DELIVER
   207.22 VCL_call HIT
   116.40 VCL_call MISS
   116.30 VCL_call PASS

# varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"'

  1859.60 VCL_call HASH
  1859.60 VCL_call RECV
  1859.60 VCL_call DELIVER
  1424.83 VCL_call MISS
   422.84 VCL_call HIT
   218.82 VCL_call PASS

I'm still poking around trying to correlate caching of other types of URLs based on whether or not the requests have cookies, if Cache-Control gets returned, etc. but I just wanted to reply with this info. I do appreciate the responses I'm getting! :) -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Friday, December 9, 2016 10:11 AM To: Justin Lloyd Cc: Dag Haavi Finstad ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades > To reiterate on a point in another of my responses in this thread, I think it may be something about MediaWiki thumbnail images not being cached properly despite our current VCL in that regard not having changed from how it worked prior to the upgrade during which time we were seeing a very high (86%-ish) hit ratio from the same formula. To reiterate on a point I made on a couple occasions, it's time to give varnishlog a spin. Too much focus on VCL, and not enough on what's happening.
Dridi From guillaume at varnish-software.com Mon Dec 12 18:51:55 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Mon, 12 Dec 2016 19:51:55 +0100 Subject: Serving static html when all backends are sick In-Reply-To: <1481099380.8276.9.camel@beslist.nl> References: <1481099380.8276.9.camel@beslist.nl> Message-ID: A director is considered healthy if any of its backends is healthy, so if you put a director in req.backend_hint, you're good. -- Guillaume Quintard On Wed, Dec 7, 2016 at 9:29 AM, Thijs Cramer wrote:
> Given the following code:
>
> =========================================
> sub vcl_synth {
>     if (resp.status == 999) {
>         set resp.status = 200;
>         set resp.http.Content-Type = "text/html; charset=utf-8";
>         synthetic( {"Static HTML"} );
>         return(deliver);
>     }
> }
>
> # Check if backend is healthy, otherwise, redirect
> if (!std.healthy(req.backend_hint)) {
>     return (synth(999, "All backends down."));
> }
>
> =========================================
>
> Will the static page be served if ALL backends are down, or just when
> the specific backend it got back from the backend_hint is down?
>
> We want to be able to serve static html as a fallback if all backends
> are down.
>
> Is this the proper way to do this?
>
> Thanks in advance!
>
> - Thijs
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
From justinl at arena.net Tue Dec 13 02:37:06 2016 From: justinl at arena.net (Justin Lloyd) Date: Tue, 13 Dec 2016 02:37:06 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: To follow up on my last email from Friday, at this point the problem boils down to one thing that I've not been able to determine: Why are far fewer things being cached now than before the upgrade? 1.
Cookies don't seem to be the problem. Most appear to be Google Analytics (as opposed to session), which are being unset by vcl_recv. 2. varnishlog/varnishtop shows many thumbnail URLs being missed and virtually none are requested with a no-cache cache-control header. Is it possible to use these tools to determine whether they (or any URLs for that matter) are being cached following a miss-deliver sequence? There are about 1.5m thumbnail files totaling around 30 GB, which prior to the upgrades wasn't an issue, and I don't think it is now since there are only a few expires and purges per minute and no nukes at all. Varnish is only using about 2 GB out of the 8 GB allocated to it, where it used to use all 8 GB and have lots of nukes and far fewer expires, so it's not a memory constraint. Could there be some other resource limitation I'm hitting without knowing it (nothing in any logs I've seen)? Everything else I could think of so far seems fine, e.g. open files, threads, tcp connections. -----Original Message----- From: varnish-misc-bounces+justinl=arena.net at varnish-cache.org [mailto:varnish-misc-bounces+justinl=arena.net at varnish-cache.org] On Behalf Of Justin Lloyd Sent: Friday, December 9, 2016 11:19 AM To: Dridi Boukelmoune Cc: varnish-misc at varnish-cache.org Subject: RE: Hit ratio dropped significantly after recent upgrades I really am looking at what's happening as well. I have been looking at both varnishlog and varnishtop and I see a lot of thumbnail image requests being sent to the backend when there is still plenty of room for them in the cache, so even though there are a lot of thumbnail images, I shouldn't see so many backend requests for them. As I previously mentioned, I give Varnish 8 GB and it used to stay full (based on RSS usage and looking at nukes vs. expires) but now it hovers around only about 2 GB used.
A related statistics is that there used to be 600-700k objects in Varnish (based on our graphs of MAIN.n_object via Collectd's varnish-default-struct.objects-object metric) but now there are only roughly 40-70k objects in Varnish at any given time. So it's definitely caching a lot fewer things than it was before the upgrade, and most of the requested URLs for requests that have cookies are for a lot of images and thumbnails. Images shouldn't be cached due to size and overall volume but thumbnails should, which is why I strip cookies from the thumbnails. These varnishtop commands break out /images and /images/thumb client requests, showing IMHO too many regular images being cached and nowhere near enough thumbnails: # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ "/images/thumb"' 349.47 VCL_call HASH 349.47 VCL_call RECV 349.47 VCL_call DELIVER 207.22 VCL_call HIT 116.40 VCL_call MISS 116.30 VCL_call PASS # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"' 1859.60 VCL_call HASH 1859.60 VCL_call RECV 1859.60 VCL_call DELIVER 1424.83 VCL_call MISS 422.84 VCL_call HIT 218.82 VCL_call PASS I'm still poking around trying to correlate caching of other types of URLs based on whether or not the requests have cookies, if Cache-Control gets returned, etc. but I just wanted to reply with this info. I do appreciate the responses I'm getting! :) -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Friday, December 9, 2016 10:11 AM To: Justin Lloyd Cc: Dag Haavi Finstad ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades > To reiterate on a point in another of my responses in this thread, I think it may be something about MediaWiki thumbnail images not being cached properly despite our current VCL in that regard not having changed from how it worked prior to the upgrade during which time we were seeing a very high (86%-ish) hit ratio from the same formula. 
To reiterate on a point I made on a couple occasions, it's time to give varnishlog a spin. Too much focus on VCL, and not enough on what's happening. Dridi _______________________________________________ varnish-misc mailing list varnish-misc at varnish-cache.org https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From dridi at varni.sh Tue Dec 13 08:16:05 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 13 Dec 2016 09:16:05 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: > Could there be some other resource limitation I'm hitting without knowing it (nothing in any logs I've seen)? Everything else I could think of so far seems fine, e.g. open files, threads, tcp connections. We can't tell without a look at a transaction's log, so please find one thumbnail's req+bereq log with varnishlog and post it on the list. Check that it doesn't contain sensitive information and otherwise sanitize it before posting. Dridi From guillaume at varnish-software.com Tue Dec 13 08:17:13 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Tue, 13 Dec 2016 09:17:13 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: Can you pastebin the req+bereq transactions in varnishlog, related to such a miss? -- Guillaume Quintard On Tue, Dec 13, 2016 at 3:37 AM, Justin Lloyd wrote: > To follow up on my last email from Friday, at this point the problem boils > down to one thing that I've not been able to determine: Why are far fewer > things being cached now than before the upgrade? > > 1. Cookies don't seem to be the problem. Most appear to be Google > Analytics (as opposed to session), which are being unset by vcl_recv. > > 2. varnishlog/varnishtop shows many thumbnail URLs being missed and > virtually none are requested with a no-cache cache-control header.
Is it > possible to use these tools determine if they (or any URLs for that matter) > are being cached following a miss-deliver sequence? There are about 1.5m > thumbnail files totaling around 30 GB, which prior to the upgrades wasn't > an issue, and I don't think it is now since there are only a few expires > and purges per minute and no nukes at all. Varnish is only using about 2 GB > out of the 8 GB allocated to it, where it used to use all 8 GB and have > lots of nukes and far fewer expires, so it's not a memory constraint. > > Could there be some other resource limitation I'm hitting without knowing > it (nothing in any logs I've seen)? Everything else I could think of so far > seems fine, e.g. open files, threads, tcp connections. > > > -----Original Message----- > From: varnish-misc-bounces+justinl=arena.net at varnish-cache.org [mailto: > varnish-misc-bounces+justinl=arena.net at varnish-cache.org] On Behalf Of > Justin Lloyd > Sent: Friday, December 9, 2016 11:19 AM > To: Dridi Boukelmoune > Cc: varnish-misc at varnish-cache.org > Subject: RE: Hit ratio dropped significantly after recent upgrades > > I really am looking at what's happening as well. I have been looking at > both varnishlog and varnishtop and I see a lot of thumbnail image requests > being sent to the backend when there is still plenty of room for them in > the cache, so even though there are a lot of thumbnail images, I shouldn't > see so many backend requests for them. As I previously mentioned, I give > Varnish 8 GB and it used to stay full (based on RSS usage and looking at > nukes vs. expires) but now it hovers around only about 2 GB used. A related > statistics is that there used to be 600-700k objects in Varnish (based on > our graphs of MAIN.n_object via Collectd's varnish-default-struct.objects-object > metric) but now there are only roughly 40-70k objects in Varnish at any > given time. 
So it's definitely caching a lot fewer things than it was > before the upgrade, and most of the requested URLs for requests that have > cookies are for a lot of images and thumbnails. Images shouldn't be cached > due to size and overall volume but thumbnails should, which is why I strip > cookies from the thumbnails. These varnishtop commands break out /images > and /images/thumb client requests, showing IMHO too many regular images > being cached and nowhere near enough thumbnails: > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ > "/images/thumb"' > > 349.47 VCL_call HASH > 349.47 VCL_call RECV > 349.47 VCL_call DELIVER > 207.22 VCL_call HIT > 116.40 VCL_call MISS > 116.30 VCL_call PASS > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"' > > 1859.60 VCL_call HASH > 1859.60 VCL_call RECV > 1859.60 VCL_call DELIVER > 1424.83 VCL_call MISS > 422.84 VCL_call HIT > 218.82 VCL_call PASS > > I'm still poking around trying to correlate caching of other types of URLs > based on whether or not the requests have cookies, if Cache-Control gets > returned, etc. but I just wanted to reply with this info. I do appreciate > the responses I'm getting! :) > > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Friday, December 9, 2016 10:11 AM > To: Justin Lloyd > Cc: Dag Haavi Finstad ; > varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > > To reiterate on a point in another of my responses in this thread, I > think it may be something about MediaWiki thumbnail images not being cached > properly despite our current VCL in that regard not having changed from how > it worked prior to the upgrade during which time we were seeing a very high > (86%-ish) hit ratio from the same formula. > > To reiterate on a point I made on a couple occasions, it's time to give > varnishlog a spin. Too much focus on VCL, and not enough on what's > happening. 
> > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From justinl at arena.net Tue Dec 13 14:23:56 2016 From: justinl at arena.net (Justin Lloyd) Date: Tue, 13 Dec 2016 14:23:56 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: Here's a typical varnishlog miss for a thumbnail image, appropriately sanitized. I can provide more if it helps. https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372 From: Guillaume Quintard [mailto:guillaume at varnish-software.com] Sent: Tuesday, December 13, 2016 12:17 AM To: Justin Lloyd Cc: Dridi Boukelmoune ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades Can you pastebin the req+bereq transactions in varnishlog, related to such a miss? -- Guillaume Quintard On Tue, Dec 13, 2016 at 3:37 AM, Justin Lloyd > wrote: To follow up on my last email from Friday, at this point the problem boils down to one thing that I've not been able to determine: Why are far fewer things being cached now than before the upgrade? 1. Cookies don't seem to be the problem. Most appear to be Google Analytics (as opposed to session), which are being unset by vcl_recv. 2. varnishlog/varnishtop shows many thumbnail URLs being missed and virtually none are requested with a no-cache cache-control header. Is it possible to use these tools to determine whether they (or any URLs for that matter) are being cached following a miss-deliver sequence?
There are about 1.5m thumbnail files totaling around 30 GB, which prior to the upgrades wasn't an issue, and I don't think it is now since there are only a few expires and purges per minute and no nukes at all. Varnish is only using about 2 GB out of the 8 GB allocated to it, where it used to use all 8 GB and have lots of nukes and far fewer expires, so it's not a memory constraint. Could there be some other resource limitation I'm hitting without knowing it (nothing in any logs I've seen)? Everything else I could think of so far seems fine, e.g. open files, threads, tcp connections. -----Original Message----- From: varnish-misc-bounces+justinl=arena.net at varnish-cache.org [mailto:varnish-misc-bounces+justinl=arena.net at varnish-cache.org] On Behalf Of Justin Lloyd Sent: Friday, December 9, 2016 11:19 AM To: Dridi Boukelmoune > Cc: varnish-misc at varnish-cache.org Subject: RE: Hit ratio dropped significantly after recent upgrades I really am looking at what's happening as well. I have been looking at both varnishlog and varnishtop and I see a lot of thumbnail image requests being sent to the backend when there is still plenty of room for them in the cache, so even though there are a lot of thumbnail images, I shouldn't see so many backend requests for them. As I previously mentioned, I give Varnish 8 GB and it used to stay full (based on RSS usage and looking at nukes vs. expires) but now it hovers around only about 2 GB used. A related statistics is that there used to be 600-700k objects in Varnish (based on our graphs of MAIN.n_object via Collectd's varnish-default-struct.objects-object metric) but now there are only roughly 40-70k objects in Varnish at any given time. So it's definitely caching a lot fewer things than it was before the upgrade, and most of the requested URLs for requests that have cookies are for a lot of images and thumbnails. 
Images shouldn't be cached due to size and overall volume but thumbnails should, which is why I strip cookies from the thumbnails. These varnishtop commands break out /images and /images/thumb client requests, showing IMHO too many regular images being cached and nowhere near enough thumbnails: # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ "/images/thumb"' 349.47 VCL_call HASH 349.47 VCL_call RECV 349.47 VCL_call DELIVER 207.22 VCL_call HIT 116.40 VCL_call MISS 116.30 VCL_call PASS # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"' 1859.60 VCL_call HASH 1859.60 VCL_call RECV 1859.60 VCL_call DELIVER 1424.83 VCL_call MISS 422.84 VCL_call HIT 218.82 VCL_call PASS I'm still poking around trying to correlate caching of other types of URLs based on whether or not the requests have cookies, if Cache-Control gets returned, etc. but I just wanted to reply with this info. I do appreciate the responses I'm getting! :) -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Friday, December 9, 2016 10:11 AM To: Justin Lloyd > Cc: Dag Haavi Finstad >; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades > To reiterate on a point in another of my responses in this thread, I think it may be something about MediaWiki thumbnail images not being cached properly despite our current VCL in that regard not having changed from how it worked prior to the upgrade during which time we were seeing a very high (86%-ish) hit ratio from the same formula. To reiterate on a point I made on a couple occasions, it's time to give varnishlog a spin. Too much focus on VCL, and not enough on what's happening. 
Dridi

_______________________________________________
varnish-misc mailing list
varnish-misc at varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

_______________________________________________
varnish-misc mailing list
varnish-misc at varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dridi at varni.sh Tue Dec 13 14:39:52 2016
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Tue, 13 Dec 2016 15:39:52 +0100
Subject: Hit ratio dropped significantly after recent upgrades
In-Reply-To: References: Message-ID: 

On Tue, Dec 13, 2016 at 3:23 PM, Justin Lloyd wrote:
> Here's a typical varnishlog miss for a thumbnail image, appropriately sanitized. I can provide more if it helps
>
> https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372

After a quick look at the log, I was able to find which line [1] of VCL is responsible for this behavior. Your backend doesn't say that the resource is public, so you may have to configure MediaWiki to mark thumbnails as public. This is not something I can help with.

Cheers

[1] https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20#file-wiki-vcl-L144

From justinl at arena.net Tue Dec 13 14:46:41 2016
From: justinl at arena.net (Justin Lloyd)
Date: Tue, 13 Dec 2016 14:46:41 +0000
Subject: Hit ratio dropped significantly after recent upgrades
In-Reply-To: References: Message-ID: 

Hmm, interesting. I've never made the connection to that line, which is part of the recommended Varnish configuration for MediaWiki[1], but it's not been a problem in the past as far as I know. I'll definitely investigate this more deeply. Thank you!
[1] https://www.mediawiki.org/wiki/Manual:Varnish_caching

-----Original Message-----
From: Dridi Boukelmoune [mailto:dridi at varni.sh]
Sent: Tuesday, December 13, 2016 6:40 AM
To: Justin Lloyd
Cc: Guillaume Quintard; varnish-misc at varnish-cache.org
Subject: Re: Hit ratio dropped significantly after recent upgrades

On Tue, Dec 13, 2016 at 3:23 PM, Justin Lloyd wrote:
> Here's a typical varnishlog miss for a thumbnail image, appropriately sanitized. I can provide more if it helps
>
> https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372

After a quick look at the log, I was able to find which line [1] of VCL is responsible for this behavior. Your backend doesn't say that the resource is public, so you may have to configure MediaWiki to mark thumbnails as public. This is not something I can help with.

Cheers

[1] https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20#file-wiki-vcl-L144

From justinl at arena.net Tue Dec 13 15:01:59 2016
From: justinl at arena.net (Justin Lloyd)
Date: Tue, 13 Dec 2016 15:01:59 +0000
Subject: Hit ratio dropped significantly after recent upgrades
References: Message-ID: 

Actually, unless I'm missing something, that line would only matter for requests with Authorization headers, which none of our requests have since we don't use HTTP authentication.

-----Original Message-----
From: Justin Lloyd
Sent: Tuesday, December 13, 2016 6:47 AM
To: 'Dridi Boukelmoune'
Cc: Guillaume Quintard; varnish-misc at varnish-cache.org
Subject: RE: Hit ratio dropped significantly after recent upgrades

Hmm, interesting. I've never made the connection to that line, which is part of the recommended Varnish configuration for MediaWiki[1], but it's not been a problem in the past as far as I know. I'll definitely investigate this more deeply. Thank you!
[1] https://www.mediawiki.org/wiki/Manual:Varnish_caching -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Tuesday, December 13, 2016 6:40 AM To: Justin Lloyd Cc: Guillaume Quintard ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades On Tue, Dec 13, 2016 at 3:23 PM, Justin Lloyd wrote: > Here?s a typical varnishlog miss for a thumbnail image, appropriately > sanitized. I can provide more if it helps > > https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372 After a quick look at the log, I was able to find which line [1] of VCL is responsible for this behavior. Your backend doesn't say that the resource is public, so you may have to configure mediawiki to mark thumbnails as public. This is not something I can help with. Cheers [1] https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20#file-wiki-vcl-L144 From dridi at varni.sh Tue Dec 13 15:10:01 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 13 Dec 2016 16:10:01 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: On Tue, Dec 13, 2016 at 4:01 PM, Justin Lloyd wrote: > Actually, unless I'm missing something, that line would only matter for requests with Authorization headers, which none of our responses have since we don?t use http authentication. I still have trouble getting the hang of reading. And I had seen this one but for some reason missed it... From dridi at varni.sh Tue Dec 13 15:13:02 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 13 Dec 2016 16:13:02 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: > I still have trouble getting the hang of reading. And I had seen this > one but for some reason missed it... Actually, this looks like the response is properly cached, otherwise it'd be using the Transient storage. 
Dridi From justinl at arena.net Tue Dec 13 15:20:02 2016 From: justinl at arena.net (Justin Lloyd) Date: Tue, 13 Dec 2016 15:20:02 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: That's really the crux of my problem, as I tried to summarize in my last big email though it was longer than it probably should have been. Is it possible to determine from varnishlog whether/which requests are being missed and fetched but not cached? There's plenty of room in the cache, so I'm at a loss to determine why more stuff isn't being or staying cached long enough to result in a good hit rate like I saw prior to my recent upgrades. I get the feeling that this will turn out to be either something very simple in the Varnish and/or MediaWiki configuration or else a bug somewhere, though I'd lean towards the former as being more likely. -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Tuesday, December 13, 2016 7:13 AM To: Justin Lloyd Cc: Guillaume Quintard ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades > I still have trouble getting the hang of reading. And I had seen this > one but for some reason missed it... Actually, this looks like the response is properly cached, otherwise it'd be using the Transient storage. Dridi From dridi at varni.sh Tue Dec 13 15:42:01 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 13 Dec 2016 16:42:01 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: Message-ID: On Tue, Dec 13, 2016 at 4:20 PM, Justin Lloyd wrote: > That's really the crux of my problem, as I tried to summarize in my last big email though it was longer than it probably should have been. Is it possible to determine from varnishlog whether/which requests are being missed and fetched but not cached? 
There are more things you can see with varnishlog like hashes for instance, but you could start by logging markers in your VCL to see which branches you are taking. Example:

    if (beresp.ttl <= 0s) {
        std.log("negative ttl");
        set beresp.uncacheable = true;
        return (deliver);
    }

And do the same for all branches, and see which code is executed.

> I get the feeling that this will turn out to be either something very simple in the Varnish and/or MediaWiki configuration or else a bug somewhere, though I'd lean towards the former as being more likely.

If you can reproduce it in lab conditions where you are the only user, look for a transaction containing:

- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call PASS

And then look for the previous transaction on the same resource.

Dridi

From justinl at arena.net Tue Dec 13 16:36:51 2016
From: justinl at arena.net (Justin Lloyd)
Date: Tue, 13 Dec 2016 16:36:51 +0000
Subject: Hit ratio dropped significantly after recent upgrades
In-Reply-To: References: Message-ID: 

Yep! Getting tons of uncacheable responses, including lots of /images/thumb URLs, due to negative TTLs. Not sure yet what the best way to handle this is since I'm sure this is done for a reason, so I'll need to dig back into the MW code and possibly hit up the devs on IRC again. Thanks! I'll let you know how this goes.

Justin

-----Original Message-----
From: Dridi Boukelmoune [mailto:dridi at varni.sh]
Sent: Tuesday, December 13, 2016 7:42 AM
To: Justin Lloyd
Cc: Guillaume Quintard; varnish-misc at varnish-cache.org
Subject: Re: Hit ratio dropped significantly after recent upgrades

On Tue, Dec 13, 2016 at 4:20 PM, Justin Lloyd wrote:
> That's really the crux of my problem, as I tried to summarize in my last big email though it was longer than it probably should have been. Is it possible to determine from varnishlog whether/which requests are being missed and fetched but not cached?
There are more things you can see with varnishlog like hashes for instance, but you could start by logging markers in your VCL to see which branches you are taking. Example:

    if (beresp.ttl <= 0s) {
        std.log("negative ttl");
        set beresp.uncacheable = true;
        return (deliver);
    }

And do the same for all branches, and see which code is executed.

> I get the feeling that this will turn out to be either something very simple in the Varnish and/or MediaWiki configuration or else a bug somewhere, though I'd lean towards the former as being more likely.

If you can reproduce it in lab conditions where you are the only user, look for a transaction containing:

- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call PASS

And then look for the previous transaction on the same resource.

Dridi

From fgtham at gmail.com Tue Dec 13 20:13:18 2016
From: fgtham at gmail.com (Florian Tham)
Date: Tue, 13 Dec 2016 21:13:18 +0100
Subject: Hit ratio dropped significantly after recent upgrades
In-Reply-To: References: Message-ID: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com>

The log shows that the fetched object is introduced into the cache with both TTL and grace time set to 120s each:

-- VCL_call BACKEND_RESPONSE
-- TTL VCL 120 120 0 1481637557
-- VCL_return deliver
-- Storage malloc s0

It would be interesting to see if a subsequent request to the same URL within less than 4 minutes would yield another miss or not.

Regards,
Florian

On 13 December 2016 15:27:16, Justin Lloyd wrote:

> Here's a typical varnishlog miss for a thumbnail image, appropriately sanitized.
I can provide more if it helps > > https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372 > > > From: Guillaume Quintard [mailto:guillaume at varnish-software.com] > Sent: Tuesday, December 13, 2016 12:17 AM > To: Justin Lloyd > Cc: Dridi Boukelmoune ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > Can you pastebin the req+bereq transactions in varnishlog, related to such > a miss? > > -- > Guillaume Quintard > > On Tue, Dec 13, 2016 at 3:37 AM, Justin Lloyd > > wrote: > To follow up on my last email from Friday, at this point the problem boils > down to one thing that I've not been able to determine: Why are far fewer > things being cached now than before the upgrade? > > 1. Cookies don't seem to be the problem. Most appear to be Google Analytics > (as opposed to session), which are being unset by vcl_recv. > > 2. varnishlog/varnishtop shows many thumbnail URLs being missed and > virtually none are requested with a no-cache cache-control header. Is it > possible to use these tools determine if they (or any URLs for that matter) > are being cached following a miss-deliver sequence? There are about 1.5m > thumbnail files totaling around 30 GB, which prior to the upgrades wasn't > an issue, and I don't think it is now since there are only a few expires > and purges per minute and no nukes at all. Varnish is only using about 2 GB > out of the 8 GB allocated to it, where it used to use all 8 GB and have > lots of nukes and far fewer expires, so it's not a memory constraint. > > Could there be some other resource limitation I'm hitting without knowing > it (nothing in any logs I've seen)? Everything else I could think of so far > seems fine, e.g. open files, threads, tcp connections. 
> > > -----Original Message----- > From: > varnish-misc-bounces+justinl=arena.net at varnish-cache.org > [mailto:varnish-misc-bounces+justinl=arena.net at varnish-cache.org] > On Behalf Of Justin Lloyd > Sent: Friday, December 9, 2016 11:19 AM > To: Dridi Boukelmoune > > Cc: varnish-misc at varnish-cache.org > Subject: RE: Hit ratio dropped significantly after recent upgrades > > I really am looking at what's happening as well. I have been looking at > both varnishlog and varnishtop and I see a lot of thumbnail image requests > being sent to the backend when there is still plenty of room for them in > the cache, so even though there are a lot of thumbnail images, I shouldn't > see so many backend requests for them. As I previously mentioned, I give > Varnish 8 GB and it used to stay full (based on RSS usage and looking at > nukes vs. expires) but now it hovers around only about 2 GB used. A related > statistics is that there used to be 600-700k objects in Varnish (based on > our graphs of MAIN.n_object via Collectd's > varnish-default-struct.objects-object metric) but now there are only > roughly 40-70k objects in Varnish at any given time. So it's definitely > caching a lot fewer things than it was before the upgrade, and most of the > requested URLs for requests that have cookies are for a lot of images and > thumbnails. Images shouldn't be cached due to size and overall volume but > thumbnails should, which is why I strip cookies from the thumbnails. 
These > varnishtop commands break out /images and /images/thumb client requests, > showing IMHO too many regular images being cached and nowhere near enough > thumbnails: > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ > "/images/thumb"' > > 349.47 VCL_call HASH > 349.47 VCL_call RECV > 349.47 VCL_call DELIVER > 207.22 VCL_call HIT > 116.40 VCL_call MISS > 116.30 VCL_call PASS > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"' > > 1859.60 VCL_call HASH > 1859.60 VCL_call RECV > 1859.60 VCL_call DELIVER > 1424.83 VCL_call MISS > 422.84 VCL_call HIT > 218.82 VCL_call PASS > > I'm still poking around trying to correlate caching of other types of URLs > based on whether or not the requests have cookies, if Cache-Control gets > returned, etc. but I just wanted to reply with this info. I do appreciate > the responses I'm getting! :) > > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Friday, December 9, 2016 10:11 AM > To: Justin Lloyd > > Cc: Dag Haavi Finstad > >; > varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > >> To reiterate on a point in another of my responses in this thread, I think >> it may be something about MediaWiki thumbnail images not being cached >> properly despite our current VCL in that regard not having changed from how >> it worked prior to the upgrade during which time we were seeing a very high >> (86%-ish) hit ratio from the same formula. > > To reiterate on a point I made on a couple occasions, it's time to give > varnishlog a spin. Too much focus on VCL, and not enough on what's happening. 
> > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > ---------- > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From justinl at arena.net Tue Dec 13 20:19:12 2016 From: justinl at arena.net (Justin Lloyd) Date: Tue, 13 Dec 2016 20:19:12 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com> References: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com> Message-ID: Based on this conversation, I added a 1h TTL to thumbnail images in vcl_backend_response and that has gotten my hit ratio up to about 55-60% depending on how you calculate it (hit/miss values vs. frontend/backend connections), with up to about 72k objects in memory, up from about 60k max before, though before the upgrades it was more like 600-700k objects. It's been an hour now and I'm seeing a spike in expired objects and a drop in the number of objects, so I'll probably increase the TTL until I find a sweet spot. I don't think there's any risk since thumbnails don't change often, so even a max of 48h may be reasonable. So I'll do more testing today and see how things go. Thanks! 
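The thread does not show the actual VCL change Justin made; a minimal sketch of such a TTL override in vcl_backend_response might look like the following, where the /images/thumb URL pattern (taken from the varnishtop filters earlier in the thread) and the cookie stripping are assumptions:

```vcl
sub vcl_backend_response {
    # Hypothetical sketch: give thumbnail responses a 1h TTL instead of
    # the short 120s lifetime seen in the varnishlog TTL record.
    if (bereq.url ~ "^/images/thumb") {
        unset beresp.http.Set-Cookie;  # thumbnails should carry no session state
        set beresp.ttl = 1h;
    }
}
```

Overriding beresp.ttl this way only affects how long Varnish keeps the object; the Cache-Control header the client sees is unchanged unless it is rewritten as well.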
-----Original Message----- From: Florian Tham [mailto:fgtham at gmail.com] Sent: Tuesday, December 13, 2016 12:13 PM To: Justin Lloyd Cc: varnish-misc at varnish-cache.org Subject: RE: Hit ratio dropped significantly after recent upgrades The log shows that the fetched object is introduced into the cache with both TTL and grace time set to 120s each: -- VCL_call BACKEND_RESPONSE -- TTL VCL 120 120 0 1481637557 -- VCL_return deliver -- Storage malloc s0 It would be interesting to see if a subsequent request to the same URL within less than 4 minutes would yield another miss or not. Regards, Florian Am 13. Dezember 2016 15:27:16 schrieb Justin Lloyd : > Here?s a typical varnishlog miss for a thumbnail image, appropriately > sanitized. I can provide more if it helps > > https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372 > > > From: Guillaume Quintard [mailto:guillaume at varnish-software.com] > Sent: Tuesday, December 13, 2016 12:17 AM > To: Justin Lloyd > Cc: Dridi Boukelmoune ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > Can you pastebin the req+bereq transactions in varnishlog, related to > such a miss? > > -- > Guillaume Quintard > > On Tue, Dec 13, 2016 at 3:37 AM, Justin Lloyd > > wrote: > To follow up on my last email from Friday, at this point the problem > boils down to one thing that I've not been able to determine: Why are > far fewer things being cached now than before the upgrade? > > 1. Cookies don't seem to be the problem. Most appear to be Google > Analytics (as opposed to session), which are being unset by vcl_recv. > > 2. varnishlog/varnishtop shows many thumbnail URLs being missed and > virtually none are requested with a no-cache cache-control header. Is > it possible to use these tools determine if they (or any URLs for that > matter) are being cached following a miss-deliver sequence? 
There are > about 1.5m thumbnail files totaling around 30 GB, which prior to the > upgrades wasn't an issue, and I don't think it is now since there are > only a few expires and purges per minute and no nukes at all. Varnish > is only using about 2 GB out of the 8 GB allocated to it, where it > used to use all 8 GB and have lots of nukes and far fewer expires, so it's not a memory constraint. > > Could there be some other resource limitation I'm hitting without > knowing it (nothing in any logs I've seen)? Everything else I could > think of so far seems fine, e.g. open files, threads, tcp connections. > > > -----Original Message----- > From: > varnish-misc-bounces+justinl=arena.net at varnish-cache.org varnish-misc-bounces+net at varnish-cache.org> > [mailto:varnish-misc-bounces+justinl tinl>=arena.net at varnish-cache.org] > On Behalf Of Justin Lloyd > Sent: Friday, December 9, 2016 11:19 AM > To: Dridi Boukelmoune > > Cc: > varnish-misc at varnish-cache.org > Subject: RE: Hit ratio dropped significantly after recent upgrades > > I really am looking at what's happening as well. I have been looking > at both varnishlog and varnishtop and I see a lot of thumbnail image > requests being sent to the backend when there is still plenty of room > for them in the cache, so even though there are a lot of thumbnail > images, I shouldn't see so many backend requests for them. As I > previously mentioned, I give Varnish 8 GB and it used to stay full > (based on RSS usage and looking at nukes vs. expires) but now it > hovers around only about 2 GB used. A related statistics is that there > used to be 600-700k objects in Varnish (based on our graphs of > MAIN.n_object via Collectd's varnish-default-struct.objects-object > metric) but now there are only roughly 40-70k objects in Varnish at > any given time. 
So it's definitely caching a lot fewer things than it > was before the upgrade, and most of the requested URLs for requests > that have cookies are for a lot of images and thumbnails. Images > shouldn't be cached due to size and overall volume but thumbnails > should, which is why I strip cookies from the thumbnails. These > varnishtop commands break out /images and /images/thumb client > requests, showing IMHO too many regular images being cached and > nowhere near enough > thumbnails: > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ > "/images/thumb"' > > 349.47 VCL_call HASH > 349.47 VCL_call RECV > 349.47 VCL_call DELIVER > 207.22 VCL_call HIT > 116.40 VCL_call MISS > 116.30 VCL_call PASS > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"' > > 1859.60 VCL_call HASH > 1859.60 VCL_call RECV > 1859.60 VCL_call DELIVER > 1424.83 VCL_call MISS > 422.84 VCL_call HIT > 218.82 VCL_call PASS > > I'm still poking around trying to correlate caching of other types of > URLs based on whether or not the requests have cookies, if > Cache-Control gets returned, etc. but I just wanted to reply with this > info. I do appreciate the responses I'm getting! :) > > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Friday, December 9, 2016 10:11 AM > To: Justin Lloyd > > Cc: Dag Haavi Finstad > >; > varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > >> To reiterate on a point in another of my responses in this thread, I >> think it may be something about MediaWiki thumbnail images not being >> cached properly despite our current VCL in that regard not having >> changed from how it worked prior to the upgrade during which time we >> were seeing a very high >> (86%-ish) hit ratio from the same formula. > > To reiterate on a point I made on a couple occasions, it's time to > give varnishlog a spin. Too much focus on VCL, and not enough on what's happening. 
> > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > ---------- > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From justinl at arena.net Wed Dec 14 13:58:08 2016 From: justinl at arena.net (Justin Lloyd) Date: Wed, 14 Dec 2016 13:58:08 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com> Message-ID: So after increasing the TTL for thumbnail images to 4 hours, the hit ratio got to 65-70%, objects in memory got up to around 150k before tapering off (due to lots of expirations starting after 4 hours, as to be expected) and slowly dipping back down to around 100k before starting on the upswing again. I'm continuing to test increasing the TTL, setting it to 24h and we'll see if we have any problems reported as a result in case that's too long, but at any rate we definitely appear to have found the smoking gun and I know where to tinker to try to better optimize things. I'm not sure why this changed with the upgrades, whether it was something in MediaWiki or Varnish, but at least I know where to spend cycles on optimizations. Thank you all very much for the help! 
Justin -----Original Message----- From: varnish-misc-bounces+justinl=arena.net at varnish-cache.org [mailto:varnish-misc-bounces+justinl=arena.net at varnish-cache.org] On Behalf Of Justin Lloyd Sent: Tuesday, December 13, 2016 12:19 PM To: Florian Tham Cc: varnish-misc at varnish-cache.org Subject: RE: Hit ratio dropped significantly after recent upgrades Based on this conversation, I added a 1h TTL to thumbnail images in vcl_backend_response and that has gotten my hit ratio up to about 55-60% depending on how you calculate it (hit/miss values vs. frontend/backend connections), with up to about 72k objects in memory, up from about 60k max before, though before the upgrades it was more like 600-700k objects. It's been an hour now and I'm seeing a spike in expired objects and a drop in the number of objects, so I'll probably increase the TTL until I find a sweet spot. I don't think there's any risk since thumbnails don't change often, so even a max of 48h may be reasonable. So I'll do more testing today and see how things go. Thanks! -----Original Message----- From: Florian Tham [mailto:fgtham at gmail.com] Sent: Tuesday, December 13, 2016 12:13 PM To: Justin Lloyd Cc: varnish-misc at varnish-cache.org Subject: RE: Hit ratio dropped significantly after recent upgrades The log shows that the fetched object is introduced into the cache with both TTL and grace time set to 120s each: -- VCL_call BACKEND_RESPONSE -- TTL VCL 120 120 0 1481637557 -- VCL_return deliver -- Storage malloc s0 It would be interesting to see if a subsequent request to the same URL within less than 4 minutes would yield another miss or not. Regards, Florian Am 13. Dezember 2016 15:27:16 schrieb Justin Lloyd : > Here?s a typical varnishlog miss for a thumbnail image, appropriately > sanitized. 
I can provide more if it helps > > https://gist.github.com/Calygos/ca7906da005569046a7031d1fcaa6372 > > > From: Guillaume Quintard [mailto:guillaume at varnish-software.com] > Sent: Tuesday, December 13, 2016 12:17 AM > To: Justin Lloyd > Cc: Dridi Boukelmoune ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > Can you pastebin the req+bereq transactions in varnishlog, related to > such a miss? > > -- > Guillaume Quintard > > On Tue, Dec 13, 2016 at 3:37 AM, Justin Lloyd > > wrote: > To follow up on my last email from Friday, at this point the problem > boils down to one thing that I've not been able to determine: Why are > far fewer things being cached now than before the upgrade? > > 1. Cookies don't seem to be the problem. Most appear to be Google > Analytics (as opposed to session), which are being unset by vcl_recv. > > 2. varnishlog/varnishtop shows many thumbnail URLs being missed and > virtually none are requested with a no-cache cache-control header. Is > it possible to use these tools determine if they (or any URLs for that > matter) are being cached following a miss-deliver sequence? There are > about 1.5m thumbnail files totaling around 30 GB, which prior to the > upgrades wasn't an issue, and I don't think it is now since there are > only a few expires and purges per minute and no nukes at all. Varnish > is only using about 2 GB out of the 8 GB allocated to it, where it > used to use all 8 GB and have lots of nukes and far fewer expires, so it's not a memory constraint. > > Could there be some other resource limitation I'm hitting without > knowing it (nothing in any logs I've seen)? Everything else I could > think of so far seems fine, e.g. open files, threads, tcp connections. 
> > > -----Original Message----- > From: > varnish-misc-bounces+justinl=arena.net at varnish-cache.org varnish-misc-bounces+net at varnish-cache.org> > [mailto:varnish-misc-bounces+justinl tinl>=arena.net at varnish-cache.org] > On Behalf Of Justin Lloyd > Sent: Friday, December 9, 2016 11:19 AM > To: Dridi Boukelmoune > > Cc: > varnish-misc at varnish-cache.org > Subject: RE: Hit ratio dropped significantly after recent upgrades > > I really am looking at what's happening as well. I have been looking > at both varnishlog and varnishtop and I see a lot of thumbnail image > requests being sent to the backend when there is still plenty of room > for them in the cache, so even though there are a lot of thumbnail > images, I shouldn't see so many backend requests for them. As I > previously mentioned, I give Varnish 8 GB and it used to stay full > (based on RSS usage and looking at nukes vs. expires) but now it > hovers around only about 2 GB used. A related statistics is that there > used to be 600-700k objects in Varnish (based on our graphs of > MAIN.n_object via Collectd's varnish-default-struct.objects-object > metric) but now there are only roughly 40-70k objects in Varnish at > any given time. So it's definitely caching a lot fewer things than it > was before the upgrade, and most of the requested URLs for requests > that have cookies are for a lot of images and thumbnails. Images > shouldn't be cached due to size and overall volume but thumbnails > should, which is why I strip cookies from the thumbnails. 
These > varnishtop commands break out /images and /images/thumb client > requests, showing IMHO too many regular images being cached and > nowhere near enough > thumbnails: > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/" and not ReqURL ~ > "/images/thumb"' > > 349.47 VCL_call HASH > 349.47 VCL_call RECV > 349.47 VCL_call DELIVER > 207.22 VCL_call HIT > 116.40 VCL_call MISS > 116.30 VCL_call PASS > > # varnishtop -c -i VCL_call -q 'ReqURL ~ "/images/thumb"' > > 1859.60 VCL_call HASH > 1859.60 VCL_call RECV > 1859.60 VCL_call DELIVER > 1424.83 VCL_call MISS > 422.84 VCL_call HIT > 218.82 VCL_call PASS > > I'm still poking around trying to correlate caching of other types of > URLs based on whether or not the requests have cookies, if > Cache-Control gets returned, etc. but I just wanted to reply with this > info. I do appreciate the responses I'm getting! :) > > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Friday, December 9, 2016 10:11 AM > To: Justin Lloyd > > Cc: Dag Haavi Finstad > >; > varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > >> To reiterate on a point in another of my responses in this thread, I >> think it may be something about MediaWiki thumbnail images not being >> cached properly despite our current VCL in that regard not having >> changed from how it worked prior to the upgrade during which time we >> were seeing a very high >> (86%-ish) hit ratio from the same formula. > > To reiterate on a point I made on a couple occasions, it's time to > give varnishlog a spin. Too much focus on VCL, and not enough on what's happening. 
> > Dridi > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > ---------- > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc _______________________________________________ varnish-misc mailing list varnish-misc at varnish-cache.org https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From dridi at varni.sh Wed Dec 14 15:20:00 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Wed, 14 Dec 2016 16:20:00 +0100 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com> Message-ID: On Wed, Dec 14, 2016 at 2:58 PM, Justin Lloyd wrote: > So after increasing the TTL for thumbnail images to 4 hours, the hit ratio got to 65-70%, objects in memory got up to around 150k before tapering off (due to lots of expirations starting after 4 hours, as to be expected) and slowly dipping back down to around 100k before starting on the upswing again. I'm continuing to test increasing the TTL, setting it to 24h and we'll see if we have any problems reported as a result in case that's too long, but at any rate we definitely appear to have found the smoking gun and I know where to tinker to try to better optimize things. > > I'm not sure why this changed with the upgrades, whether it was something in MediaWiki or Varnish, but at least I know where to spend cycles on optimizations. > > Thank you all very much for the help! Hi, Thanks for the feedback and glad to see things back to normal. 
Consider sharing your findings with the MediaWiki folks as they will likely know better how to deal with the thumbnails.

If images/thumbnails aren't changing often, you can safely increase the TTL (ideally directly from MediaWiki) as long as you have an invalidation strategy in place.

Cheers

From ronald.ploeger at gmx.de Thu Dec 15 08:10:01 2016
From: ronald.ploeger at gmx.de (Plöger, Ronald)
Date: Thu, 15 Dec 2016 09:10:01 +0100
Subject: How to compare 'bereq.backend' and 'req.backend_hint' against a director/backend in Varnish 4.1
Message-ID:

We are upgrading from Varnish 3 to Varnish 4.1 and we have experienced a problem when comparing 'bereq.backend' or 'req.backend_hint' to a director or a backend. We created varnishtest cases to demonstrate this behavior.

In Varnish 3 this test runs successfully:

------------------------------------------------------------------------------------------------------
varnishtest "Test backends and directors"

server s1 {
    rxreq
    txresp -status 200
} -start

varnish v1 -vcl {
    backend b_1 {
        .host = "${s1_addr}";
        .port = "${s1_port}";
    }

    director d1 round-robin {
        { .backend = b_1; }
    }

    sub vcl_recv {
        set req.backend = d1;
    }

    sub vcl_fetch {
        if ( req.backend == d1 ) {
            set beresp.http.X-Backend-Fetch = "d1";
        }
    }

    sub vcl_deliver {
        if ( req.backend == d1 ) {
            set resp.http.X-Backend-Deliver = "d1";
        }
    }
} -start

client c1 {
    txreq
    rxresp
    expect resp.status == 200
    expect resp.http.X-Backend-Fetch == "d1"
    expect resp.http.X-Backend-Deliver == "d1"
} -run
------------------------------------------------------------------------------------------------------

In Varnish 4 the following adapted test:

------------------------------------------------------------------------------------------------------
varnishtest "Test backends and directors"

server s1 {
    rxreq
    txresp -status 200
} -start

varnish v1 -vcl {
    import std;
    import directors;

    backend b1 {
        .host = "${s1_addr}";
        .port = "${s1_port}";
    }

    sub vcl_init {
        new d1 = directors.round_robin();
        d1.add_backend(b1);
    }

    sub vcl_recv {
        set req.backend_hint = d1.backend();
    }

    sub vcl_backend_response {
        if ( bereq.backend == d1 ) {
            set beresp.http.X-Backend-Response = "d1";
        }
    }

    sub vcl_deliver {
        if ( req.backend_hint == d1 ) {
            set resp.http.X-Backend-Deliver = "d1";
        }
    }
} -start

client c1 {
    txreq
    rxresp
    expect resp.status == 200
    expect resp.http.X-Backend-Response == "d1"
    expect resp.http.X-Backend-Deliver == "d1"
} -run
------------------------------------------------------------------------------------------------------

fails with this message:

**** v1 0.4 CLI RX| Message from VCC-compiler:\n
**** v1 0.4 CLI RX| Backend not found: 'd1'\n
**** v1 0.4 CLI RX| ('' Line 21 Pos 27)\n
**** v1 0.4 CLI RX| if ( bereq.backend == d1 ) {\n
**** v1 0.4 CLI RX| --------------------------##----\n

and when we comment out the 'vcl_backend_response' sub it fails with this message:

**** v1 0.5 CLI RX| Message from VCC-compiler:\n
**** v1 0.5 CLI RX| Backend not found: 'd1'\n
**** v1 0.5 CLI RX| ('' Line 28 Pos 30)\n
**** v1 0.5 CLI RX| if ( req.backend_hint == d1 ) {\n
**** v1 0.5 CLI RX| -----------------------------##----\n

When we change this test to compare against the backend directly and not against the director, the test compiles:

------------------------------------------------------------------------------------------------------
varnishtest "Test backends and directors"

server s1 {
    rxreq
    txresp -status 200
} -start

varnish v1 -vcl {
    import std;
    import directors;

    backend b1 {
        .host = "${s1_addr}";
        .port = "${s1_port}";
    }

    sub vcl_init {
        new d1 = directors.round_robin();
        d1.add_backend(b1);
    }

    sub vcl_recv {
        set req.backend_hint = d1.backend();
    }

    sub vcl_backend_response {
        std.log("xxxxxxxx-vcl_backend_response: " + bereq.backend);
        if ( bereq.backend == b1 ) {
            set beresp.http.X-Backend-Response = "d1";
        }
    }

    sub vcl_deliver {
        std.log("xxxxxxxx-vcl_deliver: " + req.backend_hint);
        if ( req.backend_hint == b1 ) {
            set resp.http.X-Backend-Deliver = "d1";
        }
    }
} -start

client c1 {
    txreq
    rxresp
    expect resp.status == 200
    expect resp.http.X-Backend-Response == "d1"
    expect resp.http.X-Backend-Deliver == "d1"
} -run
------------------------------------------------------------------------------------------------------

The test also fails because the expectations of the client are not met:

**   c1 1.0 === expect resp.http.X-Backend-Response == "d1"
---- c1 1.0 EXPECT resp.http.X-Backend-Response () == "d1" failed

and we can see in the log that 'bereq.backend' and 'req.backend_hint' are set to 'd1':

...
**** v1 1.0 vsl| 1002 VCL_call b BACKEND_RESPONSE
**** v1 1.0 vsl| 1002 VCL_Log b xxxxxxxx-vcl_backend_response: d1
**** v1 1.0 vsl| 1002 VCL_return b deliver
...
**** v1 1.0 vsl| 1001 VCL_call c DELIVER
**** v1 1.0 vsl| 1001 VCL_Log c xxxxxxxx-vcl_deliver: d1
**** v1 1.0 vsl| 1001 VCL_return c deliver
...

The question is how to compare 'bereq.backend' and 'req.backend_hint' properly against a director/backend. Are we missing something or is this a bug?

Thanks,
Ronald

From a.rosenhagen at ffuenf.de Thu Dec 15 09:07:52 2016
From: a.rosenhagen at ffuenf.de (Achim Rosenhagen)
Date: Thu, 15 Dec 2016 10:07:52 +0100
Subject: package sources for Ubuntu 16.04 LTS (xenial)
Message-ID: <58525D68.8080903@ffuenf.de>

Hi,

atm I don't see package sources for Ubuntu 16.04 LTS (xenial) at
https://repo.varnish-cache.org/ubuntu/dists/
Any ETA when those will be added?

Cheers,
Achim

From dridi at varni.sh Thu Dec 15 09:26:12 2016
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Thu, 15 Dec 2016 10:26:12 +0100
Subject: How to compare 'bereq.backend' and 'req.backend_hint' against a director/backend in Varnish 4.1
In-Reply-To:
References:
Message-ID:

On Thu, Dec 15, 2016 at 9:10 AM, Plöger, Ronald wrote:
> We are upgrading from Varnish 3 to Varnish 4.1 and we have experienced a
> problem when comparing 'bereq.backend' or 'req.backend_hint' to a director
> or a backend.

Hi Ronald,

Nice to see some varnishtest usage!
The main problem with your code is that you are confusing directors with VMOD objects. The whole thing is confusing, and we really need to come up with good docs, because I can't keep explaining it. We have bad docs [1] on the topic if you are interested in the details, but since Varnish 4.0 the things that were confusing in the code base have become visible to end-users...

> fails with this message:
>
> **** v1 0.4 CLI RX| Message from VCC-compiler:\n
> **** v1 0.4 CLI RX| Backend not found: 'd1'\n
> **** v1 0.4 CLI RX| ('' Line 21 Pos 27)\n
> **** v1 0.4 CLI RX| if ( bereq.backend == d1 ) {\n
> **** v1 0.4 CLI RX| --------------------------##----\n

The type system:

- b1 is a backend
- d1 is an object, not a backend
- d1.backend() is a director (and also a backend)
- req.backend_hint takes a director (or a backend)
- bereq.backend takes a backend (or a director)

Why did I distinguish between the last two? Because of how backend resolution [1] works by default. What you can do instead is this:

    if ( bereq.backend == d1.backend() ) {
        ...
    }

It will work depending on the semantics of the VMOD object. For objects defined in VMOD directors, it should work with all of them except the hash director. It appears to fail in Varnish 4.1.3 on my system. Please check with the latest 4.1 and open a bug if that's still the case. It works on the master branch; I haven't tried the 5.0 release.

> The question is how to compare 'bereq.backend' and 'req.backend_hint'
> properly against a director/backend.
> Are we missing something or is this a bug?

See above, probably a bit of both! Also please note that you can avoid the comparisons altogether if it's only about keeping track of backend selection. You should have seen in your logs that req.backend_hint would give you the director "d1" and bereq.backend would give you the selected backend "b1".
This test case passes on both 4.1.3 and master:

varnishtest "Test backends and directors"

server s1 {
    rxreq
    txresp
} -start

server s2 {
    rxreq
    txresp
} -start

varnish v1 -vcl+backend {
    import directors;

    sub vcl_init {
        new d1 = directors.round_robin();
        d1.add_backend(s1);
        d1.add_backend(s2);
    }

    sub vcl_recv {
        set req.backend_hint = d1.backend();
    }

    sub vcl_backend_response {
        set beresp.http.X-Backend-Response = beresp.backend;
    }

    sub vcl_deliver {
        set resp.http.X-Backend-Deliver = req.backend_hint;
    }
} -start

client c1 {
    txreq -url "/foo"
    rxresp
    expect resp.status == 200
    expect resp.http.X-Backend-Response == "s1"
    expect resp.http.X-Backend-Deliver == "d1"

    txreq -url "/bar"
    rxresp
    expect resp.status == 200
    expect resp.http.X-Backend-Response == "s2"
    expect resp.http.X-Backend-Deliver == "d1"
} -run

Cheers,
Dridi
Head of the Backends&Directors department

[1] https://www.varnish-cache.org/docs/4.1/reference/directors.html

From dridi at varni.sh Thu Dec 15 12:00:12 2016
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Thu, 15 Dec 2016 13:00:12 +0100
Subject: package sources for Ubuntu 16.04 LTS (xenial)
In-Reply-To: <58525D68.8080903@ffuenf.de>
References: <58525D68.8080903@ffuenf.de>
Message-ID:

On Thu, Dec 15, 2016 at 10:07 AM, Achim Rosenhagen wrote:
> Hi,
>
> atm I don't see package sources for Ubuntu 16.04 LTS (xenial) at
> https://repo.varnish-cache.org/ubuntu/dists/
> Any ETA when those will be added?

Hello,

No ETA that I know of, we may start shipping them with Varnish 5.1 but I can't confirm.

Dridi

From ronald.ploeger at gmx.de Fri Dec 16 07:37:52 2016
From: ronald.ploeger at gmx.de (Plöger, Ronald)
Date: Fri, 16 Dec 2016 08:37:52 +0100
Subject: How to compare 'bereq.backend' and 'req.backend_hint' against a director/backend in Varnish 4.1
Message-ID: <18d88903-cf69-79dd-8650-43d110d8589b@gmx.de>

Hi Dridi,

thank you for your quick answer.

> Nice to see some varnishtest usage!
We do love varnishtest :-) > What you can do instead is this: > > if ( bereq.backend == d1.backend() ) { > ... > } > > It will work depending on the semantics of the VMOD object. For > objects defined in VMOD directors, it should work with all of them > except the hash director. It appears to fail in Varnish 4.1.3 on my > system. Please check with the latest 4.1 and open a bug if that's > still the case. It works on the master branch, I haven't tried the 5.0 > release. We are testing against "varnishd (varnish-4.1.4 revision 4529ff7)" from "https://repo.varnish-cache.org/ubuntu/" and I adapted the test to use "d1.backend()" ------------------------------------------------------------------------------------------------------ varnishtest "Test backends and directors" server s1 { rxreq txresp -status 200 } -start varnish v1 -vcl { import std; import directors; backend b1 { .host = "${s1_addr}"; .port = "${s1_port}"; } sub vcl_init { new d1 = directors.round_robin(); d1.add_backend(b1); } sub vcl_recv { set req.backend_hint = d1.backend(); } sub vcl_backend_response { std.log("xxxxxxxx-vcl_backend_response-bereq: " + bereq.backend); std.log("xxxxxxxx-vcl_backend_response-beresp: " + beresp.backend); if ( bereq.backend == d1.backend() ) { set beresp.http.X-Backend-Response = "d1"; } } sub vcl_deliver { std.log("xxxxxxxx-vcl_deliver: " + req.backend_hint); if ( req.backend_hint == d1.backend() ) { set resp.http.X-Backend-Deliver = "d1"; } } } -start client c1 { txreq rxresp expect resp.status == 200 expect resp.http.X-Backend-Response == "d1" expect resp.http.X-Backend-Deliver == "d1" } -run ------------------------------------------------------------------------------------------------------ But this also does not compile: **** v1 0.5 CLI RX| Message from VCC-compiler:\n **** v1 0.5 CLI RX| Backend not found: 'd1.backend'\n **** v1 0.5 CLI RX| ('' Line 22 Pos 27)\n **** v1 0.5 CLI RX| if ( bereq.backend == d1.backend() ) {\n **** v1 0.5 CLI RX| 
--------------------------##########------\n

So I am going to open a bug/issue here "https://github.com/varnishcache/varnish-cache", right?

Thanks again and best regards,
Ronald

From dridi at varni.sh Fri Dec 16 09:56:39 2016
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Fri, 16 Dec 2016 10:56:39 +0100
Subject: How to compare 'bereq.backend' and 'req.backend_hint' against a director/backend in Varnish 4.1
In-Reply-To: <18d88903-cf69-79dd-8650-43d110d8589b@gmx.de>
References: <18d88903-cf69-79dd-8650-43d110d8589b@gmx.de>
Message-ID:

> So I am going to open a bug/issue here
> "https://github.com/varnishcache/varnish-cache", right?

Yes please, and mention that it does work on the master branch. You can quote me as @dridi on github.

I'd like to stress one more time that comparing bereq.backend to d1.backend() is likely to always fail because in the backend context you will see the resolved backend.

Cheers

From ronald.ploeger at gmx.de Fri Dec 16 13:22:17 2016
From: ronald.ploeger at gmx.de (Plöger, Ronald)
Date: Fri, 16 Dec 2016 14:22:17 +0100
Subject: How to compare 'bereq.backend' and 'req.backend_hint' against a director/backend in Varnish 4.1
Message-ID:

Hi Dridi,

thanks for your comment.

> I'd like to stress one more time that comparing bereq.backend to
> d1.backend() is likely to always fail because in the backend context
> you will see the resolved backend.

From the log output for:

    std.log("xxxxxxxx-vcl_backend_response-bereq: " + bereq.backend);
    std.log("xxxxxxxx-vcl_backend_response-beresp: " + beresp.backend);
    ...
    std.log("xxxxxxxx-vcl_deliver: " + req.backend_hint);

**** v1 1.0 vsl| 1002 VCL_Log b xxxxxxxx-vcl_backend_response-bereq: d1
**** v1 1.0 vsl| 1002 VCL_Log b xxxxxxxx-vcl_backend_response-beresp: b1
...
**** v1 1.0 vsl| 1001 VCL_Log c xxxxxxxx-vcl_deliver: d1 I got the impression that both "req.backend_hint" and "bereq.backend" hold the director (because they both output "d1") and that only "beresp.backend" holds the backend (because it outputs "b1"). Therefore I assumed that a comparison against "d1.backend()" would work for "req.backend_hint" and "bereq.backend" but not for "beresp.backend". Best, Ronald -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Fri Dec 16 14:16:14 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Fri, 16 Dec 2016 15:16:14 +0100 Subject: How to compare 'bereq.backend' and 'req.backend_hint' against a director/backend in Varnish 4.1 In-Reply-To: References: Message-ID: > I got the impression that both "req.backend_hint" and "bereq.backend" hold > the director (because they both output "d1") and that only "beresp.backend" > holds the backend (because it outputs "b1"). You're right, I meant beresp.backend and confused it with bereq.backend in my previous response. I somehow mixed up your usage of bereq.backend and my own test case using beresp.backend... > Therefore I assumed that a comparison against "d1.backend()" would work for > "req.backend_hint" and "bereq.backend" but not for "beresp.backend". That is correct! Cheers PS. I had to proof-read this one twice to avoid further confusion From justinl at arena.net Fri Dec 16 15:00:12 2016 From: justinl at arena.net (Justin Lloyd) Date: Fri, 16 Dec 2016 15:00:12 +0000 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com> Message-ID: I actually have the TTL set to 48h now and it seems to work well, but I'm not sure if there's anything specific I should do regarding invalidation. In general, wikis don't allow for deleting or even changing images; you just upload new ones and the wiki software keeps previous versions in the revision history. 
However, I am concerned about invalidating cached images when new versions are uploaded so that the previous version is purged from the cache, but I'm not sure about the most appropriate way to handle that from a Varnish or MediaWiki perspective. If you have any thoughts on that, I'd appreciate it. -----Original Message----- From: Dridi Boukelmoune [mailto:dridi at varni.sh] Sent: Wednesday, December 14, 2016 7:20 AM To: Justin Lloyd Cc: Florian Tham ; varnish-misc at varnish-cache.org Subject: Re: Hit ratio dropped significantly after recent upgrades On Wed, Dec 14, 2016 at 2:58 PM, Justin Lloyd wrote: > So after increasing the TTL for thumbnail images to 4 hours, the hit ratio got to 65-70%, objects in memory got up to around 150k before tapering off (due to lots of expirations starting after 4 hours, as to be expected) and slowly dipping back down to around 100k before starting on the upswing again. I'm continuing to test increasing the TTL, setting it to 24h and we'll see if we have any problems reported as a result in case that's too long, but at any rate we definitely appear to have found the smoking gun and I know where to tinker to try to better optimize things. > > I'm not sure why this changed with the upgrades, whether it was something in MediaWiki or Varnish, but at least I know where to spend cycles on optimizations. > > Thank you all very much for the help! Hi, Thanks for the feedback and glad to see things back to normal. Consider sharing your findings with the MediaWiki folks as they will likely know better how to deal with the thumbnails. If images/thumbnails aren't changing often, you can safely increase the TTL (ideally directly from MediaWiki) as long as you have an invalidation strategy in place. 
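One possible invalidation strategy is explicit purging. As a minimal sketch of the Varnish side (the ACL address is a placeholder, and the wiki would still have to be configured to send PURGE requests for updated image URLs):

```vcl
acl purgers {
    "127.0.0.1";    # placeholder: the address of the wiki server
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ purgers) {
            return (synth(405, "Purging not allowed"));
        }
        # Drop the cached object for this URL, e.g. an image that
        # was just replaced by a new upload.
        return (purge);
    }
}
```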
Cheers From lagged at gmail.com Fri Dec 16 19:26:52 2016 From: lagged at gmail.com (Andrei) Date: Fri, 16 Dec 2016 13:26:52 -0600 Subject: Hit ratio dropped significantly after recent upgrades In-Reply-To: References: <158f9d15f30.275e.6038e5157c3d20148272845607c0f0ea@gmail.com> Message-ID: The best thing to do (imo) is to trigger the purges directly in media wiki. I only did a quick search but they seem to already integrate easily with Varnish, to send cache purge requests on updates/changes https://m.mediawiki.org/wiki/Manual:Varnish_caching hope it helps! On Dec 16, 2016 17:42, "Justin Lloyd" wrote: > I actually have the TTL set to 48h now and it seems to work well, but I'm > not sure if there's anything specific I should do regarding invalidation. > In general, wikis don't allow for deleting or even changing images; you > just upload new ones and the wiki software keeps previous versions in the > revision history. However, I am concerned about invalidating cached images > when new versions are uploaded so that the previous version is purged from > the cache, but I'm not sure about the most appropriate way to handle that > from a Varnish or MediaWiki perspective. If you have any thoughts on that, > I'd appreciate it. > > -----Original Message----- > From: Dridi Boukelmoune [mailto:dridi at varni.sh] > Sent: Wednesday, December 14, 2016 7:20 AM > To: Justin Lloyd > Cc: Florian Tham ; varnish-misc at varnish-cache.org > Subject: Re: Hit ratio dropped significantly after recent upgrades > > On Wed, Dec 14, 2016 at 2:58 PM, Justin Lloyd wrote: > > So after increasing the TTL for thumbnail images to 4 hours, the hit > ratio got to 65-70%, objects in memory got up to around 150k before > tapering off (due to lots of expirations starting after 4 hours, as to be > expected) and slowly dipping back down to around 100k before starting on > the upswing again. 
I'm continuing to test increasing the TTL, setting it to
> 24h and we'll see if we have any problems reported as a result in case
> that's too long, but at any rate we definitely appear to have found the
> smoking gun and I know where to tinker to try to better optimize things.
>
> I'm not sure why this changed with the upgrades, whether it was
> something in MediaWiki or Varnish, but at least I know where to spend
> cycles on optimizations.
>
> Thank you all very much for the help!
>
> Hi,
>
> Thanks for the feedback and glad to see things back to normal.
> Consider sharing your findings with the MediaWiki folks as they will
> likely know better how to deal with the thumbnails.
>
> If images/thumbnails aren't changing often, you can safely increase the
> TTL (ideally directly from MediaWiki) as long as you have an invalidation
> strategy in place.
>
> Cheers
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ronald.ploeger at gmx.de Sat Dec 17 09:44:01 2016
From: ronald.ploeger at gmx.de (Ronald)
Date: Sat, 17 Dec 2016 10:44:01 +0100
Subject: Status of zipnish
Message-ID: <8410f80d-73da-ceda-4990-285a85c2155d@gmx.de>

Hi,

about a year ago Varnish Software announced "zipnish":
https://www.varnish-software.com/varnish-software-launches-zipnish/

The last commit on the project https://github.com/varnish/zipnish/commits/master was more than half a year ago.

Is "zipnish" already dead? Can anyone comment on the status?

Thanks,
Ronald

From bubonic.pestilence at gmail.com Sun Dec 18 00:05:48 2016
From: bubonic.pestilence at gmail.com (Anton Berezhkov)
Date: Sun, 18 Dec 2016 03:05:48 +0300
Subject: Deliver on HIT, otherwise redirect using "503; Location: ..."
Message-ID:

Hello.
Switched to Varnish from Nginx for additional functionality and better control over handling requests, but I still can't implement what I want, which is a simple behaviour: "redirect on MISS/PASS".

I want to use Varnish Cache to deploy quick "CDN" servers for our mp4 video servers (used for HTML5 players) without needing to store all files on these quick machines (SSD, up to 2x480GB of space; the full database is about 6TB). Currently we have 6 servers with SATA HDDs and they are hitting iowait like trucks :)

Examples:
- Request -> Varnish -> HIT: serve it using Varnish.
- Request -> Varnish -> MISS: start caching data from the backend, and instantly reply to the client: `Location: http://backend/$req.url`
- Request -> Varnish -> UPDATE: see the `-> MISS` behaviour.

From my perspective, I should do this "detach & reply redirect" somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if I understood https://www.varnish-cache.org/docs/4.1/reference/states.html correctly, I need vcl_backend_response to keep running in the background (as an additional thread) while doing return(synth(...)) to redirect the user.

A similar thing is "hitting stale content while the object is updating"; in my case it is "replying with a redirect while the object is updating".

Also, I hope to implement this without writing additional scripts. Why? I could do an external php/ruby checker/cache-pusher with nginx etc., but I'm scared of the performance downgrade :(

From mark.staudinger at nyi.net Sun Dec 18 00:58:29 2016
From: mark.staudinger at nyi.net (Mark Staudinger)
Date: Sat, 17 Dec 2016 19:58:29 -0500
Subject: Deliver on HIT, otherwise redirect using "503; Location: ..."
In-Reply-To:
References:
Message-ID:

Hi Anton,

Have you looked into the "do_stream" feature of Varnish? This will begin serving the content to the visitor without waiting for the entire object to be downloaded and stored in cache. Set in vcl_backend_response.
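As a sketch, forcing it for video responses would look like this (the Content-Type match is only an example; note that do_stream already defaults to on in Varnish 4.x, so setting it explicitly mostly documents intent):

```vcl
sub vcl_backend_response {
    # Stream large video objects to the client while the body is
    # still being fetched into the cache.
    if (beresp.http.Content-Type ~ "^video/") {
        set beresp.do_stream = true;
    }
}
```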
https://github.com/mattiasgeniar/varnish-4.0-configuration-templates/blob/master/default.vcl Cheers, Mark On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov wrote: > Hello. > > Switched to Varnish from Nginx for additional functionality and better > control of handling requests. > But still can't implement what i want. And I want simple behaviour > "Redirect on MISS/PASS". > I want to use VC for deploying quick "cdn" servers for our > mp4-video-servers (used for HTML5 players), without need to store all > files on this quick (ssd, upto 2x480GB space, full database about 6TB). > > Currently we have 6 servers with SATA HDDs and hitting iowait like a > trucks :) > > Examples: > - Request -> Varnish -> HIT: serve it using Varnish. > - Request -> Varnish -> MISS: start caching data from backend, and > instantly reply to client: `Location: http://backend/$req.url" > - Request -> Varnish -> UPDATE: see `-> MISS` behaviour. > > From my perspective, i should do this "detach & reply redirect" > somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood > correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, > i need vcl_backend_response to keep run in background (as additional > thread) while doing return(synth(...)) to redirect user. > > Similiar thing is "hitting stale content while object is updating". > But in my case "replying redirect while object is updating". > > Also, i pray to implement this without writing additional scripts, why? > I could do external php/ruby checker/cache-pusher with nginx & etc. 
But > scared by performance downgrade :( > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From bubonic.pestilence at gmail.com Sun Dec 18 01:05:39 2016 From: bubonic.pestilence at gmail.com (Anton Berezhkov) Date: Sun, 18 Dec 2016 04:05:39 +0300 Subject: Deliver on HIT, otherwise redirect using "503; Location: ..." In-Reply-To: References: Message-ID: <97AA9480-E989-4E3C-88A3-0D58A251D5A9@gmail.com> Hello. Yes, i already did it, and noticed change. But this not what i'm looking for. As this option doesn't provide: "instantly reply to client Location...", i'm forced to return(deliver). I'll repeat: - If MISS: - - Thread #1: start caching data into storage - - Main thread: reply to client: synth(503, "Moved Temporarily") - If UPDAING: - - Main thread: reply to client: synth(503, "Moved Temporarily") - If HIT: - - Main thread: serve cached data And beresp.do_stream is just non-blocking IO. While i'm looking for "Detaching backend-to-cache-routine into different thread, while main thread responses to client 'Location: http://backend/$req.url'" > On Dec 18, 2016, at 3:58 AM, Mark Staudinger wrote: > > Hi Anton, > > Have you looked into the "do_stream" feature of Varnish? This will begin serving the content to the visitor without waiting for the entire object to be downloaded and stored in cache. Set in vcl_backend_response. > > https://github.com/mattiasgeniar/varnish-4.0-configuration-templates/blob/master/default.vcl > > Cheers, > Mark > > On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov wrote: > >> Hello. >> >> Switched to Varnish from Nginx for additional functionality and better control of handling requests. >> But still can't implement what i want. And I want simple behaviour "Redirect on MISS/PASS". 
>> I want to use VC for deploying quick "cdn" servers for our mp4-video-servers (used for HTML5 players), without need to store all files on this quick (ssd, upto 2x480GB space, full database about 6TB).
>>
>> Currently we have 6 servers with SATA HDDs and hitting iowait like a trucks :)
>>
>> Examples:
>> - Request -> Varnish -> HIT: serve it using Varnish.
>> - Request -> Varnish -> MISS: start caching data from backend, and instantly reply to client: `Location: http://backend/$req.url"
>> - Request -> Varnish -> UPDATE: see `-> MISS` behaviour.
>>
>> From my perspective, i should do this "detach & reply redirect" somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, i need vcl_backend_response to keep run in background (as additional thread) while doing return(synth(...)) to redirect user.
>>
>> Similiar thing is "hitting stale content while object is updating".
>> But in my case "replying redirect while object is updating".
>>
>> Also, i pray to implement this without writing additional scripts, why? I could do external php/ruby checker/cache-pusher with nginx & etc. But scared by performance downgrade :(
>> _______________________________________________
>> varnish-misc mailing list
>> varnish-misc at varnish-cache.org
>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

From bubonic.pestilence at gmail.com Sun Dec 18 01:09:31 2016
From: bubonic.pestilence at gmail.com (Anton Berezhkov)
Date: Sun, 18 Dec 2016 04:09:31 +0300
Subject: Deliver on HIT, otherwise redirect using "503; Location: ..."
In-Reply-To:
References:
Message-ID:

This is how I semi-implemented it: http://pastebin.com/drDP8JxP
Now I need to use a script which will run "curl -I -X PUT ".

> On Dec 18, 2016, at 3:58 AM, Mark Staudinger wrote:
>
> Hi Anton,
>
> Have you looked into the "do_stream" feature of Varnish?
This will begin serving the content to the visitor without waiting for the entire object to be downloaded and stored in cache. Set in vcl_backend_response. > > https://github.com/mattiasgeniar/varnish-4.0-configuration-templates/blob/master/default.vcl > > Cheers, > Mark > > On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov wrote: > >> Hello. >> >> Switched to Varnish from Nginx for additional functionality and better control of handling requests. >> But still can't implement what i want. And I want simple behaviour "Redirect on MISS/PASS". >> I want to use VC for deploying quick "cdn" servers for our mp4-video-servers (used for HTML5 players), without need to store all files on this quick (ssd, upto 2x480GB space, full database about 6TB). >> >> Currently we have 6 servers with SATA HDDs and hitting iowait like a trucks :) >> >> Examples: >> - Request -> Varnish -> HIT: serve it using Varnish. >> - Request -> Varnish -> MISS: start caching data from backend, and instantly reply to client: `Location: http://backend/$req.url" >> - Request -> Varnish -> UPDATE: see `-> MISS` behaviour. >> >> From my perspective, i should do this "detach & reply redirect" somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, i need vcl_backend_response to keep run in background (as additional thread) while doing return(synth(...)) to redirect user. >> >> Similiar thing is "hitting stale content while object is updating". >> But in my case "replying redirect while object is updating". >> >> Also, i pray to implement this without writing additional scripts, why? I could do external php/ruby checker/cache-pusher with nginx & etc. 
But scared by performance downgrade :( >> _______________________________________________ >> varnish-misc mailing list >> varnish-misc at varnish-cache.org >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From japrice at gmail.com Sun Dec 18 18:48:08 2016 From: japrice at gmail.com (Jason Price) Date: Sun, 18 Dec 2016 13:48:08 -0500 Subject: Deliver on HIT, otherwise redirect using "503; Location: ..." In-Reply-To: References: Message-ID: It would be possible to do this with varnish... but I have to ask... why bother? If the purpose is to offload the IO load, then varnish is good, but you need to prime the cache... TBH, what I'd do first is put one or a pair of varnish boxes really close to the overloaded box, and force all traffic to that server through the close varnish boxes... using the do_stream feature, you'll get stuff out there fairly quickly. After that is working nicely, I'd layer in the further out varnish boxes which interact with the near-varnish boxes to get their data. This works well at scale since the local caches offer whatever's useful local to them, and the 'near-varnish' boxes handle the 'global caching' world. This was how I arranged it at $PreviousGig and the outer CDN was getting a 85-90% cache hit ratio, and the inner tier was seeing 60% cache hit ratio's. (The inner tier's ratio will depend heavily on how many outer tier's there are...) On Sat, Dec 17, 2016 at 8:09 PM, Anton Berezhkov < bubonic.pestilence at gmail.com> wrote: > This is how I semi-implemented: http://pastebin.com/drDP8JxP > Now i need to use script which will run "curi -I -X PUT > ". > > > > On Dec 18, 2016, at 3:58 AM, Mark Staudinger > wrote: > > > > Hi Anton, > > > > Have you looked into the "do_stream" feature of Varnish? This will > begin serving the content to the visitor without waiting for the entire > object to be downloaded and stored in cache. Set in vcl_backend_response. 
> > > > https://github.com/mattiasgeniar/varnish-4.0- > configuration-templates/blob/master/default.vcl > > > > Cheers, > > Mark > > > > On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov < > bubonic.pestilence at gmail.com> wrote: > > > >> Hello. > >> > >> Switched to Varnish from Nginx for additional functionality and better > control of handling requests. > >> But still can't implement what i want. And I want simple behaviour > "Redirect on MISS/PASS". > >> I want to use VC for deploying quick "cdn" servers for our > mp4-video-servers (used for HTML5 players), without need to store all files > on this quick (ssd, upto 2x480GB space, full database about 6TB). > >> > >> Currently we have 6 servers with SATA HDDs and hitting iowait like a > trucks :) > >> > >> Examples: > >> - Request -> Varnish -> HIT: serve it using Varnish. > >> - Request -> Varnish -> MISS: start caching data from backend, and > instantly reply to client: `Location: http://backend/$req.url" > >> - Request -> Varnish -> UPDATE: see `-> MISS` behaviour. > >> > >> From my perspective, i should do this "detach & reply redirect" > somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood > correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, i > need vcl_backend_response to keep run in background (as additional thread) > while doing return(synth(...)) to redirect user. > >> > >> Similiar thing is "hitting stale content while object is updating". > >> But in my case "replying redirect while object is updating". > >> > >> Also, i pray to implement this without writing additional scripts, why? > I could do external php/ruby checker/cache-pusher with nginx & etc. 
But > scared by performance downgrade :( > >> _______________________________________________ > >> varnish-misc mailing list > >> varnish-misc at varnish-cache.org > >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guillaume at varnish-software.com Sun Dec 18 19:59:26 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Sun, 18 Dec 2016 20:59:26 +0100 Subject: Deliver on HIT, otherwise redirect using "503; Location: ..." In-Reply-To: References: Message-ID: I think Jason is right in asking "why?". What do you want to achieve specifically with this behavior? Varnish has streaming and request coalescing, meaning a request can be served as soon as data starts being available AND the backend doesn't suffer from simultaneous misses on the same object. I feel that should cover almost all your needs, so I'm curious about the use-case. On Dec 18, 2016 20:27, "Jason Price" wrote: > It would be possible to do this with varnish... but I have to ask... why > bother? > > If the purpose is to offload the IO load, then varnish is good, but you > need to prime the cache... TBH, what I'd do first is put one or a pair of > varnish boxes really close to the overloaded box, and force all traffic to > that server through the close varnish boxes... using the do_stream feature, > you'll get stuff out there fairly quickly. > > After that is working nicely, I'd layer in the further out varnish boxes > which interact with the near-varnish boxes to get their data. > > This works well at scale since the local caches offer whatever's useful > local to them, and the 'near-varnish' boxes handle the 'global caching' > world. 
> > This was how I arranged it at $PreviousGig and the outer CDN was getting a > 85-90% cache hit ratio, and the inner tier was seeing 60% cache hit > ratio's. (The inner tier's ratio will depend heavily on how many outer > tier's there are...) > > On Sat, Dec 17, 2016 at 8:09 PM, Anton Berezhkov < > bubonic.pestilence at gmail.com> wrote: > >> This is how I semi-implemented: http://pastebin.com/drDP8JxP >> Now i need to use script which will run "curi -I -X PUT >> ". >> >> >> > On Dec 18, 2016, at 3:58 AM, Mark Staudinger >> wrote: >> > >> > Hi Anton, >> > >> > Have you looked into the "do_stream" feature of Varnish? This will >> begin serving the content to the visitor without waiting for the entire >> object to be downloaded and stored in cache. Set in vcl_backend_response. >> > >> > https://github.com/mattiasgeniar/varnish-4.0-configuration- >> templates/blob/master/default.vcl >> > >> > Cheers, >> > Mark >> > >> > On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov < >> bubonic.pestilence at gmail.com> wrote: >> > >> >> Hello. >> >> >> >> Switched to Varnish from Nginx for additional functionality and better >> control of handling requests. >> >> But still can't implement what i want. And I want simple behaviour >> "Redirect on MISS/PASS". >> >> I want to use VC for deploying quick "cdn" servers for our >> mp4-video-servers (used for HTML5 players), without need to store all files >> on this quick (ssd, upto 2x480GB space, full database about 6TB). >> >> >> >> Currently we have 6 servers with SATA HDDs and hitting iowait like a >> trucks :) >> >> >> >> Examples: >> >> - Request -> Varnish -> HIT: serve it using Varnish. >> >> - Request -> Varnish -> MISS: start caching data from backend, and >> instantly reply to client: `Location: http://backend/$req.url" >> >> - Request -> Varnish -> UPDATE: see `-> MISS` behaviour. 
>> >> >> >> From my perspective, i should do this "detach & reply redirect" >> somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood >> correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, >> i need vcl_backend_response to keep run in background (as additional >> thread) while doing return(synth(...)) to redirect user. >> >> >> >> Similiar thing is "hitting stale content while object is updating". >> >> But in my case "replying redirect while object is updating". >> >> >> >> Also, i pray to implement this without writing additional scripts, >> why? I could do external php/ruby checker/cache-pusher with nginx & etc. >> But scared by performance downgrade :( >> >> _______________________________________________ >> >> varnish-misc mailing list >> >> varnish-misc at varnish-cache.org >> >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc >> >> >> _______________________________________________ >> varnish-misc mailing list >> varnish-misc at varnish-cache.org >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc >> > > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bubonic.pestilence at gmail.com Sun Dec 18 20:21:41 2016 From: bubonic.pestilence at gmail.com (Anton Berezhkov) Date: Sun, 18 Dec 2016 23:21:41 +0300 Subject: Deliver on HIT, otherwise redirect using "503; Location: ..." In-Reply-To: References: Message-ID: We have 6 servers with 6TB on each one (same video files). Currently they're hitting the iowait limit of their SATA disks (240 ops total). At the same time, each server can provide 500Mbit of guaranteed bandwidth. With the HDD restriction, each server provides about 320Mbit.
There is also a problem with fragmentation, caused by the nature of HTML5 video players & HTTP (which allow requesting partial data with the Range header). Until now, we were scaling horizontally by duplicating these servers. There is also an option to get the same server with 2x480GB SSDs. As I researched from the nginx logs, 98% of daily traffic lies in ~800GB of files. What I want to achieve: to build a varnish server with 2x480GB SSDs (no raid), with about 800GB of storage for varnish, which would be guaranteed to fill all available bandwidth for a server. Also, I built a simple load-balancer, which monitors each server's current eth0 load (in Mbps) and decides which one to redirect to (using the HTTP Location header). Request for Video -> LBServer: Find lowest loaded (1 of 6) & Redirect to LBNode -> LBN: serve request To add a new HDD-LBN, I need to set up the server, sync videos, and set up some additional software. My wish: add a new SSD-LBN, set up & sync the varnish config, and it will build its cache pool itself. Why do I need the redirect? 1. It will offload the SSD-LBN's bandwidth; pass-through would take bandwidth from both servers and still cause iowait problems on the HDD-LBN. 2. It will "prove" that uncached video will be taken from an HDD-LBN, which always has all videos. Currently all LBN servers are hosted on OVH and we're good with them, especially because of the low price :) If you have any suggestions, I'll be glad to hear them :) > On Dec 18, 2016, at 10:59 PM, Guillaume Quintard wrote: > > I think Jason is right in asking "why?". What do you want to achieve specifically with this behavior? > > Varnish has streaming and request coalescing, meaning a request can be served as soon as data starts being available AND the backend doesn't suffer from simultaneous misses on the same object. I feel that should cover almost all your needs, so I'm curious about the use-case. > > On Dec 18, 2016 20:27, "Jason Price" wrote: > It would be possible to do this with varnish... but I have to ask... why bother?
> > If the purpose is to offload the IO load, then varnish is good, but you need to prime the cache... TBH, what I'd do first is put one or a pair of varnish boxes really close to the overloaded box, and force all traffic to that server through the close varnish boxes... using the do_stream feature, you'll get stuff out there fairly quickly. > > After that is working nicely, I'd layer in the further out varnish boxes which interact with the near-varnish boxes to get their data. > > This works well at scale since the local caches offer whatever's useful local to them, and the 'near-varnish' boxes handle the 'global caching' world. > > This was how I arranged it at $PreviousGig and the outer CDN was getting a 85-90% cache hit ratio, and the inner tier was seeing 60% cache hit ratio's. (The inner tier's ratio will depend heavily on how many outer tier's there are...) > > On Sat, Dec 17, 2016 at 8:09 PM, Anton Berezhkov wrote: > This is how I semi-implemented: http://pastebin.com/drDP8JxP > Now i need to use script which will run "curi -I -X PUT ". > > > > On Dec 18, 2016, at 3:58 AM, Mark Staudinger wrote: > > > > Hi Anton, > > > > Have you looked into the "do_stream" feature of Varnish? This will begin serving the content to the visitor without waiting for the entire object to be downloaded and stored in cache. Set in vcl_backend_response. > > > > https://github.com/mattiasgeniar/varnish-4.0-configuration-templates/blob/master/default.vcl > > > > Cheers, > > Mark > > > > On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov wrote: > > > >> Hello. > >> > >> Switched to Varnish from Nginx for additional functionality and better control of handling requests. > >> But still can't implement what i want. And I want simple behaviour "Redirect on MISS/PASS". > >> I want to use VC for deploying quick "cdn" servers for our mp4-video-servers (used for HTML5 players), without need to store all files on this quick (ssd, upto 2x480GB space, full database about 6TB). 
> >> > >> Currently we have 6 servers with SATA HDDs and hitting iowait like a trucks :) > >> > >> Examples: > >> - Request -> Varnish -> HIT: serve it using Varnish. > >> - Request -> Varnish -> MISS: start caching data from backend, and instantly reply to client: `Location: http://backend/$req.url" > >> - Request -> Varnish -> UPDATE: see `-> MISS` behaviour. > >> > >> From my perspective, i should do this "detach & reply redirect" somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, i need vcl_backend_response to keep run in background (as additional thread) while doing return(synth(...)) to redirect user. > >> > >> Similiar thing is "hitting stale content while object is updating". > >> But in my case "replying redirect while object is updating". > >> > >> Also, i pray to implement this without writing additional scripts, why? I could do external php/ruby checker/cache-pusher with nginx & etc. But scared by performance downgrade :( > >> _______________________________________________ > >> varnish-misc mailing list > >> varnish-misc at varnish-cache.org > >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc From mathias.fuerstenau at gfz-potsdam.de Tue Dec 20 07:30:32 2016 From: mathias.fuerstenau at gfz-potsdam.de (=?utf-8?Q?Mathias_F=C3=BCrstenau?=) Date: Tue, 20 Dec 2016 08:30:32 +0100 Subject: Varnish on Solaris 11 Message-ID: <2C6170FA-C932-46AA-A65A-BEB16772FAFA@gfz-potsdam.de> Hello everyone! I have a little bit trouble with Varnish on Solaris 11. 
The problem is that I have no config file for Varnish. I always have to start with "varnishd -f /etc/varnish/default.vcl -s malloc,1G -a 0.0.0.0:80 -p thread_pool_min=400". I also have no Varnish service or something else. There is no varnish.service or varnish.params. There is also no varnish in init.d. It is very suspicious. I hope you can help. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Tue Dec 20 08:50:55 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 20 Dec 2016 09:50:55 +0100 Subject: Varnish on Solaris 11 In-Reply-To: <2C6170FA-C932-46AA-A65A-BEB16772FAFA@gfz-potsdam.de> References: <2C6170FA-C932-46AA-A65A-BEB16772FAFA@gfz-potsdam.de> Message-ID: On Tue, Dec 20, 2016 at 8:30 AM, Mathias Fürstenau wrote: > Hello everyone! > > I have a little bit trouble with Varnish on Solaris 11. The problem is that > I have no config file for Varnish. I always have to start with "varnishd -f > /etc/varnish/default.vcl -s malloc,1G -a 0.0.0.0:80 -p thread_pool_min=400". > I also have no Varnish service or something else. There is no > varnish.service or varnish.params. There is also no varnish in init.d. It is > very suspicious. I hope you can help. Hello Mathias, Yes, service files are located in the pkg-varnish-cache repo, and are specific to Debian and Red Hat derivatives. I'm afraid we don't have any for Solaris. Dridi From A.Hongens at netmatch.nl Tue Dec 20 14:37:24 2016 From: A.Hongens at netmatch.nl (=?iso-8859-1?Q?Angelo_H=F6ngens?=) Date: Tue, 20 Dec 2016 14:37:24 +0000 Subject: stale-if-error in varnish 4.1? Message-ID: Hey guys and girls, I got a question from one of my webdevs about whether and how varnish can do stale-if-error. I can't find much about the subject. I found a document[1] stating: "Varnish 4.1 implements stale-while-revalidate for the first time, but not stale-if-error."
I found an article on fastly.com [2] dating from 2014 that says: "Fastly is excited to announce that as of today, we support stale-while-revalidate and stale-if-error.", so that would seem to contradict the previous statement. Fastly uses normal varnish, right? Googling some more I found another page on docs.fastly.com [3] that explains how to do stale-if-error: do some custom VCL hacking. They do stuff with 'stale.exists' and restarts. I've been working with varnish for a while, but it looks like my varnish 4.1 instance doesn't know the 'stale.exists' thingy and I have no idea how restarts work. 2 questions for you smart guys and girls: - Is native stale-if-error on the roadmap? - How can I hack stale-if-error in my VCL using my varnish41 machines? References: [1] https://varnishfoo.info/varnishfoo.pdf [2] https://www.fastly.com/blog/stale-while-revalidate-stale-if-error-available-today [3] https://docs.fastly.com/guides/performance-tuning/serving-stale-content -- With kind regards, Angelo Höngens Systems Administrator ------------------------------------------ NetMatch travel technology solutions Professor Donderstraat 46 5017 HL Tilburg T: +31 (0)13 5811088 F: +31 (0)13 5821239 mailto:A.Hongens at netmatch.nl http://www.netmatch.nl ------------------------------------------ Disclaimer Deze e-mail is vertrouwelijk en uitsluitend bedoeld voor geadresseerde(n) en de organisatie van geadresseerde(n) en mag niet openbaar worden gemaakt aan derde partijen This e-mail is confidential and may not be disclosed to third parties since this e-mail is only intended for the addressee and the organization the addressee represents. From dridi at varni.sh Tue Dec 20 15:28:28 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 20 Dec 2016 16:28:28 +0100 Subject: stale-if-error in varnish 4.1? In-Reply-To: References: Message-ID: > Fastly uses normal varnish, right? A fork of Varnish, for various reasons.
> 2 questions for you smart guys and girls: > > - Is native stale-if-error on the roadmap? I think not. > - How can I hack stale-if-error in my VCL using my varnish41 machines? Fed's already done it [1] so you should be able to reuse his snippet. Cheers [1] https://github.com/fgsch/vcl-snippets/blob/master/v4/stale-if-error.vcl -------------- next part -------------- An HTML attachment was scrubbed... URL: From guillaume at varnish-software.com Tue Dec 20 16:02:33 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Tue, 20 Dec 2016 17:02:33 +0100 Subject: Deliver on HIT, otherwise redirect using "503; Location: ..." In-Reply-To: References: Message-ID: Thanks for the context. So, if I get what you're writing, the goal is to redirect to a node that has the object in cache? Question is, what's the time needed to: - send a request to the server on the LAN and receive the object - send the redirect across the web, and wait for the client to send a new request, again across the web. If the former is not at least an order of magnitude larger than the latter, I wouldn't bother. The issues I have with your redirection scheme are that: - IIUC, you are basically explaining to people where the backend is, instead of shielding it with Varnish - it doesn't lower the backend traffic - as said, I'm not even sure the user-experience is better/faster -- Guillaume Quintard On Sun, Dec 18, 2016 at 9:21 PM, Anton Berezhkov < bubonic.pestilence at gmail.com> wrote: > We have 6 servers with 6TB on each one (same video files). Currently > they're hitting iowait limit of SATA disk (240ops total). At same time, > each server can provide 500Mbit of guaranteed bandwidth. > > With HDD restriction, each server provides about 320Mbit. There is also > problem with fragmentation, which is caused by nature of HTML5 video > players & HTTP (allowing to request partial data with Range header). > > Before this moment, we were scaling horizontally, by duplicating this > servers.
> > There is also option to get same server with 2x480GB SSDs. As i > reasearched from nginx logs, 98% of daily traffic lays in ?800GB of files. > > What i want to achieve: To build a varnish server with 2x480GB SSDs(no > raid), with storage for varnish about 800GBs. Which will guarantedly fill > all available bandwidth for a server. > > Also, I built simple load-balancer, which monitors each server for current > eth0 load (in Mbps) and decide to which one redirect (using HTTP Location > header). > > Request for Video -> LBServer: Find lowest loaded(1 of 6) & Redirect to > LBNode -> LBN: serve request > > To add new HDD-LBN, i need to setup server, sync videos, setup some > additional software. > > My wish: add new SSD-LBN, setup & sync varnish config, and it will build > cached pool itself. > > Why i need redirect? > 1. It will offload bandwidth of SSD-LBN, pass-through will take bandwidth > of both servers + still cause iowait problems on HDD-LBN. > 2. It will "prove" that uncached video will be take from HDD-LBN which > always have all videos. > > Currently all LBN servers are hosted on OVH and we're good with them, > especially because of low price :) > > If you have any suggestions, i'll be glad to hear them :) > > > On Dec 18, 2016, at 10:59 PM, Guillaume Quintard < > guillaume at varnish-software.com> wrote: > > > > I think Jason is right in asking "why?". What do you want to achieve > specifically with this behavior? > > > > Varnish has streaming and request coalescing, meaning a request can be > served as soon as data starts being available AND the backend doesn't > suffer from simultaneous misses on the same object. I feel that should > cover almost all your needs, so I'm curious about the use-case. > > > > On Dec 18, 2016 20:27, "Jason Price" wrote: > > It would be possible to do this with varnish... but I have to ask... why > bother? > > > > If the purpose is to offload the IO load, then varnish is good, but you > need to prime the cache... 
TBH, what I'd do first is put one or a pair of > varnish boxes really close to the overloaded box, and force all traffic to > that server through the close varnish boxes... using the do_stream feature, > you'll get stuff out there fairly quickly. > > > > After that is working nicely, I'd layer in the further out varnish boxes > which interact with the near-varnish boxes to get their data. > > > > This works well at scale since the local caches offer whatever's useful > local to them, and the 'near-varnish' boxes handle the 'global caching' > world. > > > > This was how I arranged it at $PreviousGig and the outer CDN was getting > a 85-90% cache hit ratio, and the inner tier was seeing 60% cache hit > ratio's. (The inner tier's ratio will depend heavily on how many outer > tier's there are...) > > > > On Sat, Dec 17, 2016 at 8:09 PM, Anton Berezhkov < > bubonic.pestilence at gmail.com> wrote: > > This is how I semi-implemented: http://pastebin.com/drDP8JxP > > Now i need to use script which will run "curi -I -X PUT > ". > > > > > > > On Dec 18, 2016, at 3:58 AM, Mark Staudinger > wrote: > > > > > > Hi Anton, > > > > > > Have you looked into the "do_stream" feature of Varnish? This will > begin serving the content to the visitor without waiting for the entire > object to be downloaded and stored in cache. Set in vcl_backend_response. > > > > > > https://github.com/mattiasgeniar/varnish-4.0- > configuration-templates/blob/master/default.vcl > > > > > > Cheers, > > > Mark > > > > > > On Sat, 17 Dec 2016 19:05:48 -0500, Anton Berezhkov < > bubonic.pestilence at gmail.com> wrote: > > > > > >> Hello. > > >> > > >> Switched to Varnish from Nginx for additional functionality and > better control of handling requests. > > >> But still can't implement what i want. And I want simple behaviour > "Redirect on MISS/PASS". 
> > >> I want to use VC for deploying quick "cdn" servers for our > mp4-video-servers (used for HTML5 players), without need to store all files > on this quick (ssd, upto 2x480GB space, full database about 6TB). > > >> > > >> Currently we have 6 servers with SATA HDDs and hitting iowait like a > trucks :) > > >> > > >> Examples: > > >> - Request -> Varnish -> HIT: serve it using Varnish. > > >> - Request -> Varnish -> MISS: start caching data from backend, and > instantly reply to client: `Location: http://backend/$req.url" > > >> - Request -> Varnish -> UPDATE: see `-> MISS` behaviour. > > >> > > >> From my perspective, i should do this "detach & reply redirect" > somewhere in `vcl_miss` OR `vcl_backend_fetch`, because if i understood > correctly https://www.varnish-cache.org/docs/4.1/reference/states.html, i > need vcl_backend_response to keep run in background (as additional thread) > while doing return(synth(...)) to redirect user. > > >> > > >> Similiar thing is "hitting stale content while object is updating". > > >> But in my case "replying redirect while object is updating". > > >> > > >> Also, i pray to implement this without writing additional scripts, > why? I could do external php/ruby checker/cache-pusher with nginx & etc. > But scared by performance downgrade :( > > >> _______________________________________________ > > >> varnish-misc mailing list > > >> varnish-misc at varnish-cache.org > > >> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > > > _______________________________________________ > > varnish-misc mailing list > > varnish-misc at varnish-cache.org > > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > > > _______________________________________________ > > varnish-misc mailing list > > varnish-misc at varnish-cache.org > > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sander at hoentjen.eu Fri Dec 23 10:18:23 2016 From: sander at hoentjen.eu (Sander Hoentjen) Date: Fri, 23 Dec 2016 11:18:23 +0100 Subject: Proxy Protocol - CLIENT_SSL Message-ID: <57c9c467-b4d1-ae6a-0328-acf8c35ded5e@hoentjen.eu> Hi list, I have a question about both Hitch and Varnish: Does hitch support (i.e. define) PP2_CLIENT_SSL from proxy-protocol [1]? The follow-up question is: Can Varnish proxy this information (in essence, just keep the proxy header as-is)? Regards, Sander From guillaume at varnish-software.com Sun Dec 25 09:46:46 2016 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Sun, 25 Dec 2016 10:46:46 +0100 Subject: Grace mode with 404 responses? In-Reply-To: <57811D62-4862-4953-A5D1-A4B4D12BF631@gmail.com> References: <57811D62-4862-4953-A5D1-A4B4D12BF631@gmail.com> Message-ID: Hi Peter, what is not working in the case of 404? -- Guillaume Quintard On Fri, Dec 2, 2016 at 9:11 AM, Peter Loron wrote: > Hello, folks. I'm testing a Varnish 4.1.1 setup. Specifically I'm working > with shielding a client service from an unreliable remote service by > proxying and caching responses from the remote system. > > Grace mode is working fine for situations where the remote server is > unresponsive (e.g. apache turned off), but does not seem to work when a 404 > is returned for a resource. I immediately get the 404 on the client when > the TTL (not the TTL + grace) is expired. > > I'm assuming this is something I'm not doing right in my VCL file, but I > can't figure out what. > > Is there a canonical example of how to handle this properly? > > Thanks. > > -Pete > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jack.xsuperman at gmail.com Wed Dec 28 09:12:54 2016 From: jack.xsuperman at gmail.com (JackDrogon) Date: Wed, 28 Dec 2016 17:12:54 +0800 Subject: Varnish backend timeout doesn't work with keep alive backend Message-ID: Hi all, I set up my varnish backend like this: backend default { .host = "127.0.0.1"; .port = "4567"; .connect_timeout = 1s; .first_byte_timeout = 2s; .between_bytes_timeout = 1s; } The first request's timeout is right, but if the previous connection succeeded, varnish uses keepalive to handle the next connection for the same url, and the timeout conf doesn't work. How can I make the timeouts work for all connections, even with backend keepalive? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From apj at mutt.dk Wed Dec 28 09:24:43 2016 From: apj at mutt.dk (Andreas Plesner) Date: Wed, 28 Dec 2016 10:24:43 +0100 Subject: Varnish backend timeout doesn't work with keep alive backend In-Reply-To: References: Message-ID: <20161228092443.GZ22017@nerd.dk> On Wed, Dec 28, 2016 at 05:12:54PM +0800, JackDrogon wrote: > I set varnish backend that: > backend default { .host = "127.0.0.1"; .port = "4567"; .connect_timeout > = 1s; .first_byte_timeout = 2s; .between_bytes_timeout = 1s; } > The first timeout is right, but if the prev connection is right, > varnish use keepalive to deal with the next connection with same url > and the timeout conf doesn't wok. Please elaborate. Describe the situation that doesn't work as you expect, and what you would expect to happen.
-- Andreas From dridi at varni.sh Wed Dec 28 09:55:09 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Wed, 28 Dec 2016 10:55:09 +0100 Subject: Varnish backend timeout doesn't work with keep alive backend In-Reply-To: <20161228092443.GZ22017@nerd.dk> References: <20161228092443.GZ22017@nerd.dk> Message-ID: On Wed, Dec 28, 2016 at 10:24 AM, Andreas Plesner wrote: > On Wed, Dec 28, 2016 at 05:12:54PM +0800, JackDrogon wrote: > >> I set varnish backend that: >> backend default { .host = "127.0.0.1"; .port = "4567"; .connect_timeout >> = 1s; .first_byte_timeout = 2s; .between_bytes_timeout = 1s; } >> The first timeout is right, but if the prev connection is right, >> varnish use keepalive to deal with the next connection with same url >> and the timeout conf doesn't wok. > > Please elaborate. Describe the situation that doesn't work as you expect, and what you would expect to happen. Possibly a known problem [1] for first_byte_timeout, apparently still an open issue. Dridi [1] https://github.com/varnishcache/varnish-cache/issues/1772 From jack.xsuperman at gmail.com Wed Dec 28 10:45:51 2016 From: jack.xsuperman at gmail.com (JackDrogon) Date: Wed, 28 Dec 2016 18:45:51 +0800 Subject: Varnish backend timeout doesn't work with keep alive backend In-Reply-To: References: <20161228092443.GZ22017@nerd.dk> Message-ID: OK. I run a go server, such as: func handler(res http.ResponseWriter, req *http.Request) { atomic.AddUint32(&times, 1) number := rand.Intn(100) fmt.Printf("Random number: %d.\n", number) t := atomic.LoadUint32(&times) if t%4 == 1 { time.Sleep(7 * time.Second) } fmt.Fprintf(res, "Random number: %d.\n", number) fmt.Println("=========================") } I use this as the varnish backend, set hash_always_miss, and curl 127.0.0.1:8000 5 times.
curl 127.0.0.1:8000 0.00s user 0.00s system 0% cpu 2 total curl 127.0.0.1:8000 0.00s user 0.00s system 77% cpu 0.005 total curl 127.0.0.1:8000 0.00s user 0.00s system 66% cpu 0.005 total curl 127.0.0.1:8000 0.00s user 0.00s system 68% cpu 0.004 total curl 127.0.0.1:8000 0.00s user 0.00s system 0% cpu 7.005 total I think the fifth request's time usage should be 2s, but it's 7s. How can I set varnish to make it 2s? On 28 December 2016 at 18:19:21, Boukelmoune Dridi (dridi at varni.sh) wrote: On Wed, Dec 28, 2016 at 10:24 AM, Andreas Plesner wrote: > On Wed, Dec 28, 2016 at 05:12:54PM +0800, JackDrogon wrote: > >> I set varnish backend that: >> backend default { .host = "127.0.0.1"; .port = "4567"; .connect_timeout >> = 1s; .first_byte_timeout = 2s; .between_bytes_timeout = 1s; } >> The first timeout is right, but if the prev connection is right, >> varnish use keepalive to deal with the next connection with same url >> and the timeout conf doesn't wok. > > Please elaborate. Describe the situation that doesn't work as you expect, and what you would expect to happen. Possibly a known problem [1] for first_byte_timeout, apparently still an open issue. Dridi [1] https://github.com/varnishcache/varnish-cache/issues/1772 _______________________________________________ varnish-misc mailing list varnish-misc at varnish-cache.org https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Wed Dec 28 12:54:42 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Wed, 28 Dec 2016 13:54:42 +0100 Subject: Varnish backend timeout doesn't work with keep alive backend In-Reply-To: References: <20161228092443.GZ22017@nerd.dk> Message-ID: On Wed, Dec 28, 2016 at 11:45 AM, JackDrogon wrote: > OK.
> I run a go server, such as: > > func handler(res http.ResponseWriter, req *http.Request) { > atomic.AddUint32(&times, 1) > number := rand.Intn(100) > fmt.Printf("Random number: %d.\n", number) > t := atomic.LoadUint32(&times) > if t%4 == 1 { > time.Sleep(7 * time.Second) > } > fmt.Fprintf(res, "Random number: %d.\n", number) > fmt.Println("=========================") > } > > > I use this as varnish backend, and setting hash_always_miss > so I curl 127.0.0.1:8000 for 5 times. > > curl 127.0.0.1:8000 0.00s user 0.00s system 0% cpu 2 total > curl 127.0.0.1:8000 0.00s user 0.00s system 77% cpu 0.005 total > curl 127.0.0.1:8000 0.00s user 0.00s system 66% cpu 0.005 total > curl 127.0.0.1:8000 0.00s user 0.00s system 68% cpu 0.004 total > curl 127.0.0.1:8000 0.00s user 0.00s system 0% cpu 7.005 total > > I think the fifth requet's time usage is 2s, but It's 7s. How can I set > varnish to make it be 2s I thought this bug had been fixed, but apparently not. A workaround would be to disable keep-alive by forcefully closing backend connections. With the obvious performance implication of having to reconnect for every single request. I'll try to remember to discuss this during next Monday's bugwash. Dridi From jack.xsuperman at gmail.com Thu Dec 29 02:44:24 2016 From: jack.xsuperman at gmail.com (JackDrogon) Date: Thu, 29 Dec 2016 10:44:24 +0800 Subject: Varnish backend timeout doesn't work with keep alive backend In-Reply-To: References: <20161228092443.GZ22017@nerd.dk> Message-ID: How do I disable backend keep-alive? I want to test it. On 28 December 2016 at 20:55:23, Boukelmoune Dridi (dridi at varni.sh) wrote: implication -------------- next part -------------- An HTML attachment was scrubbed...
URL: From dridi at varni.sh Thu Dec 29 08:21:06 2016 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 29 Dec 2016 09:21:06 +0100 Subject: Varnish backend timeout doesn't work with keep alive backend In-Reply-To: References: <20161228092443.GZ22017@nerd.dk> Message-ID: On Thu, Dec 29, 2016 at 3:44 AM, JackDrogon wrote: > How to disable backend keep-alive. I want to test it. In vcl_backend_fetch, you can do: set bereq.http.connection = "close"; That's how it's specified in HTTP/1.1, by default a connection is persistent. Dridi From sander at hoentjen.eu Thu Dec 29 15:01:28 2016 From: sander at hoentjen.eu (Sander Hoentjen) Date: Thu, 29 Dec 2016 16:01:28 +0100 Subject: Proxy Protocol - CLIENT_SSL In-Reply-To: <57c9c467-b4d1-ae6a-0328-acf8c35ded5e@hoentjen.eu> References: <57c9c467-b4d1-ae6a-0328-acf8c35ded5e@hoentjen.eu> Message-ID: <79f63449-300a-5f07-79ca-44fae6fa4150@hoentjen.eu> On 12/23/2016 11:18 AM, Sander Hoentjen wrote: > Hi list, > > I have a questioned about both Hitch and Varnish: > Does hitch support (defines) PP2_CLIENT_SSL from proxy-protocol [1]? > The follow-up question is: Can Varnish proxy this information (in > essence just keep the proxy header as-is) > > Regards, > Sander > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > Hmm, it seems I forgot the link to proxy-protocol [1]: http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt And then specifically I am talking about the binary header format (version 2). """ If the length specified in the PROXY protocol header indicates that additional bytes are part of the header beyond the address information, a receiver may choose to skip over and ignore those bytes, or attempt to interpret those bytes. The information in those bytes will be arranged in Type-Length-Value (TLV vectors) in the following format. The first byte is the Type of the vector. 
The second two bytes represent the length in bytes of the value (not
included the Type and Length bytes), and following the length field is
the number of bytes specified by the length.

        struct pp2_tlv {
                uint8_t type;
                uint8_t length_hi;
                uint8_t length_lo;
                uint8_t value[0];
        };

The following types have already been registered for the <type> field :

        #define PP2_TYPE_ALPN           0x01
        #define PP2_TYPE_AUTHORITY      0x02
        #define PP2_TYPE_SSL            0x20
        #define PP2_SUBTYPE_SSL_VERSION 0x21
        #define PP2_SUBTYPE_SSL_CN      0x22
        #define PP2_TYPE_NETNS          0x30
"""

It would be very nice if Hitch supported this, but I can't find any
info on it. If this is not the right mailing list to ask, it would be
nice if someone could point me in the right direction.

Regards,
Sander

From jack.xsuperman at gmail.com  Fri Dec 30 10:10:24 2016
From: jack.xsuperman at gmail.com (JackDrogon)
Date: Fri, 30 Dec 2016 18:10:24 +0800
Subject: Varnish backend timeout doesn't work with keep alive backend
In-Reply-To: 
References: <20161228092443.GZ22017@nerd.dk>
Message-ID: 

It's difficult for me to trace the timeout problem.

I drew a diagram of the backend request.
Tracing varnish, I found that cache/cache_obj.c:ObjWaitState locks,
waiting for the backend response. But the vwe_thread epoll wait time
is not right: it is the default 100s.
When epoll_wait is run, it waits for the backend response
(first_byte_timeout), which in my case is 7s.
If I set cache_waiter_epoll.c:vwe_thread => vwe->next = now + 1, all
is right: the timeout becomes 2s.

I have read some of the varnish code, but not all of it. I want to
know where the code is that adds the backend response header read
event to epoll when keep-alive is enabled.
I guess it's cnt_miss in my case, but I can't see the code; it's in
vgc.so. How can I see the source of the vgc.so code generated by
libvcc?

Thanks.

On 29 December 2016 at 16:21:47, Boukelmoune Dridi (dridi at varni.sh) wrote:

On Thu, Dec 29, 2016 at 3:44 AM, JackDrogon wrote:
> How can I disable backend keep-alive? I want to test it.
In vcl_backend_fetch, you can do:

    set bereq.http.connection = "close";

That's how it's specified in HTTP/1.1: by default a connection is
persistent.

Dridi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2016-12-30 at 14.15.13.png
Type: image/png
Size: 85600 bytes
Desc: not available
URL: 

From dridi at varni.sh  Fri Dec 30 11:13:01 2016
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Fri, 30 Dec 2016 12:13:01 +0100
Subject: Varnish backend timeout doesn't work with keep alive backend
In-Reply-To: 
References: <20161228092443.GZ22017@nerd.dk>
Message-ID: 

On Fri, Dec 30, 2016 at 11:10 AM, JackDrogon wrote:
>
> It's difficult for me to trace the timeout problem.
>
> I drew a diagram of the backend request.
> Tracing varnish, I found that cache/cache_obj.c:ObjWaitState locks,
> waiting for the backend response. But the vwe_thread epoll wait time
> is not right: it is the default 100s.
> When epoll_wait is run, it waits for the backend response
> (first_byte_timeout), which in my case is 7s.
> If I set cache_waiter_epoll.c:vwe_thread => vwe->next = now + 1, all
> is right: the timeout becomes 2s.

Feel free to comment on github in the issue I mentioned if you think
you have a lead. I thought I had seen a commit fix this in the not so
recent past, but I must have confused it with another ticket (probably
1806). As I said, I plan to mention this during the next bugwash.

> I have read some of the varnish code, but not all of it. I want to
> know where the code is that adds the backend response header read
> event to epoll when keep-alive is enabled.
> I guess it's cnt_miss in my case, but I can't see the code; it's in
> vgc.so. How can I see the source of the vgc.so code generated by
> libvcc?

To translate your VCL code to C, see varnishd -C (in the varnishd man).
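As a concrete invocation (the VCL path below is only an example; point
-f at your own file):

```shell
# Compile the VCL program and print the C source that libvcc
# generates on stdout, without actually starting the daemon.
varnishd -C -f /etc/varnish/default.vcl > vgc.c
```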
But this C code will not touch the waiting code; it's only code that
is called by varnishd for each step (sub) in the VCL engine, executing
your VCL logic (reading/setting variables, picking transitions etc).

Dridi

From jack.xsuperman at gmail.com  Fri Dec 30 11:51:40 2016
From: jack.xsuperman at gmail.com (JackDrogon)
Date: Fri, 30 Dec 2016 19:51:40 +0800
Subject: Varnish backend timeout doesn't work with keep alive backend
In-Reply-To: 
References: <20161228092443.GZ22017@nerd.dk>
Message-ID: 

OK, I will discuss the problem on github. The problem is different
from #1806.

On 30 December 2016 at 19:13:42, Boukelmoune Dridi (dridi at varni.sh) wrote:

but I must have confused it with another ticket (probably 1806)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 