From bluethundr at gmail.com Fri Jul 3 20:00:23 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Fri, 3 Jul 2015 16:00:23 -0400
Subject: cache pages with apache auth
In-Reply-To: <5590F358.9060209@lamp-solutions.de>
References: <5590F358.9060209@lamp-solutions.de>
Message-ID:

Hey guys,

Thanks for your suggestions!! However, I'm still not having any luck. This is what I tried. I logged into one of my webservers and then altered the config so that it won't require authorization to access the healthcheck URL:

ServerName wiki.mydomain.com
ServerAlias www.wiki.mydomain.com
Options +Indexes +FollowSymlinks
LogLevel debug
ErrorLog logs/wiki-error.log
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/wiki-access_log common
DocumentRoot /var/www/jf/wiki
*SetEnvIf Request_URI ^/healthcheck.php/ noauth=1*

Options Indexes
AuthType Basic
AuthName "JF Wiki Page"
AuthUserFile /etc/httpd/auth
Require valid-user
*Allow from env=noauth*

Options -Indexes

I've highlighted in bold what I changed in the config. Then copied the config over to the other web server and restarted apache on both.

Both of the web hosts are still turning up as 'sick' in the varnish log:

0 Backend_health - web2 Went sick 4--X-R- 2 3 5 0.015589 0.000000 HTTP/1.1 401 Unauthorized
0 Backend_health - web1 Went sick 4--X-R- 2 3 5 0.045081 0.000000 HTTP/1.1 404 Not Found

So then I tried Paul's suggestion to tell varnish to expect a 401 response for the probe. I'm not sure if I interpreted this request correctly, but this is what I tried:

if ( req.url ~ "^/healthcheck.php") {
    error 401;
}

And there was no change in the result after restarting varnish. Both web servers are still turning up 'sick' in the varnish log.

Could I get some advice on where I'm going wrong here? And maybe if there is another approach I could try, I'd be up for trying anything that might work.

Thanks again for your help!!
Tim

On Mon, Jun 29, 2015 at 3:27 AM, Tobias Eichelbrönner <tobias.eichelbroenner at lamp-solutions.de> wrote:

> Hi Tim,
>
> > Backend_health - web2 Still sick 4--X-R- 0 3 5 0.014946 0.000000
> > HTTP/1.1 401 Unauthorized
>
> It seems to me that your health check on
> .url = "/healthcheck.php";
> does not send any authorization to your backend, so the probing fails.
> The simplest solution is to disable authorization for
> healthcheck.php in your web server.
>
> Keep in mind that if more than one user accesses your restricted area, they
> will probably get the cached content of another user delivered. You could
> put the Authorization header into the hash and give every user a different
> password.
>
> Sincerely,
>
> Tobias
>
> --
> LAMP solutions GmbH
> Gostenhofer Hauptstrasse 35
> 90443 Nuernberg
>
> District court Nuernberg: HRB 22366
> Managing director: Heiko Schubert
>
> Our general terms and conditions apply.
> http://www.lamp-solutions.de/agbs/
>
> Telephone : 0911 / 376 516 0
> Fax : 0911 / 376 516 11
> E-Mail : support at lamp-solutions.de
> Web : www.lamp-solutions.de
> Facebook : http://www.facebook.com/LAMPsolutions
> Twitter : http://twitter.com/#!/lampsolutions
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

From bluethundr at gmail.com Sat Jul 4 01:29:13 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Fri, 3 Jul 2015 21:29:13 -0400
Subject: cache pages with apache auth
In-Reply-To:
References: <5590F358.9060209@lamp-solutions.de>
Message-ID:

Hey guys,

I was actually able to get this to work. So I wanted to share my solution with you.
I had to change my probe definition to put the http headers into the request for the probe:

backend web1 {
  .host = "10.10.10.25";
  .port = "80";
  .connect_timeout = 30s;
  .first_byte_timeout = 30s;
  .between_bytes_timeout = 30s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.mydomain.com"
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}

And it still works with apache auth! In my googling, I found this old varnish ticket:

https://www.varnish-cache.org/trac/ticket/1165

That describes how to get apache auth working with varnish by passing the headers to .request.

And since I'm only using this for a mediawiki, I found a good example VCL on their site that they recommend for using with mediawiki. I've adapted it to my uses and it seems to be doing a good job of caching the site.

I'm really glad this works and I appreciate your input and feedback.

Thanks,
Tim

From bluethundr at gmail.com Sun Jul 5 00:54:28 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Sat, 4 Jul 2015 20:54:28 -0400
Subject: 503 error from varnish
Message-ID:

Hey guys,

I've setup my wiki to be delivered through varnish. My first breakthrough in getting this to work was to allow for http basic auth to pass through varnish so the user will be prompted to login. This is working quite well at this point.

But my current problem is that I've been able to get the main page to the site cached, and a few others. But when I click to like the 4th or 5th page on the wiki that I want to cache through varnish, everything comes to a halt. I get a 503 - service unavailable error, instead of varnish going to the web server to retrieve the page I had clicked on.
In my varnishlog, I'm seeing the following which results in an error:

10 SessionOpen c 10.10.10.25 35852 :80
10 ReqStart c 10.10.10.25 35852 1369002554
10 RxRequest c GET
10 RxURL c /index.php/Affluencers
10 RxProtocol c HTTP/1.1
10 RxHeader c Host: wiki.example.com
10 RxHeader c Cache-Control: max-age=0
10 RxHeader c Authorization: Basic YWRtaW46RHVrMzBmWmgwdQ==
10 RxHeader c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
10 RxHeader c User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36
10 RxHeader c Referer: http://wiki.example.com/index.php/NBCUniversal
10 RxHeader c Accept-Encoding: gzip, deflate, sdch
10 RxHeader c Accept-Language: en-US,en;q=0.8
10 RxHeader c X-Forwarded-For: 47.18.111.100
10 RxHeader c Connection: close
10 VCL_call c recv lookup
10 VCL_call c hash
10 Hash c /index.php/Affluencers
10 Hash c wiki.example.com
10 VCL_return c hash
10 VCL_call c miss fetch
10 FetchError c no backend connection
10 VCL_call c error deliver
10 VCL_call c deliver deliver
10 TxProtocol c HTTP/1.1
10 TxStatus c 503
10 TxResponse c Service Unavailable
10 TxHeader c Server: Varnish
10 TxHeader c Content-Type: text/html; charset=utf-8
10 TxHeader c Retry-After: 5
10 TxHeader c Content-Length: 419
10 TxHeader c Accept-Ranges: bytes
10 TxHeader c Date: Sun, 05 Jul 2015 00:42:16 GMT
10 TxHeader c X-Varnish: 1369002554
10 TxHeader c Age: 0
10 TxHeader c Via: 1.1 varnish
10 TxHeader c Connection: close
10 TxHeader c X-Cache: MISS
10 Length c 419
10 ReqEnd c 1369002554 1436056936.048977137 1436056936.049291372 0.000086308 0.000152111 0.000162125
10 SessionClose c error
10 StatSess c 10.10.10.25 35852 0 1 1 0 0 0 272 419

I'm not sure how to interpret that to be honest. But I'd like some help resolving this issue. :)

One thing that I considered is that I may be running out of memory while trying to cache the site.
The varnish host itself is only a 1GB VM with 40GB of disk space. Here's my varnish system config:

grep -v '#' /etc/sysconfig/varnish
NFILES=131072
MEMLOCK=82000
NPROCS="unlimited"
RELOAD_VCL=1
VARNISH_VCL_CONF=/etc/varnish/default.vcl
VARNISH_LISTEN_PORT=80
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
VARNISH_ADMIN_LISTEN_PORT=6082
VARNISH_SECRET_FILE=/etc/varnish/secret
VARNISH_MIN_THREADS=50
VARNISH_MAX_THREADS=1000
VARNISH_THREAD_TIMEOUT=120
VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin
VARNISH_STORAGE_SIZE=10G
VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
VARNISH_TTL=120
DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
             -f ${VARNISH_VCL_CONF} \
             -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
             -t ${VARNISH_TTL} \
             -w ${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},${VARNISH_THREAD_TIMEOUT} \
             -u varnish -g varnish \
             -S ${VARNISH_SECRET_FILE} \
             -s ${VARNISH_STORAGE}"

And this is my very simple VCL file:

backend web1 {
  .host = "10.10.10.25";
  .port = "80";
  .connect_timeout = 30s;
  .first_byte_timeout = 30s;
  .between_bytes_timeout = 30s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}

backend web2 {
  .host = "10.10.10.26";
  .port = "80";
  .connect_timeout = 30s;
  .first_byte_timeout = 30s;
  .between_bytes_timeout = 30s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}

director www round-robin {
  { .backend = web1; }
  { .backend = web2; }
}

sub vcl_recv {
  set req.backend = www;
  return (lookup);
}

sub vcl_fetch {
  set beresp.ttl = 3600s;
  set beresp.grace = 4h;
  return (deliver);
}

sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}

So I'm wondering if I'm anywhere near the mark that I could be running
out of memory when trying to cache pages to my wiki site. I'd really appreciate any feedback you guys may have!

Thanks,
Tim

From geoff at uplex.de Sun Jul 5 09:35:45 2015
From: geoff at uplex.de (Geoff Simmons)
Date: Sun, 05 Jul 2015 11:35:45 +0200
Subject: 503 error from varnish
In-Reply-To:
References:
Message-ID: <5598FA71.4040209@uplex.de>

On 7/5/15 2:54 AM, Tim Dunphy wrote:
>
> But my current problem is that I've been able to get the main page
> to the site cached, and a few others. But when I click to like the
> 4th or 5th page on the wiki that I want to cache through varnish,
> everything comes to a halt. I get a 503 - service unavailable
> error, instead of varnish going to the web server to retrieve the
> page I had clicked on.

503 errors are in most cases an indication that there's a problem with your backends, and Varnish cannot retrieve responses. When that happens, you'll often get some info from the log in the FetchError line, as you did here:

> 10 FetchError c no backend connection

No backend connection usually means that health checks are failing -- in your case, they're failing for both backends that are assigned to your director. When health checks fail, Varnish doesn't attempt a fetch at all and sends a 503 response immediately.

When you see that, check the Backend_health entries in the log. They describe the current health state (you'll probably see "Still sick"), and at the end of the log entry they either show the response status that the checks are getting; or if they show no response status, then the checks aren't getting any response. That should help you diagnose the problem.

HTH,
Geoff
--
UPLEX Systemoptimierung
Scheffelstraße 32
22301 Hamburg
http://uplex.de/
Mob: +49-176-63690917

From sm at sami-mantysaari.com Sun Jul 5 13:26:31 2015
From: sm at sami-mantysaari.com (Sami Mäntysaari)
Date: Sun, 05 Jul 2015 16:26:31 +0300
Subject: Varnish 4.0 & Removing/changing headers
Message-ID: <55993087.30304@sami-mantysaari.com>

Hi,

I would like to hide the fact that my service is using a Varnish cache. How can I do so, as every guide I have found has been for the 3.0 version, but not for 4.0?

sub vcl_deliver {
  remove resp.http.Via;
  remove resp.http.X-Whatever;
  remove resp.http.X-Powered-By;
  remove resp.http.X-Varnish;
  remove resp.http.Age;
  remove resp.http.Server;
  set resp.http.Server = "TFE";
  set resp.http.X-Powered-By = "Curiosity";
}

That for instance gives me an error.
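
For comparison, here is a minimal sketch of the same subroutine in Varnish 4.0 syntax. It assumes the compile error comes only from the `remove` keyword, which Varnish 4 no longer accepts, and swaps in `unset`. The `backend default` block with its host and port is a placeholder added purely so the snippet forms a complete VCL file; it is not taken from the original post.

```vcl
vcl 4.0;

# Placeholder backend (assumption): replace host and port with the real origin.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_deliver {
    # Varnish 4 dropped the `remove` alias; `unset` deletes a header.
    unset resp.http.Via;
    unset resp.http.X-Whatever;
    unset resp.http.X-Varnish;
    unset resp.http.Age;

    # `set` overwrites any existing value, so no prior unset is needed here.
    set resp.http.Server = "TFE";
    set resp.http.X-Powered-By = "Curiosity";
}
```

Removing `Age` also hides it from downstream caches and browsers, which may change how they compute freshness, so it is worth checking whether anything in front of Varnish relies on it.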
From andrew.langhorn at digital.cabinet-office.gov.uk Sun Jul 5 13:35:15 2015
From: andrew.langhorn at digital.cabinet-office.gov.uk (Andrew Langhorn)
Date: Sun, 5 Jul 2015 14:35:15 +0100
Subject: Varnish 4.0 & Removing/changing headers
In-Reply-To: <55993087.30304@sami-mantysaari.com>
References: <55993087.30304@sami-mantysaari.com>
Message-ID:

What error do you get? Removing HTTP response headers is still supported in 4.x as far as I can remember.

From dridi at varni.sh Sun Jul 5 13:55:09 2015
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Sun, 5 Jul 2015 15:55:09 +0200
Subject: Varnish 4.0 & Removing/changing headers
In-Reply-To:
References: <55993087.30304@sami-mantysaari.com>
Message-ID:

Hi,

The remove keyword was an alias to unset and is no longer supported in Varnish 4.

Use "unset resp.http.Via;" instead.

Dridi

From sm at sami-mantysaari.com Sun Jul 5 14:34:29 2015
From: sm at sami-mantysaari.com (Sami Mäntysaari)
Date: Sun, 05 Jul 2015 17:34:29 +0300
Subject: Varnish 4.0 & Removing/changing headers
In-Reply-To:
References: <55993087.30304@sami-mantysaari.com>
Message-ID: <55994075.8030809@sami-mantysaari.com>

Thank you, that fixed it. :)

From bluethundr at gmail.com Sun Jul 5 22:39:39 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Sun, 5 Jul 2015 18:39:39 -0400
Subject: 503 error from varnish
In-Reply-To: <5598FA71.4040209@uplex.de>
References: <5598FA71.4040209@uplex.de>
Message-ID:

Hi Geoff,

Thanks for your input. But I remembered that I had an issue like this a while back. The problem was that the connect timeouts on my backend definitions were set too low. There is a bit of latency between the varnish servers and the web servers.
So I raised the connect timeout from 30s to 60s for each of the back ends. That seems to do the trick! And now it seems that the whole site is caching.

Thanks!
Tim

From topper87 at Safe-mail.net Mon Jul 6 19:35:54 2015
From: topper87 at Safe-mail.net (topper87 at Safe-mail.net)
Date: Mon, 6 Jul 2015 15:35:54 -0400
Subject: Varnish 4 - respecting Cache-Control Headers from backend
Message-ID:

Hi all,

our backend sends response headers like s-maxage. I need to tell Varnish to cache each object for the same value that the backend sends with it. So Varnish should accept the s-maxage from the backend and cache the object without redefining the http.cache.control, because the value of s-maxage could change from site to site. I do not know the site URLs, so I would do it in vcl_backend_response:

if (beresp.http.Cache-Control ~ "no-cache") {
  //set s-maxage from backend ??
}

I'd appreciate any help and support

thanks

From bluethundr at gmail.com Wed Jul 8 16:14:03 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Wed, 8 Jul 2015 12:14:03 -0400
Subject: 503 service unavailable error
Message-ID:

Hi guys,

I'm having an issue where my varnish server will stop working after a while of browsing around the site I'm using it with and throw a 503 service unavailable error.

In my varnish logs I'm getting a 'no backend connection' error:

10 FetchError c no backend connection
10 VCL_call c error deliver
10 VCL_call c deliver deliver
10 TxProtocol c HTTP/1.1
10 TxStatus c 503
10 TxResponse c Service Unavailable
10 TxHeader c Server: Varnish

And if I do a GET on the healthcheck from the command line on the varnish server, I get a 503 response from varnish:

#GET http://wiki.example.com/healthcheck.php
503 Service Unavailable

Error 503 Service Unavailable
Service Unavailable
Guru Meditation:
XID: 2107225059
Varnish cache server

But if I do another GET on the healthcheck file from the varnish server to another apache VHOST on the same server as the wiki site that responds to the IP of the web server instead of the IP for the varnish server, the GET works:

#GET http://ops1.example.com/healthcheck.php
good

So I'm not sure why varnish is having trouble reaching the HC file. The web server is a little far from the varnish server. The varnish machines are in NYC and the web servers are in northern Virginia.

So I tried setting the timeouts in the varnish config to a really high number. And that was working for a while. But today I noticed that it stopped working. I'll have to restart the varnish service and browse the site for a while. Then it'll stop working again and produce the 503 error. It's pretty annoying!

I was wondering if there might be something in my VCL I could tweak to make this work? Or if the fact is that the web servers are simply too far from varnish for this to be practical.

Here's my VCL file. It's pretty basic:

backend web1 {
  .host = "10.10.10.25";
  .port = "80";
  .connect_timeout = 1200s;
  .first_byte_timeout = 1200s;
  .between_bytes_timeout = 1200s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}

backend web2 {
  .host = "10.10.10.26";
  .port = "80";
  .connect_timeout = 1200s;
  .first_byte_timeout = 1200s;
  .between_bytes_timeout = 1200s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}

director www round-robin {
  { .backend = web1; }
  { .backend = web2; }
}

sub vcl_recv {
  if (req.url ~ "&action=submit($|/)") {
    return (pass);
  }

  set req.backend = www;
  return (lookup);
}

sub vcl_fetch {
  set beresp.ttl = 3600s;
  set beresp.grace = 4h;
  return (deliver);
}

sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}

Thanks,
Tim

From japrice at gmail.com Thu Jul 9 01:31:30 2015
From: japrice at gmail.com (Jason Price)
Date: Wed, 8 Jul 2015 21:31:30 -0400
Subject: 503 service unavailable error
In-Reply-To:
References:
Message-ID:

that interval and window on your web server is scary..... what you're saying is 'check each web server every 10 minutes, and only fail it after 3 failures'

next time you see the issue, look at:
varnishadm -n debug.health

I'd be willing to bet that varnish is just failing the backends. Try running the healthcheck manually from the varnish boxes:
curl -H "Host:wiki.example.com" -v "http://10.10.10.26/healthcheck.php"
And see if you're actually getting good healthchecks. If you're not, then you need to look at your backends (specifically healthcheck.php)

From bluethundr at gmail.com Thu Jul 9 03:19:58 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Wed, 8 Jul 2015 23:19:58 -0400
Subject: 503 service unavailable error
In-Reply-To:
References:
Message-ID: > > that interval and window on your web server is scary..... what you're > saying is 'check each web server every 10 minutes, and only fail it > after 3 failures' Hah!! Agreed. I was just trying to rule the connect timeouts out of the picture as to why the failures were happening! I plan to set them to more normal intervals once I'm finished testing and I've been able to get this to work. > > next time you see the issue, look at: > varnishadm -n debug.health Hmm you may have a point as to the back ends. Varnish is indeed seeing them as 'sick' when I encounter the 503 error: [root at varnish1:~] #varnishadm -n varnish1 debug.health Backend web1 is Sick Current states good: 0 threshold: 2 window: 3 Average responsetime of good probes: 0.000000 Oldest Newest ================================================================ ------------------------------------------------------4444444444 Good IPv4 ------------------------------------------------------XXXXXXXXXX Good Xmit ------------------------------------------------------RRRRRRRRRR Good Recv ----------------------------------------------------HH---------- Happy Backend web2 is Sick Current states good: 0 threshold: 2 window: 3 Average responsetime of good probes: 0.000000 Oldest Newest ================================================================ ------------------------------------------------------4444444444 Good IPv4 ------------------------------------------------------XXXXXXXXXX Good Xmit ------------------------------------------------------RRRRRRRRRR Good Recv ----------------------------------------------------HH---------- Happy > > I'd be willing to bet that varnish is just failing the backends. Try > running the healthcheck manually from the varnish boxes: > curl -H "Host:kiki.example.com" -v "http://10.10.10.26/healthcheck.php" > And see if you're actually getting good healthchecks. 
If you're not,
> then you need to look at your backends (specifically healthcheck.php)

But if I perform the curl you're suggesting, I am able to retrieve the healthcheck.php file!!

#curl --user admin:somepass -H "Host:wiki.example.com" -v "http://10.10.10.25/healthcheck.php"
* About to connect() to 52.5.117.61 port 80 (#0)
* Trying 52.5.117.61... connected
* Connected to 52.5.117.61 (52.5.117.61) port 80 (#0)
* Server auth using Basic with user 'admin'
> GET /healthcheck.php HTTP/1.1
> Authorization: Basic SomeBase64Hash==
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Accept: */*
> Host:wiki.example.com
>
< HTTP/1.1 200 OK
< Date: Thu, 09 Jul 2015 02:10:35 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips mod_fcgid/2.3.9 PHP/5.4.42 SVN/1.7.14 mod_wsgi/3.4 Python/2.7.5
< X-Powered-By: PHP/5.4.42
< Content-Length: 5
< Content-Type: text/html; charset=UTF-8
<
good
* Connection #0 to host 52.5.117.61 left intact
* Closing connection #0

But in the curl I just did I was specifying the user auth. Which got me thinking: maybe I'm handling Apache basic auth the wrong way in my VCL file?

To test this idea out, I commented out the basic auth lines in my Apache config. Then I cycled the services on both Apache servers and both varnish servers.
When I ran the test you gave me again, this is the result I got back:

#varnishadm -n varnish1 debug.health
Backend web1 is Healthy
Current states good: 3 threshold: 2 window: 3
Average responsetime of good probes: 0.032781
Oldest Newest
================================================================
---------------------------------------------------------------4 Good IPv4
---------------------------------------------------------------X Good Xmit
---------------------------------------------------------------R Good Recv
-------------------------------------------------------------HHH Happy
Backend web2 is Healthy
Current states good: 3 threshold: 2 window: 3
Average responsetime of good probes: 0.032889
Oldest Newest
================================================================
---------------------------------------------------------------4 Good IPv4
---------------------------------------------------------------X Good Xmit
---------------------------------------------------------------R Good Recv
-------------------------------------------------------------HHH Happy

Everybody's happy again!!

And I tried browsing around the wiki for quite a long time. And there were NO 503 errors the entire time I was using it. Which tells me that I am, indeed, not handling auth correctly in my VCL.

The way I thought I had solved the problem was by adding a .request to the web server definitions that specified the headers to do a GET on the health check:

.request =
"GET /healthcheck.php HTTP/1.1"
"Host: wiki.example.com"
"Connection: close";

The reason I thought this worked was that, after I'd restarted varnish with that change in place, I was able to log into the wiki with basic auth in the web browser. I'd then be able to use it for a while before the back end would come up as 'sick' in varnish again, which would cause the 503 error.
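
For anyone following along: the probe request quoted above carries no credentials, so a backend that requires Basic auth would be expected to answer it with a 401. As a rough sketch only (this is not something posted in the thread), the header line such a probe would need can be built in the shell. The function name basic_auth_header and the user/pass credentials below are placeholders made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the Authorization header for HTTP Basic auth.
# The value is base64("user:password"). printf '%s' is used instead of echo
# so that no trailing newline gets encoded along with the credentials.
basic_auth_header() {
    printf 'Authorization: Basic %s\n' "$(printf '%s' "$1:$2" | base64)"
}

# Placeholder credentials, not the real ones from this setup.
basic_auth_header user pass
```

The printed line could first be tried with curl -H against the backend IP, and only then pasted as an extra quoted string into the probe's .request.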
I then tried following this advice again, which I had also tried earlier without much luck: http://blog.tenya.me/blog/2011/12/14/varnish-http-authentication/ Which tells you to add this section to your VCL file: if (! req.http.Authorization ~ "Basic SomeBase64Hash==") { error 401 "Restricted"; } And then add this sub_vcl section: sub vcl_error { if (obj.status == 401) { set obj.http.Content-Type = "text/html; charset=utf-8"; set obj.http.WWW-Authenticate = "Basic realm=Secured"; synthetic {" Error

401 Unauthorized (varnish)
"};
return (deliver);
}
}

And after restarting varnish again on both nodes, with authentication in place in the VHOST configs on the web servers, I was able to log into the wiki site again and browse around for a while.

But then after some browsing around the back ends would go sick again and you would see the 503:

#varnishadm -n varnish1 debug.health
Backend web1 is Sick
Current states good: 1 threshold: 2 window: 3
Average responsetime of good probes: 0.000000
Oldest Newest
================================================================
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
------------------------------------------------------------HH-- Happy
Backend web2 is Sick
Current states good: 1 threshold: 2 window: 3
Average responsetime of good probes: 0.000000
Oldest Newest
================================================================
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
------------------------------------------------------------HH-- Happy

So SOMETHING must still be off with how I'm handling authentication in my VCL config. The next step I'm thinking of trying involves passing the authentication headers to the .request section of my web server definition. Although I'm not sure if it'll work. I'll let you guys know if it does.

But I'd like to present the current state of my VCL again in case anyone has any insight or knowledge to share that may help.
backend web1 { .host = "10.10.10.25"; .port = "80"; .connect_timeout = 3600s; .first_byte_timeout = 3600s; .between_bytes_timeout = 3600s; .max_connections = 70; .probe = { .request = "GET /healthcheck.php HTTP/1.1" "Host: wiki.example.com" "Connection: close"; .interval = 10m; .timeout = 60s; .window = 3; .threshold = 2; } } backend web2 { .host = "10.10.10.26"; .port = "80"; .connect_timeout = 3600s; .first_byte_timeout = 3600s; .between_bytes_timeout = 3600s; .max_connections = 70; .probe = { .request = "GET /healthcheck.php HTTP/1.1" "Host: wiki.example.com" "Connection: close"; .interval = 10m; .timeout = 60s; .window = 3; .threshold = 2; } } director www round-robin { { .backend = web1; } { .backend = web2; } } sub vcl_recv { if (! req.http.Authorization ~ "Basic Base64Hash==") { error 401 "Restricted"; } if (req.url ~ "&action=submit($|/)") { return (pass); } set req.backend = www; return (lookup); } sub vcl_fetch { set beresp.ttl = 3600s; set beresp.grace = 4h; return (deliver); } sub vcl_error { if (obj.status == 401) { set obj.http.Content-Type = "text/html; charset=utf-8"; set obj.http.WWW-Authenticate = "Basic realm=Secured"; synthetic {" Error

401 Unauthorized (varnish)

"}; return (deliver); } } sub vcl_deliver { if (obj.hits> 0) { set resp.http.X-Cache = "HIT"; } else { set resp.http.X-Cache = "MISS"; } } Once again I genuinely appreciate the help of this list, and hope I haven't worn out my welcome! ;) Thanks, Tim On Wed, Jul 8, 2015 at 9:31 PM, Jason Price wrote: > that interval and window on your web server is scary..... what you're > saying is 'check each web server every 10 minutes, and only fail it > after 3 failures' > > next time you see the issue, look at: > > varnishadm -n debug.health > > I'd be willing to bet that varnish is just failing the backends. Try > running the healthcheck manually from the varnish boxes: > > curl -H "Host:kiki.example.com" -v "http://10.10.10.26/healthcheck.php" > > And see if you're actually getting good healthchecks. If you're not, > then you need to look at your backends (specifically healthcheck.php) > > On Wed, Jul 8, 2015 at 12:14 PM, Tim Dunphy wrote: > > Hi guys, > > > > > > I'm having an issue where my varnish server will stop working after a > while > > of browsing around the site I'm using it with and throw a 503 server > > unavailable error. > > > > In my varnish logs I'm getting a 'no backend connection error': > > > > 10 FetchError c no backend connection > > 10 VCL_call c error deliver > > 10 VCL_call c deliver deliver > > 10 TxProtocol c HTTP/1.1 > > 10 TxStatus c 503 > > 10 TxResponse c Service Unavailable > > 10 TxHeader c Server: Varnish > > > > > > And if I do a GET on the healthcheck from the command line on the varnish > > server, I get a 503 response from varnish: > > > > #GET http://wiki.example.com/healthcheck.php > > > > > > > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> > > > > > > 503 Service Unavailable > > > > > >

> > Error 503 Service Unavailable
> > Service Unavailable
> > Guru Meditation:
> > XID: 2107225059
> > Varnish cache server

> > > > > > > > But if I do another GET on the healthcheck file from the varnish server > to > > another apache VHOST on the same server as the wiki site that responds to > > the IP of the web server instead of the IP for the varnish server, the > GET > > works: > > > > #GET http://ops1.example.com/healthcheck.php > > good > > > > > > So I'm not sure why varnish is having trouble reaching the HC file. The > web > > server is a little far from the varnish server. The varnish machines are > in > > NYC and the web servers are in northern Virginia. > > > > So I tried setting the timeouts in the varnish config to a really high > > number. And that was working for a while. But today I noticed that it > > stopped working. I'll have to restart the varnish service and browse the > > site for a while. Then it'll stop working again and produce the 503 > error. > > It's pretty annoying! > > > > I was wondering if there might be something in my VCL I could tweak to > make > > this work? Or if the fact is that the web servers are simply too far from > > varnish for this to be practical. > > > > Here's my VCL file. 
It's pretty basic: > > > > backend web1 { > > .host = "10.10.10.25"; > > .port = "80"; > > .connect_timeout = 1200s; > > .first_byte_timeout = 1200s; > > .between_bytes_timeout = 1200s; > > .max_connections = 70; > > .probe = { > > .request = > > "GET /healthcheck.php HTTP/1.1" > > "Host: wiki.example.com" > > "Connection: close"; > > .interval = 10m; > > .timeout = 60s; > > .window = 3; > > .threshold = 2; > > } > > } > > > > backend web2 { > > .host = "10.10.10.26"; > > .port = "80"; > > .connect_timeout = 1200s; > > .first_byte_timeout = 1200s; > > .between_bytes_timeout = 1200s; > > .max_connections = 70; > > .probe = { > > .request = > > "GET /healthcheck.php HTTP/1.1" > > "Host: wiki.example.com" > > "Connection: close"; > > .interval = 10m; > > .timeout = 60s; > > .window = 3; > > .threshold = 2; > > } > > } > > > > director www round-robin { > > { .backend = web1; } > > { .backend = web2; } > > } > > > > sub vcl_recv { > > > > if (req.url ~ "&action=submit($|/)") { > > return (pass); > > } > > > > set req.backend = www; > > return (lookup); > > } > > > > sub vcl_fetch { > > set beresp.ttl = 3600s; > > set beresp.grace = 4h; > > return (deliver); > > } > > > > > > sub vcl_deliver { > > if (obj.hits> 0) { > > set resp.http.X-Cache = "HIT"; > > } else { > > set resp.http.X-Cache = "MISS"; > > } > > } > > > > Thanks, > > Tim > > > > > > > > -- > > GPG me!! > > > > gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B > > > > > > _______________________________________________ > > varnish-misc mailing list > > varnish-misc at varnish-cache.org > > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B -------------- next part -------------- An HTML attachment was scrubbed... URL: From japrice at gmail.com Thu Jul 9 13:01:17 2015 From: japrice at gmail.com (Jason Price) Date: Thu, 9 Jul 2015 09:01:17 -0400 Subject: 503 service unavailable error In-Reply-To: References:

Message-ID: You're never specifying any auth in your probe: .probe = { .request = "GET /healthcheck.php HTTP/1.1" "Host: wiki.example.com" "Connection: close"; I don't know the proper way to specify it, but you'll need to play around with curl, wireshark and varnish probes until you get it right. May be easier to test with telnet invocations: telnet 10.10.10.26 80 GET /healthcheck.php HTTP/1.1 Host: wiki.example.com Authorization: Basic ??????????????? Connection: close The above should give you an auth failure request. Twiddle with that until you get a successful authentication request, then translate it into the probe .request format. The link you provided gives you everything else you need. -Jason On Wed, Jul 8, 2015 at 11:19 PM, Tim Dunphy wrote: >> that interval and window on your web server is scary..... what you're >> saying is 'check each web server every 10 minutes, and only fail it >> after 3 failures' > > > Hah!! Agreed. I was just trying to rule the connect timeouts out of the > picture as to why the failures were happening! > I plan to set them to more normal intervals once I'm finished testing and > I've been able to get this to work. > >> >> >> next time you see the issue, look at: >> varnishadm -n debug.health > > > Hmm you may have a point as to the back ends. 
Varnish is indeed seeing them > as 'sick' when I encounter the 503 error: > > > [root at varnish1:~] #varnishadm -n varnish1 debug.health > Backend web1 is Sick > Current states good: 0 threshold: 2 window: 3 > Average responsetime of good probes: 0.000000 > Oldest Newest > ================================================================ > ------------------------------------------------------4444444444 Good IPv4 > ------------------------------------------------------XXXXXXXXXX Good Xmit > ------------------------------------------------------RRRRRRRRRR Good Recv > ----------------------------------------------------HH---------- Happy > Backend web2 is Sick > Current states good: 0 threshold: 2 window: 3 > Average responsetime of good probes: 0.000000 > Oldest Newest > ================================================================ > ------------------------------------------------------4444444444 Good IPv4 > ------------------------------------------------------XXXXXXXXXX Good Xmit > ------------------------------------------------------RRRRRRRRRR Good Recv > ----------------------------------------------------HH---------- Happy > >> >> >> I'd be willing to bet that varnish is just failing the backends. Try >> running the healthcheck manually from the varnish boxes: >> curl -H "Host:kiki.example.com" -v "http://10.10.10.26/healthcheck.php" >> And see if you're actually getting good healthchecks. If you're not, >> then you need to look at your backends (specifically healthcheck.php) > > > But if I perform the curl you're suggesting, I am able to retrieve the > healthcheck.php file!! > > #curl --user admin:somepass -H "Host:wiki.example.com" -v > "http://10.10.10.25/healthcheck.php" > * About to connect() to 52.5.117.61 port 80 (#0) > * Trying 52.5.117.61... 
connected > * Connected to 52.5.117.61 (52.5.117.61) port 80 (#0) > * Server auth using Basic with user 'admin' >> GET /healthcheck.php HTTP/1.1 >> Authorization: Basic SomeBase64Hash== >> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 >> NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2 >> Accept: */* >> Host:wiki.example.com >> > < HTTP/1.1 200 OK > < Date: Thu, 09 Jul 2015 02:10:35 GMT > < Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips mod_fcgid/2.3.9 > PHP/5.4.42 SVN/1.7.14 mod_wsgi/3.4 Python/2.7.5 > < X-Powered-By: PHP/5.4.42 > < Content-Length: 5 > < Content-Type: text/html; charset=UTF-8 > < > good > * Connection #0 to host 52.5.117.61 left intact > * Closing connection #0 > > But in the curl I just did I was specifying the user auth. Which got me to > thinking, maybe I'm handing apache basic auth in the wrong way in my VCL > file? > > To test this idea out, I commented out the basic auth lines in my apache > config. Then cycled the services on both apache servers and both varnish > servers. 
> > When I ran the test you gave me again, this is the result I got back: > > #varnishadm -n varnish1 debug.health > Backend web1 is Healthy > Current states good: 3 threshold: 2 window: 3 > Average responsetime of good probes: 0.032781 > Oldest Newest > ================================================================ > ---------------------------------------------------------------4 Good IPv4 > ---------------------------------------------------------------X Good Xmit > ---------------------------------------------------------------R Good Recv > -------------------------------------------------------------HHH Happy > Backend web2 is Healthy > Current states good: 3 threshold: 2 window: 3 > Average responsetime of good probes: 0.032889 > Oldest Newest > ================================================================ > ---------------------------------------------------------------4 Good IPv4 > ---------------------------------------------------------------X Good Xmit > ---------------------------------------------------------------R Good Recv > -------------------------------------------------------------HHH Happy > > Everbody's happy again!! > > And I tried browsing around the wiki for quite a long time. And there were > NO 503 errors the entire time I was using it. Which tells me that I am, > indeed, not handling auth correctly in my VCL. > > The way I thought I solved the problem was by adding a .request to the web > server definitions that specified the headers to do a GET on the health > check: > > .request = > "GET /healthcheck.php HTTP/1.1" > "Host: wiki.example.com" > "Connection: close"; > > The reason I thought this worked was because, after I'd restarted varnish > with that change in place I was able to log into the wiki with basic auth in > the web browser. And then I'd be able to use it for a while before the > back-end would come up as 'sick' in varnish again which would cause the 503 > error. 
> > I then tried following this advice again, which I had also tried earlier > without much luck: > > http://blog.tenya.me/blog/2011/12/14/varnish-http-authentication/ > > Which tells you to add this section to your VCL file: > > if (! req.http.Authorization ~ "Basic SomeBase64Hash==") > { > error 401 "Restricted"; > } > > And then add this sub_vcl section: > > sub vcl_error { > > if (obj.status == 401) { > set obj.http.Content-Type = "text/html; charset=utf-8"; > set obj.http.WWW-Authenticate = "Basic realm=Secured"; > synthetic {" > > "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> > > > > Error > > >

401 Unauthorized (varnish)

> > "}; > return (deliver); > } > } > > And after restarting varnish again on both nodes, with authentication in > place in the VHOST configs on the web servers I was able to log into the > wiki site again and browse around for a while. > > But then after some browsing around the back ends would go sick again and > you would see the 503: > > #varnishadm -n varnish1 debug.health > Backend web1 is Sick > Current states good: 1 threshold: 2 window: 3 > Average responsetime of good probes: 0.000000 > Oldest Newest > ================================================================ > --------------------------------------------------------------44 Good IPv4 > --------------------------------------------------------------XX Good Xmit > --------------------------------------------------------------RR Good Recv > ------------------------------------------------------------HH-- Happy > Backend web2 is Sick > Current states good: 1 threshold: 2 window: 3 > Average responsetime of good probes: 0.000000 > Oldest Newest > ================================================================ > --------------------------------------------------------------44 Good IPv4 > --------------------------------------------------------------XX Good Xmit > --------------------------------------------------------------RR Good Recv > ------------------------------------------------------------HH-- Happy > > So SOMETHING must still be off with how I'm handling authentication in my > VCL config. The next step I'm thinking of trying involves passing the > authentication headers to the .request section of my web server definition. > Although I'm not sure if it'll work. I'll let you guys know if it does. > > But I'd like to present the current state of my VLC again in case anyone has > any insight or knowledge to share that may help. 
> > backend web1 { > > .host = "10.10.10.25"; > > .port = "80"; > > .connect_timeout = 3600s; > > .first_byte_timeout = 3600s; > > .between_bytes_timeout = 3600s; > > .max_connections = 70; > > .probe = { > > .request = > > "GET /healthcheck.php HTTP/1.1" > > "Host: wiki.example.com" > > "Connection: close"; > > .interval = 10m; > > .timeout = 60s; > > .window = 3; > > .threshold = 2; > > } > > } > > backend web2 { > > .host = "10.10.10.26"; > > .port = "80"; > > .connect_timeout = 3600s; > > .first_byte_timeout = 3600s; > > .between_bytes_timeout = 3600s; > > .max_connections = 70; > > .probe = { > > .request = > > "GET /healthcheck.php HTTP/1.1" > > "Host: wiki.example.com" > > "Connection: close"; > > .interval = 10m; > > .timeout = 60s; > > .window = 3; > > .threshold = 2; > > } > > } > > director www round-robin { > > { .backend = web1; } > > { .backend = web2; } > > } > > sub vcl_recv { > > if (! req.http.Authorization ~ "Basic Base64Hash==") > > { > > error 401 "Restricted"; > > } > > if (req.url ~ "&action=submit($|/)") { > > return (pass); > > } > > set req.backend = www; > > return (lookup); > > } > > sub vcl_fetch { > > set beresp.ttl = 3600s; > > set beresp.grace = 4h; > > return (deliver); > > } > > sub vcl_error { > > if (obj.status == 401) { > > set obj.http.Content-Type = "text/html; charset=utf-8"; > > set obj.http.WWW-Authenticate = "Basic realm=Secured"; > > synthetic {" > > > "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> > > > > > > > Error > > > > > >

401 Unauthorized (varnish)

> > > > "}; > > return (deliver); > > } > > } > > sub vcl_deliver { > > if (obj.hits> 0) { > > set resp.http.X-Cache = "HIT"; > > } else { > > set resp.http.X-Cache = "MISS"; > > } > > } > > Once again I genuinely appreciate the help of this list, and hope I haven't > worn out my welcome! ;) > > Thanks, > Tim > > > On Wed, Jul 8, 2015 at 9:31 PM, Jason Price wrote: >> >> that interval and window on your web server is scary..... what you're >> saying is 'check each web server every 10 minutes, and only fail it >> after 3 failures' >> >> next time you see the issue, look at: >> >> varnishadm -n debug.health >> >> I'd be willing to bet that varnish is just failing the backends. Try >> running the healthcheck manually from the varnish boxes: >> >> curl -H "Host:kiki.example.com" -v "http://10.10.10.26/healthcheck.php" >> >> And see if you're actually getting good healthchecks. If you're not, >> then you need to look at your backends (specifically healthcheck.php) >> >> On Wed, Jul 8, 2015 at 12:14 PM, Tim Dunphy wrote: >> > Hi guys, >> > >> > >> > I'm having an issue where my varnish server will stop working after a >> > while >> > of browsing around the site I'm using it with and throw a 503 server >> > unavailable error. >> > >> > In my varnish logs I'm getting a 'no backend connection error': >> > >> > 10 FetchError c no backend connection >> > 10 VCL_call c error deliver >> > 10 VCL_call c deliver deliver >> > 10 TxProtocol c HTTP/1.1 >> > 10 TxStatus c 503 >> > 10 TxResponse c Service Unavailable >> > 10 TxHeader c Server: Varnish >> > >> > >> > And if I do a GET on the healthcheck from the command line on the >> > varnish >> > server, I get a 503 response from varnish: >> > >> > #GET http://wiki.example.com/healthcheck.php >> > >> > >> > > > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> >> > >> > >> > 503 Service Unavailable >> > >> > >> >

>> > Error 503 Service Unavailable
>> > Service Unavailable
>> > Guru Meditation:
>> > XID: 2107225059
>> > Varnish cache server

>> > >> > >> > >> > But if I do another GET on the healthcheck file from the varnish server >> > to >> > another apache VHOST on the same server as the wiki site that responds >> > to >> > the IP of the web server instead of the IP for the varnish server, the >> > GET >> > works: >> > >> > #GET http://ops1.example.com/healthcheck.php >> > good >> > >> > >> > So I'm not sure why varnish is having trouble reaching the HC file. The >> > web >> > server is a little far from the varnish server. The varnish machines are >> > in >> > NYC and the web servers are in northern Virginia. >> > >> > So I tried setting the timeouts in the varnish config to a really high >> > number. And that was working for a while. But today I noticed that it >> > stopped working. I'll have to restart the varnish service and browse the >> > site for a while. Then it'll stop working again and produce the 503 >> > error. >> > It's pretty annoying! >> > >> > I was wondering if there might be something in my VCL I could tweak to >> > make >> > this work? Or if the fact is that the web servers are simply too far >> > from >> > varnish for this to be practical. >> > >> > Here's my VCL file. 
It's pretty basic: >> > >> > backend web1 { >> > .host = "10.10.10.25"; >> > .port = "80"; >> > .connect_timeout = 1200s; >> > .first_byte_timeout = 1200s; >> > .between_bytes_timeout = 1200s; >> > .max_connections = 70; >> > .probe = { >> > .request = >> > "GET /healthcheck.php HTTP/1.1" >> > "Host: wiki.example.com" >> > "Connection: close"; >> > .interval = 10m; >> > .timeout = 60s; >> > .window = 3; >> > .threshold = 2; >> > } >> > } >> > >> > backend web2 { >> > .host = "10.10.10.26"; >> > .port = "80"; >> > .connect_timeout = 1200s; >> > .first_byte_timeout = 1200s; >> > .between_bytes_timeout = 1200s; >> > .max_connections = 70; >> > .probe = { >> > .request = >> > "GET /healthcheck.php HTTP/1.1" >> > "Host: wiki.example.com" >> > "Connection: close"; >> > .interval = 10m; >> > .timeout = 60s; >> > .window = 3; >> > .threshold = 2; >> > } >> > } >> > >> > director www round-robin { >> > { .backend = web1; } >> > { .backend = web2; } >> > } >> > >> > sub vcl_recv { >> > >> > if (req.url ~ "&action=submit($|/)") { >> > return (pass); >> > } >> > >> > set req.backend = www; >> > return (lookup); >> > } >> > >> > sub vcl_fetch { >> > set beresp.ttl = 3600s; >> > set beresp.grace = 4h; >> > return (deliver); >> > } >> > >> > >> > sub vcl_deliver { >> > if (obj.hits> 0) { >> > set resp.http.X-Cache = "HIT"; >> > } else { >> > set resp.http.X-Cache = "MISS"; >> > } >> > } >> > >> > Thanks, >> > Tim >> > >> > >> > >> > -- >> > GPG me!! >> > >> > gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B >> > >> > >> > _______________________________________________ >> > varnish-misc mailing list >> > varnish-misc at varnish-cache.org >> > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > > -- > GPG me!! 
> > gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B > From bluethundr at gmail.com Thu Jul 9 20:50:36 2015 From: bluethundr at gmail.com (Tim Dunphy) Date: Thu, 9 Jul 2015 16:50:36 -0400 Subject: 503 service unavailable error In-Reply-To: References:

Message-ID: Hey Jason, You're never specifying any auth in your probe: > > .probe = { > .request = > "GET /healthcheck.php HTTP/1.1" > "Host: wiki.example.com" > "Connection: close"; Yeah, understood. Actually when I mailed yesterday that was something I was planning on doing. Not something I had done. But sometimes I'm not very clear in explaining things. At any rate, I was able to get the Basic Auth headers into my .probe .request and the good news is it seems to have worked!! This was the change that I made: .request = "GET /healthcheck.php HTTP/1.1" "Host: wiki.jokefire.com" "Authorization: Basic myBase64Hash==" "Connection: close"; So after that change was made and I cycled varnish I literally NEVER got the 503 error again. Just an occasional 504 that went away on a page reload. But nothing serious. And even that could probably be done away with some VCL tweaking. So after that success I made some modifications to the VCL to make it work a little better with mediawiki. Here's the current state of my VCL for anyone that's interested. 
backend web1 {
    .host = "10.10.10.25";
    .port = "80";
    .connect_timeout = 3600s;
    .first_byte_timeout = 3600s;
    .between_bytes_timeout = 3600s;
    .max_connections = 70;
    .probe = {
        .request =
            "GET /healthcheck.php HTTP/1.1"
            "Host: wiki.example.com"
            "Authorization: Basic Base64Hash=="
            "Connection: close";
        .interval = 10m;
        .timeout = 60s;
        .window = 3;
        .threshold = 2;
    }
}

backend web2 {
    .host = "10.10.10.26";
    .port = "80";
    .connect_timeout = 3600s;
    .first_byte_timeout = 3600s;
    .between_bytes_timeout = 3600s;
    .max_connections = 70;
    .probe = {
        .request =
            "GET /healthcheck.php HTTP/1.1"
            "Host: wiki.example.com"
            "Authorization: Basic Base64Hash=="
            "Connection: close";
        .interval = 10m;
        .timeout = 60s;
        .window = 3;
        .threshold = 2;
    }
}

director www round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

# access control list for "purge": open to only localhost and other local nodes
acl purge {
    "127.0.0.1";
}

sub vcl_recv {
    set req.http.host = regsub(req.http.host, "^www\.wiki\.example\.com$", "wiki.example.com");

    # Serve objects up to 2 minutes past their expiry if the backend
    # is slow to respond.
    set req.grace = 120s;

    if (! req.http.Authorization ~ "Basic myBase64Hash==") {
        error 401 "Restricted";
    }

    if (req.url ~ "&action=submit($|/)") {
        return (pass);
    }

    if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }

    set req.backend = www;

    # This uses the ACL action called "purge". Basically if a request to
    # PURGE the cache comes from anywhere other than localhost, ignore it.
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }

    if (req.request != "GET" &&
        req.request != "HEAD" &&
        req.request != "PUT" &&
        req.request != "POST" &&
        req.request != "TRACE" &&
        req.request != "OPTIONS" &&
        req.request != "DELETE") {
        return (pipe);
    } /* Non-RFC2616 or CONNECT which is weird. */

    # Pass anything other than GET and HEAD directly.
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    } /* We only deal with GET and HEAD by default */

    # Pass requests from logged-in users directly.
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    } /* Not cacheable by default */

    # Pass any requests with the "If-None-Match" header directly.
    if (req.http.If-None-Match) {
        return (pass);
    }

    # normalize Accept-Encoding to reduce vary
    if (req.http.Accept-Encoding) {
        if (req.http.User-Agent ~ "MSIE 6") {
            unset req.http.Accept-Encoding;
        } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            unset req.http.Accept-Encoding;
        }
    }

    return (lookup);
}

sub vcl_pipe {
    # Note that only the first request to the backend will have
    # X-Forwarded-For set. If you use X-Forwarded-For and want to
    # have it set for all requests, make sure to have:
    # set req.http.connection = "close";

    # This is otherwise not necessary if you do not do any request rewriting.
    set req.http.connection = "close";
}

# Called if the cache has a copy of the page.
sub vcl_hit {
    if (req.request == "PURGE") {
        ban_url(req.url);
        error 200 "Purged";
    }

    if (!obj.ttl > 0s) {
        return (pass);
    }
}

# Called if the cache does not have a copy of the page.
sub vcl_miss {
    if (req.request == "PURGE") {
        error 200 "Not in cache";
    }
}

# Called after a document has been successfully retrieved from the backend.
sub vcl_fetch {
    # set minimum timeouts to auto-discard stored objects
    # set beresp.prefetch = -30s;
    set beresp.grace = 120s;

    if (beresp.ttl < 48h) {
        set beresp.ttl = 48h;
    }

    if (!beresp.ttl > 0s) {
        return (hit_for_pass);
    }

    if (beresp.http.Set-Cookie) {
        return (hit_for_pass);
    }

    #if (beresp.http.Cache-Control ~ "(private|no-cache|no-store)")
    #    {return(hit_for_pass);}

    if (req.http.Authorization && !beresp.http.Cache-Control ~ "public") {
        return (hit_for_pass);
    }
}

sub vcl_error {
    if (obj.status == 401) {
        set obj.http.Content-Type = "text/html; charset=utf-8";
        set obj.http.WWW-Authenticate = "Basic realm=Secured";
        synthetic {" Error

401 Unauthorized (varnish)

"}; return (deliver); } } sub vcl_deliver { if (obj.hits> 0) { set resp.http.X-Cache = "HIT"; } else { set resp.http.X-Cache = "MISS"; } } Now, all that's left to do is to set those completely insane timeouts I've been using to try and troubleshoot the problem to something a little more reasonable. Thanks for all the help! Tim On Thu, Jul 9, 2015 at 9:01 AM, Jason Price wrote: > You're never specifying any auth in your probe: > > .probe = { > .request = > "GET /healthcheck.php HTTP/1.1" > "Host: wiki.example.com" > "Connection: close"; > > I don't know the proper way to specify it, but you'll need to play > around with curl, wireshark and varnish probes until you get it right. > > May be easier to test with telnet invocations: > > telnet 10.10.10.26 80 > GET /healthcheck.php HTTP/1.1 > Host: wiki.example.com > Authorization: Basic ??????????????? > Connection: close > > > The above should give you an auth failure request. Twiddle with that > until you get a successful authentication request, then translate it > into the probe .request format. The link you provided gives you > everything else you need. > > -Jason > > On Wed, Jul 8, 2015 at 11:19 PM, Tim Dunphy wrote: > >> that interval and window on your web server is scary..... what you're > >> saying is 'check each web server every 10 minutes, and only fail it > >> after 3 failures' > > > > > > Hah!! Agreed. I was just trying to rule the connect timeouts out of the > > picture as to why the failures were happening! > > I plan to set them to more normal intervals once I'm finished testing and > > I've been able to get this to work. > > > >> > >> > >> next time you see the issue, look at: > >> varnishadm -n debug.health > > > > > > Hmm you may have a point as to the back ends. 
Varnish is indeed seeing > them > > as 'sick' when I encounter the 503 error: > > > > > > [root at varnish1:~] #varnishadm -n varnish1 debug.health > > Backend web1 is Sick > > Current states good: 0 threshold: 2 window: 3 > > Average responsetime of good probes: 0.000000 > > Oldest Newest > > ================================================================ > > ------------------------------------------------------4444444444 Good > IPv4 > > ------------------------------------------------------XXXXXXXXXX Good > Xmit > > ------------------------------------------------------RRRRRRRRRR Good > Recv > > ----------------------------------------------------HH---------- Happy > > Backend web2 is Sick > > Current states good: 0 threshold: 2 window: 3 > > Average responsetime of good probes: 0.000000 > > Oldest Newest > > ================================================================ > > ------------------------------------------------------4444444444 Good > IPv4 > > ------------------------------------------------------XXXXXXXXXX Good > Xmit > > ------------------------------------------------------RRRRRRRRRR Good > Recv > > ----------------------------------------------------HH---------- Happy > > > >> > >> > >> I'd be willing to bet that varnish is just failing the backends. Try > >> running the healthcheck manually from the varnish boxes: > >> curl -H "Host:kiki.example.com" -v "http://10.10.10.26/healthcheck.php" > >> And see if you're actually getting good healthchecks. If you're not, > >> then you need to look at your backends (specifically healthcheck.php) > > > > > > But if I perform the curl you're suggesting, I am able to retrieve the > > healthcheck.php file!! > > > > #curl --user admin:somepass -H "Host:wiki.example.com" -v > > "http://10.10.10.25/healthcheck.php" > > * About to connect() to 52.5.117.61 port 80 (#0) > > * Trying 52.5.117.61... 
connected > > * Connected to 52.5.117.61 (52.5.117.61) port 80 (#0) > > * Server auth using Basic with user 'admin' > >> GET /healthcheck.php HTTP/1.1 > >> Authorization: Basic SomeBase64Hash== > >> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 > >> NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2 > >> Accept: */* > >> Host:wiki.example.com > >> > > < HTTP/1.1 200 OK > > < Date: Thu, 09 Jul 2015 02:10:35 GMT > > < Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips mod_fcgid/2.3.9 > > PHP/5.4.42 SVN/1.7.14 mod_wsgi/3.4 Python/2.7.5 > > < X-Powered-By: PHP/5.4.42 > > < Content-Length: 5 > > < Content-Type: text/html; charset=UTF-8 > > < > > good > > * Connection #0 to host 52.5.117.61 left intact > > * Closing connection #0 > > > > But in the curl I just did I was specifying the user auth. Which got me > to > > thinking, maybe I'm handing apache basic auth in the wrong way in my VCL > > file? > > > > To test this idea out, I commented out the basic auth lines in my apache > > config. Then cycled the services on both apache servers and both varnish > > servers. 
> > > > When I ran the test you gave me again, this is the result I got back: > > > > #varnishadm -n varnish1 debug.health > > Backend web1 is Healthy > > Current states good: 3 threshold: 2 window: 3 > > Average responsetime of good probes: 0.032781 > > Oldest Newest > > ================================================================ > > ---------------------------------------------------------------4 Good > IPv4 > > ---------------------------------------------------------------X Good > Xmit > > ---------------------------------------------------------------R Good > Recv > > -------------------------------------------------------------HHH Happy > > Backend web2 is Healthy > > Current states good: 3 threshold: 2 window: 3 > > Average responsetime of good probes: 0.032889 > > Oldest Newest > > ================================================================ > > ---------------------------------------------------------------4 Good > IPv4 > > ---------------------------------------------------------------X Good > Xmit > > ---------------------------------------------------------------R Good > Recv > > -------------------------------------------------------------HHH Happy > > > > Everbody's happy again!! > > > > And I tried browsing around the wiki for quite a long time. And there > were > > NO 503 errors the entire time I was using it. Which tells me that I am, > > indeed, not handling auth correctly in my VCL. > > > > The way I thought I solved the problem was by adding a .request to the > web > > server definitions that specified the headers to do a GET on the health > > check: > > > > .request = > > "GET /healthcheck.php HTTP/1.1" > > "Host: wiki.example.com" > > "Connection: close"; > > > > The reason I thought this worked was because, after I'd restarted varnish > > with that change in place I was able to log into the wiki with basic > auth in > > the web browser. 
And then I'd be able to use it for a while before the > > back-end would come up as 'sick' in varnish again which would cause the > 503 > > error. > > > > I then tried following this advice again, which I had also tried earlier > > without much luck: > > > > http://blog.tenya.me/blog/2011/12/14/varnish-http-authentication/ > > > > Which tells you to add this section to your VCL file: > > > > if (! req.http.Authorization ~ "Basic SomeBase64Hash==") > > { > > error 401 "Restricted"; > > } > > > > And then add this sub_vcl section: > > > > sub vcl_error { > > > > if (obj.status == 401) { > > set obj.http.Content-Type = "text/html; charset=utf-8"; > > set obj.http.WWW-Authenticate = "Basic realm=Secured"; > > synthetic {" > > > > > "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> > > > > > > > > Error > > > > > >

401 Unauthorized (varnish)

> > > > "}; > > return (deliver); > > } > > } > > > > And after restarting varnish again on both nodes, with authentication in > > place in the VHOST configs on the web servers I was able to log into the > > wiki site again and browse around for a while. > > > > But then after some browsing around the back ends would go sick again and > > you would see the 503: > > > > #varnishadm -n varnish1 debug.health > > Backend web1 is Sick > > Current states good: 1 threshold: 2 window: 3 > > Average responsetime of good probes: 0.000000 > > Oldest Newest > > ================================================================ > > --------------------------------------------------------------44 Good > IPv4 > > --------------------------------------------------------------XX Good > Xmit > > --------------------------------------------------------------RR Good > Recv > > ------------------------------------------------------------HH-- Happy > > Backend web2 is Sick > > Current states good: 1 threshold: 2 window: 3 > > Average responsetime of good probes: 0.000000 > > Oldest Newest > > ================================================================ > > --------------------------------------------------------------44 Good > IPv4 > > --------------------------------------------------------------XX Good > Xmit > > --------------------------------------------------------------RR Good > Recv > > ------------------------------------------------------------HH-- Happy > > > > So SOMETHING must still be off with how I'm handling authentication in my > > VCL config. The next step I'm thinking of trying involves passing the > > authentication headers to the .request section of my web server > definition. > > Although I'm not sure if it'll work. I'll let you guys know if it does. > > > > But I'd like to present the current state of my VLC again in case anyone > has > > any insight or knowledge to share that may help. 
> > > > backend web1 { > > > > .host = "10.10.10.25"; > > > > .port = "80"; > > > > .connect_timeout = 3600s; > > > > .first_byte_timeout = 3600s; > > > > .between_bytes_timeout = 3600s; > > > > .max_connections = 70; > > > > .probe = { > > > > .request = > > > > "GET /healthcheck.php HTTP/1.1" > > > > "Host: wiki.example.com" > > > > "Connection: close"; > > > > .interval = 10m; > > > > .timeout = 60s; > > > > .window = 3; > > > > .threshold = 2; > > > > } > > > > } > > > > backend web2 { > > > > .host = "10.10.10.26"; > > > > .port = "80"; > > > > .connect_timeout = 3600s; > > > > .first_byte_timeout = 3600s; > > > > .between_bytes_timeout = 3600s; > > > > .max_connections = 70; > > > > .probe = { > > > > .request = > > > > "GET /healthcheck.php HTTP/1.1" > > > > "Host: wiki.example.com" > > > > "Connection: close"; > > > > .interval = 10m; > > > > .timeout = 60s; > > > > .window = 3; > > > > .threshold = 2; > > > > } > > > > } > > > > director www round-robin { > > > > { .backend = web1; } > > > > { .backend = web2; } > > > > } > > > > sub vcl_recv { > > > > if (! req.http.Authorization ~ "Basic Base64Hash==") > > > > { > > > > error 401 "Restricted"; > > > > } > > > > if (req.url ~ "&action=submit($|/)") { > > > > return (pass); > > > > } > > > > set req.backend = www; > > > > return (lookup); > > > > } > > > > sub vcl_fetch { > > > > set beresp.ttl = 3600s; > > > > set beresp.grace = 4h; > > > > return (deliver); > > > > } > > > > sub vcl_error { > > > > if (obj.status == 401) { > > > > set obj.http.Content-Type = "text/html; charset=utf-8"; > > > > set obj.http.WWW-Authenticate = "Basic realm=Secured"; > > > > synthetic {" > > > > > > > "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> > > > > > > > > > > > > > > Error > > > > > > > > > > > >

401 Unauthorized (varnish)

> > > > > > > > "}; > > > > return (deliver); > > > > } > > > > } > > > > sub vcl_deliver { > > > > if (obj.hits> 0) { > > > > set resp.http.X-Cache = "HIT"; > > > > } else { > > > > set resp.http.X-Cache = "MISS"; > > > > } > > > > } > > > > Once again I genuinely appreciate the help of this list, and hope I > haven't > > worn out my welcome! ;) > > > > Thanks, > > Tim > > > > > > On Wed, Jul 8, 2015 at 9:31 PM, Jason Price wrote: > >> > >> that interval and window on your web server is scary..... what you're > >> saying is 'check each web server every 10 minutes, and only fail it > >> after 3 failures' > >> > >> next time you see the issue, look at: > >> > >> varnishadm -n debug.health > >> > >> I'd be willing to bet that varnish is just failing the backends. Try > >> running the healthcheck manually from the varnish boxes: > >> > >> curl -H "Host:kiki.example.com" -v "http://10.10.10.26/healthcheck.php" > >> > >> And see if you're actually getting good healthchecks. If you're not, > >> then you need to look at your backends (specifically healthcheck.php) > >> > >> On Wed, Jul 8, 2015 at 12:14 PM, Tim Dunphy > wrote: > >> > Hi guys, > >> > > >> > > >> > I'm having an issue where my varnish server will stop working after a > >> > while > >> > of browsing around the site I'm using it with and throw a 503 server > >> > unavailable error. 
> >> > > >> > In my varnish logs I'm getting a 'no backend connection error': > >> > > >> > 10 FetchError c no backend connection > >> > 10 VCL_call c error deliver > >> > 10 VCL_call c deliver deliver > >> > 10 TxProtocol c HTTP/1.1 > >> > 10 TxStatus c 503 > >> > 10 TxResponse c Service Unavailable > >> > 10 TxHeader c Server: Varnish > >> > > >> > > >> > And if I do a GET on the healthcheck from the command line on the > >> > varnish > >> > server, I get a 503 response from varnish: > >> > > >> > #GET http://wiki.example.com/healthcheck.php > >> > > >> > > >> > >> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> > >> > > >> > > >> > 503 Service Unavailable > >> > > >> > > >> >

> >> > Error 503 Service Unavailable
> >> > Service Unavailable
> >> > Guru Meditation:
> >> > XID: 2107225059
> >> > Varnish cache server

> >> > > >> > > >> > > >> > But if I do another GET on the healthcheck file from the varnish > server > >> > to > >> > another apache VHOST on the same server as the wiki site that responds > >> > to > >> > the IP of the web server instead of the IP for the varnish server, the > >> > GET > >> > works: > >> > > >> > #GET http://ops1.example.com/healthcheck.php > >> > good > >> > > >> > > >> > So I'm not sure why varnish is having trouble reaching the HC file. > The > >> > web > >> > server is a little far from the varnish server. The varnish machines > are > >> > in > >> > NYC and the web servers are in northern Virginia. > >> > > >> > So I tried setting the timeouts in the varnish config to a really high > >> > number. And that was working for a while. But today I noticed that it > >> > stopped working. I'll have to restart the varnish service and browse > the > >> > site for a while. Then it'll stop working again and produce the 503 > >> > error. > >> > It's pretty annoying! > >> > > >> > I was wondering if there might be something in my VCL I could tweak to > >> > make > >> > this work? Or if the fact is that the web servers are simply too far > >> > from > >> > varnish for this to be practical. > >> > > >> > Here's my VCL file. 
It's pretty basic: > >> > > >> > backend web1 { > >> > .host = "10.10.10.25"; > >> > .port = "80"; > >> > .connect_timeout = 1200s; > >> > .first_byte_timeout = 1200s; > >> > .between_bytes_timeout = 1200s; > >> > .max_connections = 70; > >> > .probe = { > >> > .request = > >> > "GET /healthcheck.php HTTP/1.1" > >> > "Host: wiki.example.com" > >> > "Connection: close"; > >> > .interval = 10m; > >> > .timeout = 60s; > >> > .window = 3; > >> > .threshold = 2; > >> > } > >> > } > >> > > >> > backend web2 { > >> > .host = "10.10.10.26"; > >> > .port = "80"; > >> > .connect_timeout = 1200s; > >> > .first_byte_timeout = 1200s; > >> > .between_bytes_timeout = 1200s; > >> > .max_connections = 70; > >> > .probe = { > >> > .request = > >> > "GET /healthcheck.php HTTP/1.1" > >> > "Host: wiki.example.com" > >> > "Connection: close"; > >> > .interval = 10m; > >> > .timeout = 60s; > >> > .window = 3; > >> > .threshold = 2; > >> > } > >> > } > >> > > >> > director www round-robin { > >> > { .backend = web1; } > >> > { .backend = web2; } > >> > } > >> > > >> > sub vcl_recv { > >> > > >> > if (req.url ~ "&action=submit($|/)") { > >> > return (pass); > >> > } > >> > > >> > set req.backend = www; > >> > return (lookup); > >> > } > >> > > >> > sub vcl_fetch { > >> > set beresp.ttl = 3600s; > >> > set beresp.grace = 4h; > >> > return (deliver); > >> > } > >> > > >> > > >> > sub vcl_deliver { > >> > if (obj.hits> 0) { > >> > set resp.http.X-Cache = "HIT"; > >> > } else { > >> > set resp.http.X-Cache = "MISS"; > >> > } > >> > } > >> > > >> > Thanks, > >> > Tim > >> > > >> > > >> > > >> > -- > >> > GPG me!! > >> > > >> > gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B > >> > > >> > > >> > _______________________________________________ > >> > varnish-misc mailing list > >> > varnish-misc at varnish-cache.org > >> > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > > > > > > > > > > -- > > GPG me!! 
> >
> > gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
> >
>

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From excelsio at gmx.com Mon Jul 13 17:31:04 2015
From: excelsio at gmx.com (Michael)
Date: Mon, 13 Jul 2015 19:31:04 +0200
Subject: modify URL on client side
Message-ID: <55A3F5D8.60102@gmx.com>

Hello,

So far I've set up pound with varnish. It's working fine, except for one thing:

* The webserver itself is reachable via www.short-name-of-company.com.
  Several new domains have been added with similar looking names and tld
  endings, e.g. www.full-name-of-company.info
* A client connects with its browser to www.full-name-of-company.info and is
  passed to pound and varnish. Varnish gets the data from the
  "www.short-name-of-company.com" server. Unfortunately the apache server of
  that one is configured to always return "www.short-name-of-company.com" to
  the client, i.e. the client sees "www.short-name-of-company.com" instead of
  "www.full-name-of-company.info".
* The apache configuration of the web server won't be changed.
* So I'm looking for a possibility to change the URL being displayed in the
  client's browser.
* Is there any possibility to change this within varnish?

Best Regards
Michael
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From psihozefir at yahoo.com Wed Jul 15 06:12:30 2015
From: psihozefir at yahoo.com (psihozefir)
Date: Wed, 15 Jul 2015 06:12:30 +0000 (UTC)
Subject: regsub() odd behaviour in varnish-3.0.4-r2 on ubuntu-14.04 LTS
Message-ID: <727596253.1296993.1436940750536.JavaMail.yahoo@mail.yahoo.com>

Hello!
I run a varnish installation and I encountered an odd behaviour when trying
to get a cookie and its value using regsub(); here is the code:

sub vcl_hash {
    [...]
    hash_data(regsub(req.http.cookie, "(LoggedIn=[^;]+;)", "\1"));
    [...]
    return (hash);
}

The LoggedIn cookie takes integer values depending on the user type and there
are multiple user types (or groups). The regex should match only the cookie
name, followed by the equal sign, followed by the user type value and finally
followed by a semicolon, and then return the matched string.

What varnish does instead is return an arbitrarily truncated cookie:

   56 Hash         c ASP.NET_SessionId=piejvbp31o4cbdsbzak55j2c; LoggedIn=0; __utma=203440913.1436939550.; __utmb=203440913.; __utmc=20913; __utmz=203440913..utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)

(Note: I manually truncated/edited the __utm* cookies' values intentionally.)

What am I doing wrong, or what should I know more about regsub() to be able
to return the intended string? I tested the regex on regex101.com and it
works correctly there.

Thank you for your help!
Sorin.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From krishna.ku at flipkart.com Tue Jul 21 09:30:04 2015
From: krishna.ku at flipkart.com (Krishna Kumar (Engineering))
Date: Tue, 21 Jul 2015 15:00:04 +0530
Subject: Performance issue with Varnish
Message-ID:

Hi all,

I am testing using Varnish 4.0.3 in our datacenter. The test setup has 24
servers (48 core, 128GB, Ubuntu 14.10, Intel ixgbe) running HAProxy, with
thousand backends, each of which are VM's created on different servers. The
HAProxy servers all run either HAProxy only, or HAProxy + Varnish only, the
two testing configurations, with no other load.

Model 1 (Direct):  Clients --> [ HAProxy ] --> 2300 Backends
Model 2 (Caching): Clients --> [ HAProxy --> 1 Varnish backend ] --> 2300 Backends

The square brackets above indicate that this item runs on a single baremetal
as described above.
Diagrammatically:

              Clients (1000 VM's)

 ________    ________    ________    ________    (24 baremetals, each
|HAProxy |  |HAProxy |  |HAProxy |  |HAProxy |    of which runs either
|Varnish |  |Varnish |  |Varnish |  |Varnish |    HAProxy or HAProxy+Varnish)
 --------    --------    --------    --------
    |           |           |           |
    v           v           v           v
 B1-B2300    B1-B2300    B1-B2300    B1-B2300    (2300 Backends, VM's
                                                  running nginx)

For the Caching setup, Varnish is configured to run on each baremetal on the
'lo' interface (on the same system).

The test is admittedly primitive, each of the 2300 clients simply gets the
same 128 byte or 128K byte for a 15 minute run, using wrk with "-t100 -c800".
The following table shows the RPS and BW for 128 bytes and 128K bytes, for
both "Direct" and "Caching" scenarios, and the % change:

|------|---------------------------|----------------------|---------------------|
| I/O  |        Req Per Sec        |          BW          |        Load         |
| Size |  Direct    Cache      %   | Direct  Cache    %   | Direct  Cache   %   |
|------|---------------------------|----------------------|---------------------|
| 128  | 5559603  2112749    -62%  |  1770    780   -56%  |  629    197   -69%  |
| 128K |  118294   138990     18%  | 15158  17429    15%  |  143    389   172%  |
|------|---------------------------|----------------------|---------------------|

(hope it is readable)

For small packets, Varnish gets much lower RPS and BW (about half), but also
takes lesser system load (measured by top). For large packets, Varnish gets
about 15-18% increase in RPS and BW, but significantly increased system load.
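The % columns in the table above can be re-derived from the raw Direct and Cache figures. The following is only a minimal Python sketch for checking that arithmetic, not part of the original test harness; the function name `percent_change` and the `results` layout are illustrative assumptions, and the numbers are copied from the table.

```python
def percent_change(direct: float, cache: float) -> int:
    """Relative change of the Cache figure versus the Direct figure,
    rounded to a whole percent."""
    return round((cache - direct) / direct * 100)


# (direct, cache) pairs copied from the table; the dictionary layout is
# purely illustrative.
results = {
    "128": {"rps": (5559603, 2112749), "bw": (1770, 780), "load": (629, 197)},
    "128K": {"rps": (118294, 138990), "bw": (15158, 17429), "load": (143, 389)},
}

for size, metrics in results.items():
    for name, (direct, cache) in metrics.items():
        print(f"{size:>4} {name:<4} {percent_change(direct, cache):+d}%")
```

With these inputs every cell agrees with the table except the 128K requests-per-second change, which works out to roughly 17.5% and therefore rounds to 17% rather than the 18% shown; the table was presumably computed from unrounded measurements.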
Varnishstat from a typical server:

MAIN.cache_hit        73512277     14655.56 Cache hits
MAIN.cache_miss              2         0.00 Cache misses

/etc/default/varnish has:

NFILES=262144
MEMLOCK=82000
DAEMON_OPTS="-a :6081 -T localhost:6082 -f /etc/varnish/default.vcl \
    -p thread_pools=2 -p thread_pool_min=500 -p thread_pool_max=750 \
    -p thread_pool_add_delay=2 -S /etc/varnish/secret -s malloc,64G"

/etc/varnish/default.vcl is created with the 2300 VM ip addresses as:

backend varnish_server_1 {
    .host = "10.32.118.14";
    .port = "80";
    .connect_timeout = 5s;
    .first_byte_timeout = 5s;
    .between_bytes_timeout = 2s;
}
... (and so on till varnish_server_2300)

sub vcl_init {
    new varnish_cluster = directors.round_robin();
    varnish_cluster.add_backend(varnish_server_1);
    varnish_cluster.add_backend(varnish_server_2);
    ...
    varnish_cluster.add_backend(varnish_server_2300);
}

sub vcl_recv {
    set req.backend_hint = varnish_cluster.backend();
}

Could someone suggest any improvement in my setup for better performance?

Thanks,
Krishna Kumar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From made at berlingskemedia.dk Wed Jul 22 10:07:41 2015
From: made at berlingskemedia.dk (Martin Dew-Hattens)
Date: Wed, 22 Jul 2015 12:07:41 +0200
Subject: Varnish puppet problem
Message-ID:

Hello varnishers,

I have a problem trying to apply the varnish puppet module. When I do

puppet apply -v site.pp

I get:

Error: Failed to parse template varnish/vcl/default.erb:
  Filepath: /etc/puppet/modules/varnish/templates/vcl/default.erb
  Line: 31
  Detail: undefined method `map' for "/request":String
 at /etc/puppet/modules/varnish/manifests/instance.pp:399 on node pa-varnish-as.eu-west-1.compute.internal
Error: Failed to parse template varnish/vcl/default.erb:
  Filepath: /etc/puppet/modules/varnish/templates/vcl/default.erb
  Line: 31
  Detail: undefined method `map' for "/request":String
 at /etc/puppet/modules/varnish/manifests/instance.pp:399 on node pa-varnish-as.eu-west-1.compute.internal

Just for testing I have defined

health_check_request => '/request',

but I can see I am getting the error in this area of default.vcl:

probe backend_health_check {
<% if @health_check_request -%>
    .request =
<%= @health_check_request.map { |r| %(#{"\ " * 8}"#{r}")}.join("\n") + ";" %>
<% else -%>

So it's looking for health_check_request.map. What is that?

--
Martin Dew-Hattens
B.Sc, Ph.D
VMware Certified Professional (VCP5-DCV)
Novell Certified Linux Administrator (CLP)
CEH, CHFI, ECSA
Tlf: +45 25 45 65 28
made at berlingskemedia.dk
Berlingske Media A/S
Pilestræde 34
DK1147 Copenhagen K
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mattias at nucleus.be Wed Jul 22 13:03:04 2015
From: mattias at nucleus.be (Mattias Geniar)
Date: Wed, 22 Jul 2015 13:03:04 +0000
Subject: Varnish puppet problem
In-Reply-To:
References:
Message-ID:

> Detail: undefined method `map' for "/request":String

The problem isn't Varnish, it's actually within Puppet/Ruby.

You are passing along a string parameter with the value "/request" and are
using an Array method ".map" on it. The "@health_check_request.map" method is
only valid for Arrays, not Strings.

Assuming you actually meant to pass along a String, the following should work:

<% if @health_check_request -%>
    .request = "<%= @health_check_request %>";
<% else -%>

If not, check your input and make sure you're passing along an Array and not
a String.

Mattias

From madgaikw at cisco.com Thu Jul 23 17:22:18 2015
From: madgaikw at cisco.com (Madhava Gaikwad (madgaikw))
Date: Thu, 23 Jul 2015 17:22:18 +0000
Subject: Partial object caching with varnish
Message-ID: <482801000965C7489DFC3C6412129B1F6B745850@xmb-aln-x10.cisco.com>

Hello,

I am a newbie to varnish. I want to confirm the following two things I am
trying to accomplish.

I understood varnish can do live streaming etc. But can it really just do
partial object caching? Say I send a range request for 10 bytes from a pdf,
and set up the varnish cache to cache everything, will it be able to do it?
What VCL config should I look at for this?

I have a forward proxy which cannot do range request caching. Nor can it do
streaming. So I am thinking of putting varnish in between the client and the
forward proxy. The forward proxy connects to the internet.

I know you may come up and laugh at me first on what I am trying to achieve,
but is it achievable?

Thank you.
Madhava
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From varnish at tengu.ch Fri Jul 24 05:56:45 2015
From: varnish at tengu.ch (Cédric Jeanneret)
Date: Fri, 24 Jul 2015 07:56:45 +0200
Subject: Masquerade backend errors on 4.0.x
Message-ID: <55B1D39D.2050601@tengu.ch>

Hello,

I'm having some trouble trying to masquerade backend errors with varnish
4.0.x.

Let me explain:
queries are sent to some backend server, which might return either a 403
or a 200 HTTP code, both with content (403 will explain "access denied"
with some random string, blah).

My aim is to override the 403 error with "204" (no content) and deliver
an empty body.

Issue so far: I can set the response status to 204, but I'm unable to
deliver an actual empty body, and varnish still sets the Content-Length
header to the backend answer length.

My thought was to put some "return(synth(204))" in vcl_backend_response,
but this subroutine is unable to return "synth"; I tried to mess a bit
in vcl_deliver, but there as well, unable to return "synth".

Of course, "unset *.http.Content-Length" doesn't work.

We really need to return 204 with no content, as:
- an application would display some bad stuff if we don't return 204
- other applications crash weirdly if we do return 204 with a
  Content-Length above 0

I'm pretty sure vcl_deliver should be able to return synth() in order to
allow body/content override, but... it doesn't seem to be the case in
varnish 4.0.x :(.

Any thoughts or advice?

Thanks!

Cheers,

C.

From varnish at tengu.ch Fri Jul 24 06:33:53 2015
From: varnish at tengu.ch (Cédric Jeanneret)
Date: Fri, 24 Jul 2015 08:33:53 +0200
Subject: Masquerade backend errors on 4.0.x
In-Reply-To: <55B1D39D.2050601@tengu.ch>
References: <55B1D39D.2050601@tengu.ch>
Message-ID: <55B1DC51.1060703@tengu.ch>

Hello,

Just found out: the documentation is wrong, vcl_deliver DOES support "synth"
as a return function...
This showed the support:
https://gist.github.com/mjf/ddae14982720f77b665a#file-varnish_cache_subroutines-rst
and a test validated it.

So now I can do a synthetic(""); in vcl_synth if we get a 204 status.

Might be good to update/correct this page:
https://www.varnish-cache.org/docs/4.0/users-guide/vcl-built-in-subs.html#vcl-deliver

Cheers,

C.

On 07/24/2015 07:56 AM, Cédric Jeanneret wrote:
> Hello,
>
> I'm having some troubles trying to masquerade backend error with varnish
> 4.0.x.
>
> Lemme explain:
> queries are sent to some backend server, which might return either 403
> or 200 HTTP code, both with content (403 will explain "access denied"
> with some random string, blah).
>
> My aim is to override the 403 error with "204" (no content) and deliver
> an empty content.
>
> Issue so far: I can set the response status to 204, but I'm unable to
> deliver an actual empty body and varnish still sets the header
> Content-Length to the backend answer length.
>
> My thought were to put some "return(synth(204))" in vcl_backend_response,
> but this subroutine is unable to return "synth"; I tried to mess a bit
> in vcl_deliver, but there as well, unable to return "synth".
>
> Of course, "unset *.http.Content-Length" doesn't work.
>
> We really need to return 204 with no content, as:
> - an application would display some bad stuff if we don't return 204
> - other applications crash weirdly if we do return 204 with a
> Content-Length above 0
>
> I'm pretty sure the vcl_deliver should be able to return synth() in
> order to allow body/content override, but... it doesn't seem to be the
> case in varnish 4.0.x :(.
>
> Any thought or advice?
>
> Thanks!
>
> Cheers,
>
> C.
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>

From bluethundr at gmail.com Tue Jul 28 02:24:21 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Mon, 27 Jul 2015 22:24:21 -0400
Subject: frequent 504's on VCL
Message-ID:

Hey guys,

I'm getting frequent 504 errors in the browser when using even this stripped
down version of my VCL. I'm also seeing the pages load with the css totally
blown and just the text showing up, with no graphics and no formatting. It's
pretty odd how inconsistently this VCL is behaving.

On one load of the page where the graphics are broken I'm seeing this error
in the logs:

   10 TxHeader     c Connection: close
   10 TxHeader     c X-Cache: MISS
   10 Debug        c Write error, retval = -1, len = 602, errno = Connection reset by peer
   10 ReqEnd       c 241437586 1438049428.763690233 1438049435.868994236 0.000110626 7.105183125 0.000120878
   10 StatSess     c 54.86.143.49 48979 7 1 1 0 0 1 602 0
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1438049436 1.0

And on 504 errors I'm seeing this result in the logs:

   10 TxHeader     c Via: 1.1 varnish
   10 TxHeader     c Connection: close
   10 TxHeader     c X-Cache: MISS
   10 Length       c 316
   10 ReqEnd       c 241437672 1438049679.750560999 1438049679.750730276 0.000099182 0.000094652 0.000074625
   10 SessionClose c error
   10 StatSess     c 54.86.143.49 49821 0 1 1 0 0 0 285 316

I'm running 3 back ends using apache 2.4 on CentOS 7. I'm running two Varnish
nodes at version 3.0.5.
All 3 web backends are checking in as healthy:

[root at varnish1:/etc/varnish] #varnishadm -n varnish1 debug.health
Backend web1 is Healthy
Current states good: 3 threshold: 2 window: 3
Average responsetime of good probes: 0.026873
Oldest                                                    Newest
================================================================
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
------------------------------------------------------------HHHH Happy
Backend web2 is Healthy
Current states good: 3 threshold: 2 window: 3
Average responsetime of good probes: 0.029118
Oldest                                                    Newest
================================================================
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
------------------------------------------------------------HHHH Happy
Backend web3 is Healthy
Current states good: 3 threshold: 2 window: 3
Average responsetime of good probes: 0.029101
Oldest                                                    Newest
================================================================
--------------------------------------------------------------44 Good IPv4
--------------------------------------------------------------XX Good Xmit
--------------------------------------------------------------RR Good Recv
------------------------------------------------------------HHHH Happy

And here is the VCL that I am having this trouble with:

[root at varnish1:/etc/varnish] #egrep -v "^#|^$" default.vcl
backend web1 {
  .host = "10.10.10.25";
  .port = "80";
  .connect_timeout = 45s;
  .first_byte_timeout = 45s;
  .between_bytes_timeout = 45s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Authorization: Basic SomeLongBase64Hash="
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}
backend web2 {
  .host = "10.10.10.26";
  .port = "80";
  .connect_timeout = 45s;
  .first_byte_timeout = 45s;
  .between_bytes_timeout = 45s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Authorization: Basic SomeLongBase64Hash="
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}
backend web3 {
  .host = "10.10.10.27";
  .port = "80";
  .connect_timeout = 45s;
  .first_byte_timeout = 45s;
  .between_bytes_timeout = 45s;
  .max_connections = 70;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Authorization: Basic SomeLongBase64Hash="
      "Connection: close";
    .interval = 10m;
    .timeout = 60s;
    .window = 3;
    .threshold = 2;
  }
}
director www round-robin {
  { .backend = web1; }
  { .backend = web2; }
  { .backend = web3; }
}
sub vcl_recv {
  set req.backend = www;
  unset req.http.Cookie;
  if (! req.http.Authorization ~ "Basic SomeLongBase64Hash="
      && ! req.http.Authorization ~ "Basic AnotherLongBase64Hash=" ) {
    error 401 "Restricted";
  }
  if (req.url ~ "&action=submit($|/)") {
    return (pass);
  }
  return (lookup);
}
sub vcl_fetch {
  set beresp.ttl = 3600s;
  set beresp.grace = 4h;
  return (deliver);
}
sub vcl_error {
  if (obj.status == 401) {
    set obj.http.Content-Type = "text/html; charset=utf-8";
    set obj.http.WWW-Authenticate = "Basic realm=Secured";
    synthetic {" Error

401 Unauthorized (varnish)

"};
    return (deliver);
  }
}
sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}

The IPs you see listed above are fake. I'm not really running the web servers on a 10-net.

I'm really looking forward to getting this solved! And I'd appreciate any feedback you may have.

Thanks,
Tim

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

From lkarsten at varnish-software.com Tue Jul 28 11:31:06 2015
From: lkarsten at varnish-software.com (Lasse Karstensen)
Date: Tue, 28 Jul 2015 13:31:06 +0200
Subject: frequent 504's on VCL
In-Reply-To:
References:
Message-ID: <20150728113105.GB21270@immer.varnish-software.com>

On Mon, Jul 27, 2015 at 10:24:21PM -0400, Tim Dunphy wrote:
[cut]
> It's pretty odd how inconsistent this VCL is behaving.

Varnish does not produce 504 responses.

If you see 504, they are coming through from your backend.

> On one load of the page where the graphics are broken I'm seeing this error
> in the logs:
> 10 TxHeader c Connection: close
> 10 TxHeader c X-Cache: MISS
> 10 Debug c Write error, retval = -1, len = 602, errno =
> Connection reset by peer
> 10 ReqEnd c 241437586 1438049428.763690233 1438049435.868994236
> 0.000110626 7.105183125 0.000120878

Client or backend closed the connection (went away) after 7 seconds.

[..]
> And on 504 errors I'm seeing this result in the logs:
> 10 TxHeader c Via: 1.1 varnish
> 10 TxHeader c Connection: close
> 10 TxHeader c X-Cache: MISS
> 10 Length c 316
> 10 ReqEnd c 241437672 1438049679.750560999 1438049679.750730276
> 0.000099182 0.000094652 0.000074625
> 10 SessionClose c error
> 10 StatSess c 54.86.143.49 49821 0 1 1 0 0 0 285 316
> I'm running 3 back ends using apache 2.4 on Centos 7. I'm running two
> Varnish nodes at version 3.0.5.

Please note that Varnish 3 is end of life.

In the 4.0 release timestamp logging is vastly improved.
It would tell you if it was the client or backend that went away above, for example.

Another sweet feature in 4.0 is that just-expired content will be served while a background fetch is initiated.

> backend web1 {
> .host = "10.10.10.25";
> .port = "80";
> .connect_timeout = 45s;
> .first_byte_timeout = 45s;
> .between_bytes_timeout = 45s;
[probe section]
> .timeout = 60s;

These timeouts are way too long. What client sits around for 45s waiting for a web page? Isn't it better to produce clean 503s to the client that can be looked for in varnishlog, rather than not responding?

--
Lasse Karstensen
Varnish Software AS

From bluethundr at gmail.com Tue Jul 28 15:50:38 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Tue, 28 Jul 2015 11:50:38 -0400
Subject: frequent 504's on VCL
In-Reply-To: <20150728113105.GB21270@immer.varnish-software.com>
References: <20150728113105.GB21270@immer.varnish-software.com>
Message-ID:

> > It's pretty odd how inconsistent this VCL is behaving.
>
> Varnish does not produce 504 responses.
>
> If you see 504, they are coming through from your backend.

Yeah that makes sense.

> > On one load of the page where the graphics are broken I'm seeing this
> > error in the logs:
> > 10 TxHeader c Connection: close
> > 10 TxHeader c X-Cache: MISS
> > 10 Debug c Write error, retval = -1, len = 602, errno =
> > Connection reset by peer
> > 10 ReqEnd c 241437586 1438049428.763690233 1438049435.868994236
> > 0.000110626 7.105183125 0.000120878
>
> Client or backend closed the connection (went away) after 7 seconds.
>
> [..]
> > And on 504 errors I'm seeing this result in the logs:
> > 10 TxHeader c Via: 1.1 varnish
> > 10 TxHeader c Connection: close
> > 10 TxHeader c X-Cache: MISS
> > 10 Length c 316
> > 10 ReqEnd c 241437672 1438049679.750560999 1438049679.750730276
> > 0.000099182 0.000094652 0.000074625
> > 10 SessionClose c error
> > 10 StatSess c 54.86.143.49 49821 0 1 1 0 0 0 285 316
> > I'm running 3 back ends using apache 2.4 on Centos 7. I'm running two
> > Varnish nodes at version 3.0.5.

When I'm tailing the logs for both apache and varnish at the same time, this is what I see happening in both logs when the 504 errors occur:

Varnish:

10 Debug c Write error, retval = -1, len = 613, errno = Connection reset by peer
10 ReqEnd c 822463677 1438097028.730667830 1438097034.047527790 0.000082970 5.316770077 0.000089884
10 StatSess c 54.86.143.49 42935 5 1 1 0 0 1 613 0
0 CLI - Rd ping

Apache:

[Tue Jul 28 15:16:11.614501 2015] [authz_core:debug] [pid 6407] mod_authz_core.c(809): [client 162.243.86.41:55114] AH01626: authorization result of : denied (no authenticated user yet)
[Tue Jul 28 15:16:11.614763 2015] [authz_core:debug] [pid 6407] mod_authz_core.c(809): [client 162.243.86.41:55114] AH01626: authorization result of Require valid-user : granted
[Tue Jul 28 15:16:11.614767 2015] [authz_core:debug] [pid 6407] mod_authz_core.c(809): [client 162.243.86.41:55114] AH01626: authorization result of : granted
[Tue Jul 28 15:16:11.614841 2015] [authz_core:debug] [pid 6407] mod_authz_core.c(809): [client 162.243.86.41:55114] AH01626: authorization result of Require valid-user : denied (no authenticated user yet)

As you can see from my VCL, I am performing some apache basic authentication and then passing it through to the back end. My health check also passes authentication headers to the health check file (healthcheck.php).

Now I could be wrong. But what I think is happening is that varnish is passing the request to one back end and authenticating, and then sends another request to a different host without authentication being passed to it. If my theory is correct, this could be fixed by adding session persistence to my varnish VCL.

How can I add sticky sessions to varnish? I think that might do the trick.

Thanks
Tim

> Please note that Varnish 3 is end of life.
>
> In the 4.0 release timestamp logging is vastly improved. It would tell
> you if it was the client or backend that went away above, for example.
>
> Another sweet feature in 4.0 is that just-expired content will be served
> while a background fetch is initiated.
>
> > backend web1 {
> > .host = "10.10.10.25";
> > .port = "80";
> > .connect_timeout = 45s;
> > .first_byte_timeout = 45s;
> > .between_bytes_timeout = 45s;
> [probe section]
> > .timeout = 60s;
>
> These timeouts are way too long. What client sits around for 45s waiting
> for a web page? Isn't it better to produce clean 503s to the client that
> can be looked for in varnishlog, rather than not responding?

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

From bluethundr at gmail.com Tue Jul 28 17:49:30 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Tue, 28 Jul 2015 13:49:30 -0400
Subject: vmod basic auth on varnish 4
Message-ID:

Guys,

I've just upgraded my varnish setup from version 3 to version 4. I've successfully ported almost my entire default.vcl file to the new syntax.

But I need to use Apache basic auth. The new version requires you to install a module for this. But my installation was done via rpm/yum. All the instructions I found on how to install this module tell you to install from source.
Which I'd rather avoid. How can I add this module if I installed varnish through yum?

Thank you,
Tim

Sent from my iPhone

From t.honacker at googlemail.com Wed Jul 29 08:37:11 2015
From: t.honacker at googlemail.com (Tobias Honacker)
Date: Wed, 29 Jul 2015 10:37:11 +0200
Subject: malloc and caching misses
Message-ID:

Hi guys,

first of all, we are using malloc 16G and varnish is currently using > 25G of our memory, so the VM begins to swap. The n_lru_nuked value rises quickly.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
65265 nobody 20 0 27.3g 25g 81m S 12.3 82.2 463:09.42 varnishd

root /usr/sbin/varnishd -P /var/run/varnish.pid -a 127.0.0.1:8080 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -S /etc/varnish/secret -s malloc,16G
nobody /usr/sbin/varnishd -P /var/run/varnish.pid -a 127.0.0.1:8080 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -S /etc/varnish/secret -s malloc,16G

Further, we cache our main site "/". The first varnish proxy logs many cache misses on "/", the second varnish proxy does not.

I ran the following command to monitor this behavior for ~20 minutes:

varnishlog -g request -q "RespHeader eq 'X-Cache: MISS'" -i ReqUrl |grep ReqURL | awk '{print $3}'

The output shows that varnish logs 432 cache misses on "/", with beresp.ttl = 300s on that site.

Both configs (varnish 1 and varnish 2) are the same; we use puppet to publish the configs.

Env:
varnishd (varnish-4.0.2 revision bfe7cd1)
Copyright (c) 2006 Verdens Gang AS
Copyright (c) 2006-2014 Varnish Software AS
Red Hat Enterprise Linux Server release 6.6 (Santiago)
User -> 2 keepalives -> 2 varnishes -> backend (tomcat)

VCL snippet:

sub vcl_recv:
if( req.url ~ "^/$") {
  return (hash);
}
return (pass);

sub vcl_backend_response:
if ((beresp.status == 200) && (beresp.http.Cache-Control ~ "max-age")) {
  return (deliver);
}

Thanks,
Tobias

From lkarsten at varnish-software.com Wed Jul 29 11:42:41 2015
From: lkarsten at varnish-software.com (Lasse Karstensen)
Date: Wed, 29 Jul 2015 13:42:41 +0200
Subject: vmod basic auth on varnish 4
In-Reply-To:
References:
Message-ID: <20150729114240.GD21270@immer.varnish-software.com>

On Tue, Jul 28, 2015 at 01:49:30PM -0400, Tim Dunphy wrote:
> I've just upgraded my varnish setup from version 3 to version 4. I've successfully ported almost my entire default.vcl file to the new syntax.
[cut]
> But I need to use Apache basic auth. The new version requires you to install a module for this.

I'm not familiar with any change in this regard between 3.0 and 4.0. Basic auth is just a static header.

Set bereq.http.Authorization and you should be good to go?

--
Lasse Karstensen
Varnish Software AS

From lkarsten at varnish-software.com Wed Jul 29 11:48:58 2015
From: lkarsten at varnish-software.com (Lasse Karstensen)
Date: Wed, 29 Jul 2015 13:48:58 +0200
Subject: malloc and caching misses
In-Reply-To:
References:
Message-ID: <20150729114857.GE21270@immer.varnish-software.com>

On Wed, Jul 29, 2015 at 10:37:11AM +0200, Tobias Honacker wrote:
> first of all we are using malloc 16G and varnish currently using > 25G of
> our memory so the vm begin to swap. n_lru_nuked value raise up quickly.

For reducing the resident memory size, try reducing fetch_chunksize to 8KB. If that doesn't bring the memory consumption down, reduce the malloc size.

Since you're doing a lot of return(pass), there will be a lot of transient memory usage for buffering the objects as they are passed through Varnish.

Note that there is a concurrency related panic in 4.0.2 when running with a bit of load. You should upgrade to 4.0.3.
--
Lasse Karstensen
Varnish Software AS

From bluethundr at gmail.com Wed Jul 29 14:51:58 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Wed, 29 Jul 2015 10:51:58 -0400
Subject: vmod basic auth on varnish 4
In-Reply-To: <20150729114240.GD21270@immer.varnish-software.com>
References: <20150729114240.GD21270@immer.varnish-software.com>
Message-ID:

Hi Lasse,

I was able to work out how to do this. All I had to do was download the varnish 4.0.3 source code and run autogen.sh, configure and make, but not make install. Then I went to the vmod basic auth directory I had downloaded and expanded earlier, and went through autogen.sh, configure, make && make install there, while pointing to my vmods directory in the configure script.

Works perfectly! And that is apparently the way to do it. However, it would be a lot easier if I could have done this through a yum install. Apparently this is not possible currently.

This vmod was appealing because with it there is no need to enter the user name and password of the auth user as a base64 string. You simply point it to a file you create with htpasswd somewhere on your file system. It's a little better that way, I think.

Thanks
Tim

On Wed, Jul 29, 2015 at 7:42 AM, Lasse Karstensen <lkarsten at varnish-software.com> wrote:
> On Tue, Jul 28, 2015 at 01:49:30PM -0400, Tim Dunphy wrote:
> > I've just upgraded my varnish setup from version 3 to version 4. I've
> > successfully ported almost my entire default.vcl file to the new syntax.
> [cut]
> > But I need to use Apache basic auth. The new version requires you to
> > install a module for this.
>
> I'm not familiar with any change in this regard between 3.0 and 4.0. Basic
> auth is just a static header.
> Set bereq.http.Authorization and you should be good to go?
>
> --
> Lasse Karstensen
> Varnish Software AS

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

From bluethundr at gmail.com Thu Jul 30 02:18:02 2015
From: bluethundr at gmail.com (Tim Dunphy)
Date: Wed, 29 Jul 2015 22:18:02 -0400
Subject: 504 errors with basic auth in varnish 4 vcl
Message-ID:

Hey all,

I'm having trouble getting basic auth to work under varnish 4. I'm setting it up in front of a mediawiki site.

If I hit the page from a web browser I get a 504 "The server didn't respond in time" error.

This doesn't happen when basic auth is not enabled in either the apache config or the varnish config. Which makes me think I may be handling basic auth wrong somewhere in my setup.

I am able to curl the health check file through varnish. I'm doing this on the varnish server itself:

#time curl --user admin http://wiki.example.com/healthcheck.php
Enter host password for user 'admin':
good

real 0m3.080s
user 0m0.003s
sys 0m0.004s

The health check file contains only the word 'good'.

On the web server, the healthcheck.php file is in the doc root of the wiki site and is readable by the apache user:

#ls -l /var/www/jf/wiki/healthcheck.php
-rw-r--r--. 1 apache ftpgroup 5 Jul 17 00:42 /var/www/jf/wiki/healthcheck.php

I've set up a no-auth exception in the apache vhost for the site:

ServerName wiki.example.com
ServerAlias www.wiki.example.com
Options -Indexes +FollowSymlinks
LogLevel debug
ErrorLog logs/wiki-error.log
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/wiki-access_log common
DocumentRoot /var/www/jf/wiki
SetEnvIf Request_URI ^/healthcheck.php noauth=1
Options -Indexes
AuthType Basic
AuthName "JF Wiki Page"
AuthUserFile /etc/httpd/auth
Require valid-user
#Require all granted
Allow from env=noauth
Options -Indexes

On the varnish end I installed the 'basicauth' vmod and imported it. Then I set it up in the VCL. Here's how my VCL is looking:

#egrep -v '#|^$' default.vcl
vcl 4.0;
import std;
import directors;
import basicauth;
backend web1 {
  .host = "10.10.10.25"; # <-- not a real IP
  .port = "80";
  .connect_timeout = 45s;
  .first_byte_timeout = 45s;
  .between_bytes_timeout = 45s;
  .max_connections = 800;
  .probe = {
    .request =
      "GET /healthcheck.php HTTP/1.1"
      "Host: wiki.example.com"
      "Authorization: Basic LongBasicAuthBase64Hash=="
      "Connection: close";
    .timeout = 10s;
    .interval = 1s;
    .window = 15;
    .threshold = 8;
  }
}
sub vcl_init {
  new wiki = directors.round_robin();
  wiki.add_backend(web1);
}
sub vcl_recv {
  set req.backend_hint = wiki.backend();
  if (!basicauth.match("/etc/httpd/auth", req.http.Authorization)) {
    return(synth(401, "Authentication required"));
  }
}
sub vcl_backend_response {
}
sub vcl_deliver {
}
sub vcl_synth {
  if (resp.status == 401) {
    set resp.http.WWW-Authenticate = "Basic";
  }
}

You can see in my VCL that I'm attempting to pass basic auth headers to the healthcheck .probe.

In varnishlog, when I'm getting the 504 errors in the browser, I'm seeing the following:

-   Timestamp      Process: 1438220128.357217 5.381197 0.000029
-   RespHeader     Transfer-Encoding: chunked
-   Debug          "RES_MODE 8"
-   RespHeader     Connection: close
-   RespHeader     Accept-Ranges: bytes
-   Debug          "Write error, retval = -1, len = 14553, errno = Connection reset by peer"
-   Timestamp      Resp: 1438220128.357317 5.381297 0.000101
-   Debug          "XXX REF 1"
-   ReqAcct        506 0 506 0 0 0
-   End

And in the apache error log for the site I'm seeing this authorization error that corresponds with the time that I'm getting the 504 error:

[Thu Jul 30 01:37:43.197847 2015] [authz_core:debug] [pid 29441] mod_authz_core.c(809): [client 10.10.10.19:47588] AH01626: authorization result of Require valid-user : denied (no authenticated user yet)

I'm hoping to get some suggestions that will get this to work!

Thanks
Tim

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
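
A side note on the probe definitions in this thread: the string that follows "Basic" in the probe's Authorization header is not a hash. For HTTP Basic authentication it is the base64 encoding of "user:password". The Python sketch below shows one way to build (and decode) such a value for a .probe request. The user "admin" is taken from the curl call in the message above; the password "s3cret" and both function names are made-up placeholders, so treat this as an illustration rather than anything the posters actually ran.

```python
import base64
from typing import Tuple


def basic_auth_value(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    # Basic credentials are base64("user:password"): an encoding, not a hash.
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(value: str) -> Tuple[str, str]:
    """Recover (user, password) from a 'Basic <base64>' header value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic" or not token:
        raise ValueError("not a Basic Authorization value")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first ':' only; the password itself may contain colons.
    user, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("decoded credentials lack the ':' separator")
    return user, password


if __name__ == "__main__":
    # "admin" comes from the thread; "s3cret" is a placeholder password.
    value = basic_auth_value("admin", "s3cret")
    # Prints the quoted header line to paste into the probe's .request block.
    print(f'"Authorization: {value}"')
```

Because the encoding is reversible, anyone who can read the VCL file can recover the password from the probe header, which is worth keeping in mind when deciding between a hard-coded header and an htpasswd-based check.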