From cloude at instructables.com Wed Apr 1 23:11:47 2009 From: cloude at instructables.com (Cloude Porteus) Date: Wed, 1 Apr 2009 16:11:47 -0700 Subject: Debugging / nuked objects spike Message-ID: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> We've been running varnish in production for about a week and I've been noticing that things aren't quite right, but it's been hard to figure out what. Most of the time Varnish is running with a 98% hit ratio and all is fine. We have been running for a few days with about 250k documents in the cache. I just noticed that the number of documents in the cache dropped from ~140k -> ~30k and the LRU Nuked Objects increased by 100k. I assume we're hitting our storage limit, which is currently set to 10gb. We had it set at 50gb before, but we were still having similar problems. I noticed last night there was a couple of hours where it looked like the hit ratio was close to zero, but then it went back to normal. Any ideas what would cause Varnish to nuke ~100k objects all at once? I've gone over all the performance tuning info and we've tried to implement most of the suggestions. I'm just not sure which direction to start tuning further. Thanks for any suggestions. 
Here's our current default.vcl: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Configuration file for varnish NFILES=131072 MEMLOCK=82000 VARNISH_VCL_CONF=/etc/varnish/instructables.vcl VARNISH_LISTEN_ADDRESS= VARNISH_LISTEN_PORT=80 VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1 VARNISH_ADMIN_LISTEN_PORT=82 VARNISH_MIN_THREADS=400 VARNISH_MAX_THREADS=1000 VARNISH_THREAD_TIMEOUT=60 VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin VARNISH_STORAGE_SIZE=10G VARNISH_STORAGE="file,${VARNISH_STORAGE_FILE},${VARNISH_STORAGE_SIZE}" VARNISH_TTL=1800 DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \ -f ${VARNISH_VCL_CONF} \ -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \ -t ${VARNISH_TTL} \ -w ${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},${VARNISH_THREAD_TIMEOUT} \ -u varnish -g varnish \ -s ${VARNISH_STORAGE} \ -p obj_workspace=4096 \ -p sess_workspace=262144 \ -p lru_interval=3600 -p listen_depth=8192 \ -p log_hashstring=off \ -p sess_timeout=10 \ -p shm_workspace=32768 \ -p ping_interval=1 \ -p thread_pools=4 \ -p thread_pool_min=100 \ -p srcaddr_ttl=0 \ -p esi_syntax=1 " thanks, cloude -- VP of Product Development Instructables.com http://www.instructables.com/member/lebowski From jna at twitter.com Wed Apr 1 23:23:13 2009 From: jna at twitter.com (John Adams) Date: Wed, 1 Apr 2009 16:23:13 -0700 Subject: Debugging / nuked objects spike In-Reply-To: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> References: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> Message-ID: <0EB3DE15-CDDA-4AF6-8FB6-233716A68683@twitter.com> Are you sure it's the same child process running? If the child dies randomly (SEGV, etc -- check your syslogs) it might be restarting on you. If you're using lots of regexps you may have to increase sess_workspace. Look for those errors in the logs. 
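A concrete way to check for the child restarts John mentions (a sketch only: the syslog lines below are fabricated samples standing in for /var/log/messages, and the grep pattern is illustrative, not copied from a real varnishd):

```shell
# Fabricated sample syslog lines; count child deaths, since a
# restarting child empties the cache and would explain the object
# count collapsing from ~140k to ~30k.
restarts=$(printf '%s\n' \
  'Apr  1 16:05:12 cache varnishd[1234]: Child (5678) died signal=11' \
  'Apr  1 16:05:13 cache varnishd[1234]: Child (9012) started' |
  grep -c 'Child .* died')
echo "child restarts seen: $restarts"
```

If restarts do show up, sess_workspace can also be raised from the management port without editing the start script (assuming the 2.0 CLI's param.set command, e.g. `varnishadm -T 127.0.0.1:82 param.set sess_workspace 524288`).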
-j On Apr 1, 2009, at 4:11 PM, Cloude Porteus wrote: > We've been running varnish in production for about a week and I've > been noticing that things aren't quite right, but it's been hard to > figure out what. Most of the time Varnish is running with a 98% hit > ratio and all is fine. We have been running for a few days with about > 250k documents in the cache. > > I just noticed that the number of documents in the cache dropped from > ~140k -> ~30k and the LRU Nuked Objects increased by 100k. I assume > we're hitting our storage limit, which is currently set to 10gb. We > had it set at 50gb before, but we were still having similar problems. > I noticed last night there was a couple of hours where it looked like > the hit ratio was close to zero, but then it went back to normal. > > Any ideas what would cause Varnish to nuke ~100k objects all at once? > I've gone over all the performance tuning info and we've tried to > implement most of the suggestions. I'm just not sure which direction > to start tuning further. > > Thanks for any suggestions. 
Here's our current default.vcl: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > # Configuration file for varnish > NFILES=131072 > MEMLOCK=82000 > VARNISH_VCL_CONF=/etc/varnish/instructables.vcl > VARNISH_LISTEN_ADDRESS= > VARNISH_LISTEN_PORT=80 > VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1 > VARNISH_ADMIN_LISTEN_PORT=82 > VARNISH_MIN_THREADS=400 > VARNISH_MAX_THREADS=1000 > VARNISH_THREAD_TIMEOUT=60 > VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin > VARNISH_STORAGE_SIZE=10G > VARNISH_STORAGE="file,${VARNISH_STORAGE_FILE},${VARNISH_STORAGE_SIZE}" > VARNISH_TTL=1800 > > DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \ > -f ${VARNISH_VCL_CONF} \ > -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:$ > {VARNISH_ADMIN_LISTEN_PORT} \ > -t ${VARNISH_TTL} \ > -w > ${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},$ > {VARNISH_THREAD_TIMEOUT} > \ > -u varnish -g varnish \ > -s ${VARNISH_STORAGE} \ > -p obj_workspace=4096 \ > -p sess_workspace=262144 \ > -p lru_interval=3600 > -p listen_depth=8192 \ > -p log_hashstring=off \ > -p sess_timeout=10 \ > -p shm_workspace=32768 \ > -p ping_interval=1 \ > -p thread_pools=4 \ > -p thread_pool_min=100 \ > -p srcaddr_ttl=0 \ > -p esi_syntax=1 " > > thanks, > cloude > -- > VP of Product Development > Instructables.com > > http://www.instructables.com/member/lebowski > _______________________________________________ > varnish-dev mailing list > varnish-dev at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-dev --- John Adams Twitter Operations jna at twitter.com http://twitter.com/netik From des at des.no Thu Apr 2 00:02:29 2009 From: des at des.no (=?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?=) Date: Thu, 02 Apr 2009 02:02:29 +0200 Subject: Debugging / nuked objects spike In-Reply-To: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> (Cloude Porteus's message of "Wed, 1 Apr 2009 16:11:47 -0700") References: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> Message-ID: <86myazapnu.fsf@ds4.des.no> 
Cloude Porteus writes: > Any ideas what would cause Varnish to nuke ~100k objects all at once? Just a guess: these objects are your hot set, so they're all loaded within the first seconds of operation, and they all have the same expiry time, so they all expire within a few seconds of each other. DES -- Dag-Erling Smørgrav - des at des.no From sky at crucially.net Thu Apr 2 00:51:31 2009 From: sky at crucially.net (Artur Bergman) Date: Wed, 1 Apr 2009 17:51:31 -0700 Subject: Debugging / nuked objects spike In-Reply-To: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> References: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> Message-ID: <9F5D9B7B-A6D7-4E99-9A70-58CA866FB1C0@crucially.net> On Apr 1, 2009, at 4:11 PM, Cloude Porteus wrote: > > I just noticed that the number of documents in the cache dropped from > ~140k -> ~30k and the LRU Nuked Objects increased by 100k. I assume > we're hitting our storage limit, which is currently set to 10gb. We > had it set at 50gb before, but we were still having similar problems. > I noticed last night there was a couple of hours where it looked like > the hit ratio was close to zero, but then it went back to normal. > > Any ideas what would cause Varnish to nuke ~100k objects all at once? > I've gone over all the performance tuning info and we've tried to > implement most of the suggestions. I'm just not sure which direction > to start tuning further. There are problems with the fragmentation of the store. Try using malloc and see if the problem goes away. (We see this regularly) Artur -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sky at crucially.net Thu Apr 2 00:51:46 2009 From: sky at crucially.net (Artur Bergman) Date: Wed, 1 Apr 2009 17:51:46 -0700 Subject: Debugging / nuked objects spike In-Reply-To: <86myazapnu.fsf@ds4.des.no> References: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> <86myazapnu.fsf@ds4.des.no> Message-ID: <7157980A-735A-4674-923C-8DCB6F27F369@crucially.net> They would expire, not nuke then :) Cheers Artur On Apr 1, 2009, at 5:02 PM, Dag-Erling Smørgrav wrote: > Cloude Porteus writes: >> Any ideas what would cause Varnish to nuke ~100k objects all at once? > > Just a guess: these objects are your hot set, so they're all loaded > within the first seconds of operation, and they all have the same > expiry > time, so they all expire within a few seconds of each other. > > DES > -- > Dag-Erling Smørgrav - des at des.no > _______________________________________________ > varnish-dev mailing list > varnish-dev at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-dev From on at cs.ait.ac.th Thu Apr 2 04:51:48 2009 From: on at cs.ait.ac.th (Olivier Nicole) Date: Thu, 2 Apr 2009 11:51:48 +0700 (ICT) Subject: ./configure error for sys/mount.h Message-ID: <200904020451.n324pmxt058307@banyan.cs.ait.ac.th> Hi, ./configure reports the following error: checking sys/mount.h usability... no checking sys/mount.h presence... yes configure: WARNING: sys/mount.h: present but cannot be compiled configure: WARNING: sys/mount.h: check for missing prerequisite headers?
configure: WARNING: sys/mount.h: see the Autoconf documentation configure: WARNING: sys/mount.h: section "Present But Cannot Be Compiled" configure: WARNING: sys/mount.h: proceeding with the preprocessor's result configure: WARNING: sys/mount.h: in the future, the compiler will take precedence configure: WARNING: ## --------------------------------------------- ## configure: WARNING: ## Report this to varnish-dev at projects.linpro.no ## configure: WARNING: ## --------------------------------------------- ## As requested, I report. This happens on 2.0.3 and 2.0.4, I think I tracked it down to sys/param.h missing in conftest.c. I am not sure it has an impact on the compilation/execution of Varnish. Best regards, Olivier From cloude at instructables.com Thu Apr 2 18:06:00 2009 From: cloude at instructables.com (Cloude Porteus) Date: Thu, 2 Apr 2009 11:06:00 -0700 Subject: Debugging / nuked objects spike In-Reply-To: <9F5D9B7B-A6D7-4E99-9A70-58CA866FB1C0@crucially.net> References: <4a05e1020904011611h6800c49xed759f3d4146a39c@mail.gmail.com> <9F5D9B7B-A6D7-4E99-9A70-58CA866FB1C0@crucially.net> Message-ID: <4a05e1020904021106m18ee75ffjc8e27d10b1649d59@mail.gmail.com> Artur, So far so good switching to malloc. The system load is also down to .03 from an average of .45 when I was using the file storage option. Thanks for the help! -cloude On Wed, Apr 1, 2009 at 5:51 PM, Artur Bergman wrote: > > On Apr 1, 2009, at 4:11 PM, Cloude Porteus wrote: > > > I just noticed that the number of documents in the cache dropped from > ~140k -> ~30k and the LRU Nuked Objects increased by 100k. I assume > we're hitting our storage limit, which is currently set to 10gb. We > had it set at 50gb before, but we were still having similar problems. > I noticed last night there was a couple of hours where it looked like > the hit ratio was close to zero, but then it went back to normal. > > Any ideas what would cause Varnish to nuke ~100k objects all at once? 
> I've gone over all the performance tuning info and we've tried to > implement most of the suggestions. I'm just not sure which direction > to start tuning further. > > > There are problems with the fragmentation of the store. Try using malloc > and see if the problem goes away. (We see this regularly) > > Artur > > -- VP of Product Development Instructables.com http://www.instructables.com/member/lebowski -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.halff at gmail.com Fri Apr 3 12:49:09 2009 From: rob.halff at gmail.com (Rob Halff) Date: Fri, 3 Apr 2009 14:49:09 +0200 Subject: Virtualhost logging for varnishncsa Message-ID: Hi, I've changed the varnishncsa source code to support virtualhost logging. I know in the TODO of varnishncsa there is a future wish for "Log in any format one wants", but I can imagine that would need a total rewrite and takes some time. So in the meantime I have a request to add the virtualhost logging as a commandline option. I've added a -v flag enabling virtualhost style logging. In this case the logformat looks like: $ varnishncsa -v www.test.nl 111.222.333.44 - - [03/Apr/2009:11:41:57 +0200] "GET http://www.test.nl/favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; GTB5)" Notice the 'www.test.nl' being logged. This is the equivalent of this kind of apache logging: http://httpd.apache.org/docs/2.0/vhosts/mass.html#simple : LogFormat "%V %h %l %u %t \"%r\" %s %b" vcommon http://httpd.apache.org/docs/2.0/vhosts/mass.html#simple.rewrite : LogFormat "%{Host}i %h %l %u %t \"%r\" %s %b" vcommon So this adds the Host part to the normal kind of logging ncsa is doing. Given that this is a very common way to log virtual hosts I would say the -v option is not just some hack to suit my own needs, I think it is useful for others also. Without this I am not able to use awstats the way we used it when we were not using varnish.
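With the host in the first field (apache's vcommon layout), splitting or summarizing the log per virtual host for awstats becomes a one-liner. A minimal sketch, with invented sample lines inlined in place of a real `varnishncsa -v` log:

```shell
# Invented sample lines in the vcommon layout produced by the -v patch;
# count requests per virtual host ($1) with awk.
per_host=$(printf '%s\n' \
  'www.test.nl 111.222.333.44 - - [03/Apr/2009:11:41:57 +0200] "GET /favicon.ico HTTP/1.1" 404 209' \
  'www.test.nl 111.222.333.44 - - [03/Apr/2009:11:42:01 +0200] "GET / HTTP/1.1" 200 5120' \
  'www.other.nl 10.0.0.1 - - [03/Apr/2009:11:42:02 +0200] "GET / HTTP/1.1" 200 1024' |
  awk '{ n[$1]++ } END { for (h in n) print h, n[h] }' | sort)
echo "$per_host"
```

The same awk, with `print > ($1 ".log")` in place of the counter, writes one file per host the way Apache's split-logfile does.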
Attached you will find the diff against the current trunk. Greetings, Rob Halff -------------- next part -------------- A non-text attachment was scrubbed... Name: varnishncsa_virtual_host_patch.diff Type: application/octet-stream Size: 1120 bytes Desc: not available URL: From des at des.no Fri Apr 3 21:42:25 2009 From: des at des.no (=?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?=) Date: Fri, 03 Apr 2009 23:42:25 +0200 Subject: ./configure error for sys/mount.h In-Reply-To: <200904020451.n324pmxt058307@banyan.cs.ait.ac.th> (Olivier Nicole's message of "Thu, 2 Apr 2009 11:51:48 +0700 (ICT)") References: <200904020451.n324pmxt058307@banyan.cs.ait.ac.th> Message-ID: <861vs9ifcu.fsf@ds4.des.no> Olivier Nicole writes: > configure: WARNING: sys/mount.h: present but cannot be compiled Fixed, thanks. DES -- Dag-Erling Smørgrav - des at des.no From dome at tel.co.th Tue Apr 7 08:36:42 2009 From: dome at tel.co.th (Dome Charoenyost) Date: Tue, 7 Apr 2009 15:36:42 +0700 Subject: How to check maximum Request /s Message-ID: <8ccbff060904070136v445f2affuc01f70a96ebe14f@mail.gmail.com> Dear All, I try varnishstat to check maximum request / s but not found. Where to get it? Best regards. Dome C. From rob.halff at gmail.com Tue Apr 7 08:50:34 2009 From: rob.halff at gmail.com (Rob Halff) Date: Tue, 7 Apr 2009 10:50:34 +0200 Subject: Virtualhost logging for varnishncsa Message-ID: Here is a new diff, the http:// part also needs to be omitted to do the correct kind of logging. Can I draw the conclusion, from the overwhelming response, that nobody is really interested in this patch? :-) -------------- next part -------------- A non-text attachment was scrubbed... Name: varnishncsa_virtual_host_patch.diff Type: application/octet-stream Size: 1617 bytes Desc: not available URL: From tical.net at gmail.com Tue Apr 7 14:14:16 2009 From: tical.net at gmail.com (Ray Barnes) Date: Tue, 7 Apr 2009 10:14:16 -0400 Subject: Varnish 2.0.3 stops accepting connections - fixed?
Performance revisited Message-ID: Hi all. I have already seen Eden Li's patch, apparently committed to 2.0.4, which fixes the problem of varnish not re-checking to see if file descriptors are available again to service connections - at least that's my extremely lay understanding, being mostly a non-programmer. Further to Eden Li's post to this list which says "We're getting around it now by setting the max open file limit and listen_depth appropriately so that varnish never gets to this point, but it'd be nice if this was fixed in case we ever accidentally get here again." - I'm wondering if someone can critique my config. I've observed several instances where Varnish would do exactly what Eden describes - stop listening to requests on port 80. A 'telnet ip 80' would simply freeze indefinitely and not connect. The child process was running, etc. So I'm going to assume for the moment that the bugfix will be the fix for this issue, as I have not been able to duplicate it under lab testing, but only under live load conditions. Here is the way we call varnishd: #!/bin/bash ulimit -n 131072 ulimit -l 82000 /usr/sbin/varnishd -a x.x.x.x:80 -b x.x.x.x:80 -T x.x.x.x:6083 \ -t 60 -w150,2000,60 -u varnish -g varnish -p obj_workspace=4096 -p sess_workspace=262144 -p listen_depth=8192 \ -p shm_workspace=29000 -p thread_pools=24 -p thread_pool_min=8 -p ping_interval=1 -p srcaddr_ttl=0 -s malloc,60M This configuration was a hack between John Adams' config from a post from February with the subject "Is anyone using ESI with a lot of traffic?", and the Fedora startup script for varnish in /etc/init.d - platform is Linux 2.6 (Fedora 10 and RHEL). cat /proc/sys/fs/file-max says 65535 - I set this value on the fly without rebooting yet.
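One thing that stands out in the numbers above (a sketch; the two values are copied from this message, not measured live): the start script asks for `ulimit -n 131072` per process, while /proc/sys/fs/file-max reports 65535 for the whole kernel, so the system-wide ceiling is below what varnishd is being told it may use:

```shell
# Values quoted from the post: kernel-wide descriptor ceiling vs. the
# per-process limit the start script requests.
file_max=65535      # cat /proc/sys/fs/file-max, as reported
ulimit_n=131072     # ulimit -n in the start script
if [ "$ulimit_n" -gt "$file_max" ]; then
  status="fs.file-max ($file_max) is below ulimit -n ($ulimit_n): raise fs.file-max"
else
  status="ok"
fi
echo "$status"
```

Raising the kernel ceiling (e.g. `sysctl -w fs.file-max=200000`, or the equivalent echo into /proc) removes one way for accepts to stall even before the 2.0.4 fix lands.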
This behavior I described, where port 80 will stop taking connections, is also present when I call varnishd using simply 'varnishd -a x.x.x.x:80 -b x.x.x.x:80 -T x.x.x.x:6083' with no ulimit commands, no additional arguments, and as far as I can remember, default FDs in /proc/sys/fs/file-max. Before I "chase my tail" any further, can anyone recommend any improvements to the above config? Also, is there any particular reason, given the above config, why 'ab' (the apache benchmarking utility) would fail intermittently to connect 1000 concurrent sessions to varnish? I found that when I used John Adams' default of 400 initial minimum threads, the daemon would do unpredictable things and not let me run 'ab' consistently without refusing the connections - any idea why? Thanks in advance, both for any replies, and to everyone who has contributed to the project. -Ray From vijayaraghavan.subramaniam at wipro.com Thu Apr 9 15:33:15 2009 From: vijayaraghavan.subramaniam at wipro.com (vijayaraghavan.subramaniam at wipro.com) Date: Thu, 9 Apr 2009 21:03:15 +0530 Subject: Logging and Statistics Message-ID: Hi, I'm using varnish server. I want to write varnishncsa log information into database, what is best practices & Is there any varnish API available to write into database? Thanks, --Raghavan. Please do not print this email unless it is absolutely necessary. The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. 
The company accepts no liability for any damage caused by any virus transmitted by this email. www.wipro.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From cloude at instructables.com Thu Apr 9 19:02:25 2009 From: cloude at instructables.com (Cloude Porteus) Date: Thu, 9 Apr 2009 12:02:25 -0700 Subject: High Server Load Averages? Message-ID: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> Has anyone experienced very high server load averages? We're running varnish on a dual core with 8gb of ram. It runs okay for a day or two and then I start seeing load averages in 6-10 range for an hour or so, drops down to 2-3, then goes back up. This starts to happen once we have more items in the cache than our physical memory. Maybe increasing our lru_interval will help? It's currently set to 3600. Right now we're running with a 50gb file storage option. There are 270k objects in the cache, 70gb virtual memory, 6.2gb of res memory used, 11gb of data on disk in the file storage. We have a 98% hit ratio. We followed Artur's advice about setting a tmpfs and creating an ext2 partition for our file storage. I also tried running with malloc as our storage type, but I had to set it at a little less than half of our physical ram in order for it to work well after the cache got full. I don't understand why the virtual memory is double when I am running in malloc mode. I was running it with 5gb and the virtual memory was about 10-12gb and once it got full it started using the swap memory. Thanks for any help/insight. best, cloude -- VP of Product Development Instructables.com http://www.instructables.com/member/lebowski -------------- next part -------------- An HTML attachment was scrubbed... URL: From sky at crucially.net Thu Apr 9 19:18:22 2009 From: sky at crucially.net (Artur Bergman) Date: Thu, 9 Apr 2009 12:18:22 -0700 Subject: High Server Load Averages? 
In-Reply-To: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> References: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> Message-ID: <871677B9-7E39-4246-BE5E-422E4E593151@crucially.net> For the file storage or for the shmlog? When do you start nuking/expiring from disk? I suspect the load goes up when you run out of storage space? Cheers Artur On Apr 9, 2009, at 12:02 PM, Cloude Porteus wrote: > Has anyone experienced very high server load averages? We're running > varnish on a dual core with 8gb of ram. It runs okay for a day or > two and then I start seeing load averages in 6-10 range for an hour > or so, drops down to 2-3, then goes back up. > > This starts to happen once we have more items in the cache than our > physical memory. Maybe increasing our lru_interval will help? It's > currently set to 3600. > > Right now we're running with a 50gb file storage option. There are > 270k objects in the cache, 70gb virtual memory, 6.2gb of res memory > used, 11gb of data on disk in the file storage. We have a 98% hit > ratio. > > We followed Artur's advice about setting a tmpfs and creating an > ext2 partition for our file storage. > > I also tried running with malloc as our storage type, but I had to > set it at a little less than half of our physical ram in order for > it to work well after the cache got full. I don't understand why the > virtual memory is double when I am running in malloc mode. I was > running it with 5gb and the virtual memory was about 10-12gb and > once it got full it started using the swap memory. > > Thanks for any help/insight. > > best, > cloude > -- > VP of Product Development > Instructables.com > > http://www.instructables.com/member/lebowski > _______________________________________________ > varnish-dev mailing list > varnish-dev at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cloude at instructables.com Thu Apr 9 19:27:55 2009 From: cloude at instructables.com (Cloude Porteus) Date: Thu, 9 Apr 2009 12:27:55 -0700 Subject: High Server Load Averages? In-Reply-To: <871677B9-7E39-4246-BE5E-422E4E593151@crucially.net> References: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> <871677B9-7E39-4246-BE5E-422E4E593151@crucially.net> Message-ID: <4a05e1020904091227x519affe6r1a9540b5a6dde1b9@mail.gmail.com> Varnishstat doesn't list any nuked objects and file storage and shmlog look like they have plenty of space: df -h ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Filesystem Size Used Avail Use% Mounted on tmpfs 150M 81M 70M 54% /usr/local/var/varnish /dev/sdc1 74G 11G 61G 16% /var/lib/varnish top ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ top - 12:26:33 up 164 days, 22:21, 1 user, load average: 2.60, 3.26, 3.75 Tasks: 67 total, 1 running, 66 sleeping, 0 stopped, 0 zombie Cpu(s): 0.7%us, 0.3%sy, 0.0%ni, 97.0%id, 0.7%wa, 0.3%hi, 1.0%si, 0.0%st Mem: 8183492k total, 7763100k used, 420392k free, 13424k buffers Swap: 3148720k total, 56636k used, 3092084k free, 7317692k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 7441 varnish 15 0 70.0g 6.4g 6.1g S 2 82.5 56:33.31 varnishd Varnishstat: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Hitrate ratio: 8 8 8 Hitrate avg: 0.9782 0.9782 0.9782 36494404 219.98 160.57 Client connections accepted 36494486 220.98 160.57 Client requests received 35028477 212.98 154.12 Cache hits 474091 4.00 2.09 Cache hits for pass 988013 6.00 4.35 Cache misses 1465955 10.00 6.45 Backend connections success 9 0.00 0.00 Backend connections failures 994 . . N struct sess_mem 11 . . N struct sess 274047 . . N struct object 252063 . . N struct objecthead 609018 . . N struct smf 28720 . . N small free smf 2 . . N large free smf 2 . . N struct vbe_conn 901 . . N struct bereq 2000 . . N worker threads 2000 0.00 0.01 N worker threads created 143 0.00 0.00 N overflowed work requests 1 . . N backends 672670 . . N expired objects 3514467 . . 
N LRU moved objects 49 0.00 0.00 HTTP header overflows 32124238 206.98 141.34 Objects sent with write 36494396 224.98 160.57 Total Sessions 36494484 224.98 160.57 Total Requests 783 0.00 0.00 Total pipe 518770 4.00 2.28 Total pass 1464570 10.00 6.44 Total fetch 14559014884 93563.69 64058.18 Total header bytes 168823109304 489874.04 742804.45 Total body bytes 36494387 224.98 160.57 Session Closed 203 0.00 0.00 Session herd 1736767745 10880.80 7641.60 SHM records 148079555 908.90 651.53 SHM writes 15088 0.00 0.07 SHM flushes due to overflow 10494 0.00 0.05 SHM MTX contention 687 0.00 0.00 SHM cycles through buffer 2988576 21.00 13.15 allocator requests 580296 . . outstanding allocations 8916353024 . . bytes allocated 44770738176 . . bytes free 656 0.00 0.00 SMS allocator requests 303864 . . SMS bytes allocated 303864 . . SMS bytes freed 1465172 10.00 6.45 Backend requests made On Thu, Apr 9, 2009 at 12:18 PM, Artur Bergman wrote: > For the file storage or for the shmlog? > > When do you start nuking/expiring from disk? I suspect the load goes up > when you run out of storage space? > > Cheers > Artur > > > On Apr 9, 2009, at 12:02 PM, Cloude Porteus wrote: > > Has anyone experienced very high server load averages? We're running > varnish on a dual core with 8gb of ram. It runs okay for a day or two and > then I start seeing load averages in 6-10 range for an hour or so, drops > down to 2-3, then goes back up. > > This starts to happen once we have more items in the cache than our > physical memory. Maybe increasing our lru_interval will help? It's currently > set to 3600. > > Right now we're running with a 50gb file storage option. There are 270k > objects in the cache, 70gb virtual memory, 6.2gb of res memory used, 11gb of > data on disk in the file storage. We have a 98% hit ratio. > > We followed Artur's advice about setting a tmpfs and creating an ext2 > partition for our file storage. 
> > I also tried running with malloc as our storage type, but I had to set it > at a little less than half of our physical ram in order for it to work well > after the cache got full. I don't understand why the virtual memory is > double when I am running in malloc mode. I was running it with 5gb and the > virtual memory was about 10-12gb and once it got full it started using the > swap memory. > > Thanks for any help/insight. > > best, > cloude > -- > VP of Product Development > Instructables.com > > http://www.instructables.com/member/lebowski > _______________________________________________ > varnish-dev mailing list > varnish-dev at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-dev > > > -- VP of Product Development Instructables.com http://www.instructables.com/member/lebowski -------------- next part -------------- An HTML attachment was scrubbed... URL: From sky at crucially.net Thu Apr 9 20:43:13 2009 From: sky at crucially.net (Artur Bergman) Date: Thu, 9 Apr 2009 13:43:13 -0700 Subject: High Server Load Averages? In-Reply-To: <4a05e1020904091227x519affe6r1a9540b5a6dde1b9@mail.gmail.com> References: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> <871677B9-7E39-4246-BE5E-422E4E593151@crucially.net> <4a05e1020904091227x519affe6r1a9540b5a6dde1b9@mail.gmail.com> Message-ID: <477CEE15-7F81-4E12-8110-55D1534D3C4B@crucially.net> What is your iopressure? 
iostat -k -x 5 or something like that artur On Apr 9, 2009, at 12:27 PM, Cloude Porteus wrote: > Varnishstat doesn't list any nuked objects and file storage and > shmlog look like they have plenty of space: > > df -h > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Filesystem Size Used Avail Use% Mounted on > tmpfs 150M 81M 70M 54% /usr/local/var/varnish > /dev/sdc1 74G 11G 61G 16% /var/lib/varnish > > top > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > top - 12:26:33 up 164 days, 22:21, 1 user, load average: 2.60, > 3.26, 3.75 > Tasks: 67 total, 1 running, 66 sleeping, 0 stopped, 0 zombie > Cpu(s): 0.7%us, 0.3%sy, 0.0%ni, 97.0%id, 0.7%wa, 0.3%hi, > 1.0%si, 0.0%st > Mem: 8183492k total, 7763100k used, 420392k free, 13424k > buffers > Swap: 3148720k total, 56636k used, 3092084k free, 7317692k > cached > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 7441 varnish 15 0 70.0g 6.4g 6.1g S 2 82.5 56:33.31 varnishd > > > Varnishstat: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Hitrate ratio: 8 8 8 > Hitrate avg: 0.9782 0.9782 0.9782 > > 36494404 219.98 160.57 Client connections accepted > 36494486 220.98 160.57 Client requests received > 35028477 212.98 154.12 Cache hits > 474091 4.00 2.09 Cache hits for pass > 988013 6.00 4.35 Cache misses > 1465955 10.00 6.45 Backend connections success > 9 0.00 0.00 Backend connections failures > 994 . . N struct sess_mem > 11 . . N struct sess > 274047 . . N struct object > 252063 . . N struct objecthead > 609018 . . N struct smf > 28720 . . N small free smf > 2 . . N large free smf > 2 . . N struct vbe_conn > 901 . . N struct bereq > 2000 . . N worker threads > 2000 0.00 0.01 N worker threads created > 143 0.00 0.00 N overflowed work requests > 1 . . N backends > 672670 . . N expired objects > 3514467 . . 
N LRU moved objects > 49 0.00 0.00 HTTP header overflows > 32124238 206.98 141.34 Objects sent with write > 36494396 224.98 160.57 Total Sessions > 36494484 224.98 160.57 Total Requests > 783 0.00 0.00 Total pipe > 518770 4.00 2.28 Total pass > 1464570 10.00 6.44 Total fetch > 14559014884 93563.69 64058.18 Total header bytes > 168823109304 489874.04 742804.45 Total body bytes > 36494387 224.98 160.57 Session Closed > 203 0.00 0.00 Session herd > 1736767745 10880.80 7641.60 SHM records > 148079555 908.90 651.53 SHM writes > 15088 0.00 0.07 SHM flushes due to overflow > 10494 0.00 0.05 SHM MTX contention > 687 0.00 0.00 SHM cycles through buffer > 2988576 21.00 13.15 allocator requests > 580296 . . outstanding allocations > 8916353024 . . bytes allocated > 44770738176 . . bytes free > 656 0.00 0.00 SMS allocator requests > 303864 . . SMS bytes allocated > 303864 . . SMS bytes freed > 1465172 10.00 6.45 Backend requests made > > > > On Thu, Apr 9, 2009 at 12:18 PM, Artur Bergman > wrote: > For the file storage or for the shmlog? > > When do you start nuking/expiring from disk? I suspect the load goes > up when you run out of storage space? > > Cheers > Artur > > > On Apr 9, 2009, at 12:02 PM, Cloude Porteus wrote: > >> Has anyone experienced very high server load averages? We're >> running varnish on a dual core with 8gb of ram. It runs okay for a >> day or two and then I start seeing load averages in 6-10 range for >> an hour or so, drops down to 2-3, then goes back up. >> >> This starts to happen once we have more items in the cache than our >> physical memory. Maybe increasing our lru_interval will help? It's >> currently set to 3600. >> >> Right now we're running with a 50gb file storage option. There are >> 270k objects in the cache, 70gb virtual memory, 6.2gb of res memory >> used, 11gb of data on disk in the file storage. We have a 98% hit >> ratio. >> >> We followed Artur's advice about setting a tmpfs and creating an >> ext2 partition for our file storage. 
>> >> I also tried running with malloc as our storage type, but I had to >> set it at a little less than half of our physical ram in order for >> it to work well after the cache got full. I don't understand why >> the virtual memory is double when I am running in malloc mode. I >> was running it with 5gb and the virtual memory was about 10-12gb >> and once it got full it started using the swap memory. >> >> Thanks for any help/insight. >> >> best, >> cloude >> -- >> VP of Product Development >> Instructables.com >> >> http://www.instructables.com/member/lebowski >> _______________________________________________ >> varnish-dev mailing list >> varnish-dev at projects.linpro.no >> http://projects.linpro.no/mailman/listinfo/varnish-dev > > > > > -- > VP of Product Development > Instructables.com > > http://www.instructables.com/member/lebowski -------------- next part -------------- An HTML attachment was scrubbed... URL: From cloude at instructables.com Thu Apr 9 21:46:08 2009 From: cloude at instructables.com (Cloude Porteus) Date: Thu, 9 Apr 2009 14:46:08 -0700 Subject: High Server Load Averages? In-Reply-To: <477CEE15-7F81-4E12-8110-55D1534D3C4B@crucially.net> References: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> <871677B9-7E39-4246-BE5E-422E4E593151@crucially.net> <4a05e1020904091227x519affe6r1a9540b5a6dde1b9@mail.gmail.com> <477CEE15-7F81-4E12-8110-55D1534D3C4B@crucially.net> Message-ID: <4a05e1020904091446h7f8be7e5l1b33652e1c323767@mail.gmail.com> The current load is just above 2. I'll check this again when I see a load spike. 
[cloude at squid03 ~]$ iostat -k -x 5 Linux 2.6.18-53.1.19.el5.centos.plus (squid03.instructables.com) 04/09/2009 avg-cpu: %user %nice %system %iowait %steal %idle 1.19 0.00 0.95 2.14 0.00 95.73 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.07 9.64 0.15 1.65 10.08 45.68 61.80 0.13 70.32 3.96 0.72 sdb 0.07 9.63 0.15 1.66 10.14 45.68 61.75 0.02 10.03 3.76 0.68 sdc 0.03 16.47 1.21 14.69 13.99 128.81 17.96 0.08 4.81 4.31 6.85 sdd 0.03 16.45 1.17 13.24 13.29 119.96 18.49 0.24 16.52 4.06 5.86 md1 0.00 0.00 0.43 11.13 20.19 44.52 11.19 0.00 0.00 0.00 0.00 md2 0.00 0.00 2.41 29.40 26.58 117.61 9.07 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 3.15 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.90 0.00 2.40 46.70 0.00 50.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 0.40 6.00 238.40 74.40 974.40 8.58 132.88 515.03 4.09 100.02 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.90 0.00 1.80 67.67 0.00 29.63 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 1.60 13.40 141.80 188.80 1053.60 16.01 138.62 934.04 6.44 100.02 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.50 0.00 1.80 61.40 0.00 36.30 Device: rrqm/s wrqm/s r/s w/s 
rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.40 0.00 2.40 12.00 0.00 9.00 9.00 0.36 sdb 0.00 0.00 0.00 0.40 0.00 2.40 12.00 0.00 9.50 9.50 0.38 sdc 0.00 1.60 6.40 257.00 132.00 2195.20 17.67 107.40 450.21 3.68 96.82 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.20 0.00 0.80 8.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.60 0.00 1.60 47.80 0.00 50.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.20 0.00 1.60 16.00 0.00 11.00 11.00 0.22 sdb 0.00 0.00 0.00 0.20 0.00 1.60 16.00 0.00 13.00 13.00 0.26 sdc 0.00 0.80 0.20 301.80 8.80 1270.40 8.47 119.40 373.98 3.31 100.04 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.60 0.00 1.70 47.80 0.00 49.90 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 1.20 2.40 245.31 43.11 1538.52 12.77 101.41 419.12 4.03 99.80 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.60 0.00 1.50 3.20 0.00 94.69 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.20 0.00 0.40 0.00 2.40 0.00 12.00 0.01 14.00 7.00 0.28 sdc 0.00 0.00 6.60 11.00 174.40 192.80 41.73 1.26 421.34 3.73 6.56 
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.70 0.00 1.60 29.50 0.00 68.20 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 0.00 5.60 208.60 110.40 857.60 9.04 70.18 301.18 2.90 62.06 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 0.50 0.00 1.50 48.05 0.00 49.95 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.20 0.00 0.80 0.00 5.60 14.00 0.01 8.75 8.75 0.70 sdb 0.00 0.20 0.00 0.80 0.00 5.60 14.00 0.01 9.50 9.50 0.76 sdc 0.00 1.00 6.80 232.40 91.20 1180.80 10.64 110.32 475.49 4.18 100.02 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md1 0.00 0.00 0.00 0.60 0.00 2.40 8.00 0.00 0.00 0.00 0.00 md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 On Thu, Apr 9, 2009 at 1:43 PM, Artur Bergman wrote: > What is your iopressure? 
> iostat -k -x 5 > > or something like that > > artur > > On Apr 9, 2009, at 12:27 PM, Cloude Porteus wrote: > > Varnishstat doesn't list any nuked objects and file storage and shmlog look > like they have plenty of space: > > df -h > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Filesystem Size Used Avail Use% Mounted on > tmpfs 150M 81M 70M 54% /usr/local/var/varnish > /dev/sdc1 74G 11G 61G 16% /var/lib/varnish > > top > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > top - 12:26:33 up 164 days, 22:21, 1 user, load average: 2.60, 3.26, 3.75 > Tasks: 67 total, 1 running, 66 sleeping, 0 stopped, 0 zombie > Cpu(s): 0.7%us, 0.3%sy, 0.0%ni, 97.0%id, 0.7%wa, 0.3%hi, 1.0%si, > 0.0%st > Mem: 8183492k total, 7763100k used, 420392k free, 13424k buffers > Swap: 3148720k total, 56636k used, 3092084k free, 7317692k cached > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 7441 varnish 15 0 70.0g 6.4g 6.1g S 2 82.5 56:33.31 varnishd > > > Varnishstat: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Hitrate ratio: 8 8 8 > Hitrate avg: 0.9782 0.9782 0.9782 > > 36494404 219.98 160.57 Client connections accepted > 36494486 220.98 160.57 Client requests received > 35028477 212.98 154.12 Cache hits > 474091 4.00 2.09 Cache hits for pass > 988013 6.00 4.35 Cache misses > 1465955 10.00 6.45 Backend connections success > 9 0.00 0.00 Backend connections failures > 994 . . N struct sess_mem > 11 . . N struct sess > 274047 . . N struct object > 252063 . . N struct objecthead > 609018 . . N struct smf > 28720 . . N small free smf > 2 . . N large free smf > 2 . . N struct vbe_conn > 901 . . N struct bereq > 2000 . . N worker threads > 2000 0.00 0.01 N worker threads created > 143 0.00 0.00 N overflowed work requests > 1 . . N backends > 672670 . . N expired objects > 3514467 . . 
N LRU moved objects > 49 0.00 0.00 HTTP header overflows > 32124238 206.98 141.34 Objects sent with write > 36494396 224.98 160.57 Total Sessions > 36494484 224.98 160.57 Total Requests > 783 0.00 0.00 Total pipe > 518770 4.00 2.28 Total pass > 1464570 10.00 6.44 Total fetch > 14559014884 93563.69 64058.18 Total header bytes > 168823109304 489874.04 742804.45 Total body bytes > 36494387 224.98 160.57 Session Closed > 203 0.00 0.00 Session herd > 1736767745 10880.80 7641.60 SHM records > 148079555 908.90 651.53 SHM writes > 15088 0.00 0.07 SHM flushes due to overflow > 10494 0.00 0.05 SHM MTX contention > 687 0.00 0.00 SHM cycles through buffer > 2988576 21.00 13.15 allocator requests > 580296 . . outstanding allocations > 8916353024 . . bytes allocated > 44770738176 . . bytes free > 656 0.00 0.00 SMS allocator requests > 303864 . . SMS bytes allocated > 303864 . . SMS bytes freed > 1465172 10.00 6.45 Backend requests made > > > > On Thu, Apr 9, 2009 at 12:18 PM, Artur Bergman wrote: > >> For the file storage or for the shmlog? >> >> When do you start nuking/expiring from disk? I suspect the load goes up >> when you run out of storage space? >> >> Cheers >> Artur >> >> >> On Apr 9, 2009, at 12:02 PM, Cloude Porteus wrote: >> >> Has anyone experienced very high server load averages? We're running >> varnish on a dual core with 8gb of ram. It runs okay for a day or two and >> then I start seeing load averages in 6-10 range for an hour or so, drops >> down to 2-3, then goes back up. >> >> This starts to happen once we have more items in the cache than our >> physical memory. Maybe increasing our lru_interval will help? It's currently >> set to 3600. >> >> Right now we're running with a 50gb file storage option. There are 270k >> objects in the cache, 70gb virtual memory, 6.2gb of res memory used, 11gb of >> data on disk in the file storage. We have a 98% hit ratio. 
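As a cross-check, the quoted hit ratio can be recomputed from the varnishstat counters above (Cache hits and Cache misses). A sketch using hits/(hits+misses) — one common definition; it will differ slightly from varnishstat's rolling "Hitrate avg", which only covers a recent window:

```shell
# Cumulative hit ratio = hits / (hits + misses), using the counter
# values from the varnishstat dump quoted above.
hits=35028477
misses=988013
awk -v h="$hits" -v m="$misses" 'BEGIN { printf "hit ratio: %.4f\n", h / (h + m) }'
# -> hit ratio: 0.9726
```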
>> [...]
>>
>> best,
>> cloude

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sky at crucially.net Thu Apr 9 21:49:14 2009
From: sky at crucially.net (Artur Bergman)
Date: Thu, 9 Apr 2009 14:49:14 -0700
Subject: High Server Load Averages?
In-Reply-To: <4a05e1020904091446h7f8be7e5l1b33652e1c323767@mail.gmail.com>
References: <4a05e1020904091202u1ad23e4fx1abaa4b5b59317e@mail.gmail.com> <871677B9-7E39-4246-BE5E-422E4E593151@crucially.net> <4a05e1020904091227x519affe6r1a9540b5a6dde1b9@mail.gmail.com> <477CEE15-7F81-4E12-8110-55D1534D3C4B@crucially.net> <4a05e1020904091446h7f8be7e5l1b33652e1c323767@mail.gmail.com>
Message-ID: <9F79E807-0D08-4489-A66F-7EC834565C75@crucially.net>

Your SDC is overloaded. Is your filesystem mounted noatime?

Artur

On Apr 9, 2009, at 2:46 PM, Cloude Porteus wrote:
> The current load is just above 2. I'll check this again when I see a load spike.
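The noatime question can be answered mechanically. A sketch — the mount point comes from the `df` output quoted earlier in the thread, but the fstab line here is invented for illustration, not taken from the poster's system:

```shell
# Does the mount entry for the cache partition carry "noatime"?
# (Without it, every read of a cache file also writes an access
# timestamp, adding write load to an already busy device.)
fstab_line='/dev/sdc1  /var/lib/varnish  ext2  defaults,noatime  1 2'
case "$fstab_line" in
    *noatime*) echo "noatime set" ;;
    *)         echo "noatime missing" ;;
esac
# -> noatime set

# On the live box, check the options actually in effect with:
#   grep /var/lib/varnish /proc/mounts
# and apply without downtime via:
#   mount -o remount,noatime /var/lib/varnish
```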
> [...]
>
> On Thu, Apr 9, 2009 at 1:43 PM, Artur Bergman wrote:
> What is your iopressure?
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From p.millar at physics.gla.ac.uk Fri Apr 10 21:47:18 2009
From: p.millar at physics.gla.ac.uk (Paul Millar)
Date: Fri, 10 Apr 2009 23:47:18 +0200
Subject: Logging and Statistics
In-Reply-To: 
References: 
Message-ID: <200904102347.19049.p.millar@physics.gla.ac.uk>

Hi Raghavan,

On Thursday 09 April 2009 17:33:15 vijayaraghavan.subramaniam at wipro.com wrote:
> I'm using varnish server. I want to write varnishncsa log information into
> database, what is best practices & Is there any varnish API available to
> write into database?

At the risk of promoting my own project, have you looked at MonAMI?

http://monami.sourceforge.net

There's a monitoring plugin for taking log information from varnish and a reporting plugin for logging data into MySQL.
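On the original question, one API-free approach is to transform each access-log line into a SQL statement and pipe it to whatever database client you already use. A sketch — the table name, column, and sample log line are invented for illustration:

```shell
# Turn log lines into SQL INSERT statements that any client can
# consume (mysql, psql, sqlite3, ...). Single quotes are doubled
# for SQL escaping before the line is wrapped in the statement.
to_sql() {
    sed "s/'/''/g" |
    awk '{ print "INSERT INTO access_log (line) VALUES ('\''" $0 "'\'');" }'
}

# Demo on a made-up NCSA-style line:
echo '10.0.0.1 - - [10/Apr/2009:12:00:00] "GET / HTTP/1.1" 200 512' | to_sql

# Live use (hypothetical database name):
#   varnishncsa | to_sql | mysql varnish_logs
```

Batching inserts (or using the client's bulk-load mode) would be kinder to the database than one statement per request, but the pipeline shape is the same.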
The tutorial culminates with logging monitoring data into a database (albeit from a different monitoring plugin) and the userguide gives a detailed information on how to configure MonAMI. HTH, Paul. From tical.net at gmail.com Fri Apr 10 22:12:48 2009 From: tical.net at gmail.com (Ray Barnes) Date: Fri, 10 Apr 2009 18:12:48 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking Message-ID: Hi all. Note that everything herein is based only on a very lay knowledge of varnish, without being familiar with the internals of the code. In my quest to eek more performance out of Varnish, I've been testing under 2.0.4. I have not seen much improvement over 2.0.3 in the way it acts after receiving a bunch of hits all at one time. I am invoking varnish like this: ulimit -n 131072 ulimit -l 82000 /usr/local/sbin/varnishd -a 98.124.141.3:80 -b 67.212.179.98:80 -T 98.124.141.3:6083 \ -t 60 -w1440,3000,60 -u apache -g apache -p obj_workspace=16000 -p sess_workspace=262144 -p listen_depth=4096 \ -p shm_workspace=64000 -p thread_pools=8 -p thread_pool_min=180 -p ping_interval=1 -p srcaddr_ttl=0 -s malloc,80M As best I can tell, the problem I'm seeing is that it will not create the number of worker threads that I'm telling it to, as evidenced by the 'status' output within the CLI immediately after launch: 270 N worker threads 285 N worker threads created So if I launch 'ab' with 700 connections against varnish, it will not work right from the beginning, like so: [root at mia ~]# ab -n 20000 -c 700 http://98.124.141.3/ This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0 Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Copyright 2006 The Apache Software Foundation, http://www.apache.org/ Benchmarking 98.124.141.3 (be patient) apr_socket_recv: Connection refused (111) [root at mia ~]# ab -n 20000 -c 700 http://98.124.141.3/ This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0 Copyright 
1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Copyright 2006 The Apache Software Foundation, http://www.apache.org/ Benchmarking 98.124.141.3 (be patient) apr_poll: The timeout specified has expired (70007) Total of 147 requests completed [root at mia ~]# telnet 98.124.141.3 80 Trying 98.124.141.3... Connected to 98.124.141.3 (98.124.141.3). Escape character is '^]'. GET / HTTP/1.0 ^] telnet> quit Connection closed. The above telnet command simply hung, presumably because there are still 700 sessions in CLOSE_WAIT state within the kernel, although that should not matter if varnish opened the number of worker threads it was supposed to. Based on what I've seen, it would seem that varnish has some problem when you launch it with "too many" initial worker threads (although I'm having a hard time understanding why 1400ish is too many). It seems to go crazy if you specify too many threads initially. Again, that number should not be a problem for the machine in theory, as it's a multicore Xeon. Platform is Linux 2.6 RHEL. Any idea what's happening here? -Ray -------------- next part -------------- An HTML attachment was scrubbed... URL: From jna at twitter.com Fri Apr 10 22:35:14 2009 From: jna at twitter.com (John Adams) Date: Fri, 10 Apr 2009 15:35:14 -0700 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: Message-ID: It takes time to spawn threads. If you start the server with hundreds of threads, they won't be ready for ~30-90 seconds. Maybe that's causing this issue? -j On Apr 10, 2009, at 3:12 PM, Ray Barnes wrote: > Hi all. Note that everything herein is based only on a very lay > knowledge of varnish, without being familiar with the internals of > the code. > > In my quest to eek more performance out of Varnish, I've been > testing under 2.0.4. I have not seen much improvement over 2.0.3 in > the way it acts after receiving a bunch of hits all at one time. 
> [...]
>
> -Ray

---
John Adams
Twitter Operations
jna at twitter.com
http://twitter.com/netik

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tical.net at gmail.com Fri Apr 10 22:58:01 2009
From: tical.net at gmail.com (Ray Barnes)
Date: Fri, 10 Apr 2009 18:58:01 -0400
Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking
In-Reply-To: 
References: 
Message-ID: 

John,

Thanks for the reply; as you can see my config is largely based on the one you posted to this list in February (thanks!).

I went back as you suggested and waited 90 seconds, while starting it the same way. Before running any tests, I went into the CLI and viewed stats on the threads:

364 N worker threads
364 N worker threads created
782 N worker threads not created

When this happens (started threads do not match the number specified), varnish does really unpredictable things, i.e. it won't take 300 connections from 'ab' and times out with the following message:

Benchmarking 98.124.141.3 (be patient)
apr_poll: The timeout specified has expired (70007)
Total of 52 requests completed

I think the crux of my problem is figuring out why it won't start more threads. Being not-so-familiar with the internals of varnish, I can't tell whether that's an OS problem or a varnish problem. Hope that helps.
-Ray

On Fri, Apr 10, 2009 at 6:35 PM, John Adams wrote:
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jna at twitter.com Fri Apr 10 23:30:27 2009
From: jna at twitter.com (John Adams)
Date: Fri, 10 Apr 2009 16:30:27 -0700
Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking
In-Reply-To: 
References: 
Message-ID: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com>

Something's very wrong here - we've never experienced this before. Are you starting the server as root or as another user? Any ulimit or restrictions on # of file descriptors?
-j

--- John Adams Twitter Operations jna at twitter.com http://twitter.com/netik

From tical.net at gmail.com Sat Apr 11 19:46:21 2009 From: tical.net at gmail.com (Ray Barnes) Date: Sat, 11 Apr 2009 15:46:21 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: Thanks for the reply, answers inline: On Fri, Apr 10, 2009 at 7:30 PM, John Adams wrote: > Something's very wrong here - we've never experienced this before. > Are you stating the server as root or as another user? > As root. > Any ulimit or restrictions on # of file descriptors?
> I'm manually setting ulimit in the script that starts varnishd, like so: ulimit -n 131072 ulimit -l 82000 Other than that, and having manually set /proc/sys/fs/file-max to 65535, all other settings are default according to RHEL 5 and Linux 2.6.18 with Xen patches and backports maintained by xen.org (the aforementioned results were all obtained by running varnish under domain 0). I tried this on another box which is RHEL 4.6 with 2.6.9-55.0.2.ELsmp (so no Xen in this case) and /proc/sys/fs/file-max set to 49984 (using the same script to launch varnish as aforementioned); the result was relatively the same:

290 N worker threads
290 N worker threads created
8705 N worker threads not created
409188 N worker threads limited

HTH.

-Ray

From sky at crucially.net Sat Apr 11 21:48:04 2009 From: sky at crucially.net (Artur Bergman) Date: Sat, 11 Apr 2009 14:48:04 -0700 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: I've never seen it do worker threads not created. Are there any limits on number of threads? Can you get rid of -w1440,3000,60 and rely on the -p settings instead? Artur On Apr 11, 2009, at 12:46 PM, Ray Barnes wrote: > Thanks for the reply, answers inline: > > On Fri, Apr 10, 2009 at 7:30 PM, John Adams wrote: > Something's very wrong here - we've never experienced this before. > > Are you stating the server as root or as another user? > > As root. > > Any ulimit or restrictions on # of file descriptors?
> > I'm manually setting ulimit in the script that starts varnishd, like > so: > > ulimit -n 131072 > ulimit -l 82000 > Other than that, and having manually set /proc/sys/fs/file-max to > 65535, all other settings are default according to RHEL 5 and Linux > 2.6.18 with Xen patches and backports maintained by xen.org (the > aforementioned results were all obtained by running varnish under > domain 0). I tried this on another box which is RHEL 4.6 with > 2.6.9-55.0.2.ELsmp (so no Xen in this case), and /proc/sys/fs/file- > max set to 49984 (using the same script to launch varnish as > aforementioned), the result was relatively the same: > > 290 N worker threads > 290 N worker threads created > 8705 N worker threads not created > 409188 N worker threads limited > HTH. > > -Ray > > _______________________________________________ > varnish-dev mailing list > varnish-dev at projects.linpro.no > http://projects.linpro.no/mailman/listinfo/varnish-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From tical.net at gmail.com Mon Apr 13 05:57:59 2009 From: tical.net at gmail.com (Ray Barnes) Date: Mon, 13 Apr 2009 01:57:59 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: On Sat, Apr 11, 2009 at 5:48 PM, Artur Bergman wrote: > I've never seen it do worker threads not created. > Are there any limits on number of threads? > Apparently there are; thanks for pointing me in the right direction. I found a C program that attempts to spawn threads and lets you know at what point it hits an error - http://people.redhat.com/alikins/tuning_utils/thread-limit.c - it reports that I can't open more than 383 threads. The question is why. 
Here's what I've done thus far:

1) Recompiled glibc per http://people.redhat.com/alikins/system_tuning.html#threads - the definition of PTHREAD_THREADS_MAX is tied to the value in /usr/include/linux/limits.h, so I adjusted that value, installed the source RPM, rebuilt all glibc RPMs, and installed using 'rpm -Uvh --force' to overcome pre/post installation errors within the RPM (hopefully that did what it was supposed to).

2) Set /proc/sys/kernel/threads-max to 65535 (was 3000ish before); no change.

3) Set /etc/security/limits.conf to "* soft nofile 1024" and "* hard nofile 10240" and added "session required /lib/security/pam_limits.so" to /etc/pam.d/login, per the advice at http://www.mail-archive.com/java-linux at java.blackdown.org/msg15247.html where the poster indicates he did not have to recompile glibc to do this; no change.

I've tried the same C program above on a few other Linux boxes and they all seem to allow somewhere between 200 and 383 threads. The first obvious solution would be to dump Linux and use FBSD - a direction I'll look into in the future. But for now we're stuck on Linux. Any ideas?

-Ray

From rafael.umann at terra.com.br Mon Apr 13 17:53:11 2009 From: rafael.umann at terra.com.br (Rafael Umann) Date: Mon, 13 Apr 2009 14:53:11 -0300 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: Take a look at your FDs: (linux)

# cat /proc/sys/fs/file-nr
11730 0 5049800

Varnish works with a limit of 65k file descriptors. Anything above that will be a problem.

http://varnish.projects.linpro.no/changeset/3631

If you are hitting 65k FDs, that's the same problem we hit.

Another tip: if you are running on a 32-bit system, that's your problem!
[]s,

From tical.net at gmail.com Mon Apr 13 18:52:45 2009 From: tical.net at gmail.com (Ray Barnes) Date: Mon, 13 Apr 2009 14:52:45 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: On Mon, Apr 13, 2009 at 1:53 PM, Rafael Umann wrote: > Take a look at you FDs: (linux) > # cat /proc/sys/fs/file-nr > 11730 0 5049800 > > Varnish works with a limit of 65k file descriptors. Anything above that > will be a problem. > > http://varnish.projects.linpro.no/changeset/3631 > > If you are getting 65k FD`S we hit the same problem. > Thanks for the reply. I'm barely reaching 1300:

[root at vpsbox-mia ~]# cat /proc/sys/fs/file-nr
1344 0 106235
[root at vpsbox-mia ~]#

> Another tip: if you are running on a 32bits system, thats your problem! How does a 32-bit architecture restrict me from creating more than 380 threads per process?

-Ray

From rafael.umann at terra.com.br Mon Apr 13 19:16:15 2009 From: rafael.umann at terra.com.br (Rafael Umann) Date: Mon, 13 Apr 2009 16:16:15 -0300 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: 32 bits restricts you from using more than ~2.5gb of ram.
Try decreasing your cache size to see if you can open more threads (allocate memory for threads instead of using it all for cache) and also set the stack size smaller:

# vi /etc/security/limits.conf

* soft stack 512
* hard stack 512

(use 512kb of mem per thread) or

* soft stack 1024
* hard stack 1024

(use 1mb of mem per thread)

[]s, Rafael Umann

From tical.net at gmail.com Mon Apr 13 19:32:41 2009 From: tical.net at gmail.com (Ray Barnes) Date: Mon, 13 Apr 2009 15:32:41 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: On Mon, Apr 13, 2009 at 3:16 PM, Rafael Umann wrote: > 32 bits restrict you to use more than ~2.5gb of ram.
> I'm sure you know about PAE kernels, so I'm assuming there is some other artificial limit at 2.5GB, like SHM space maybe?

> > Try decreasing your cache size to see if you can open more threads > (allocate memory for threads instead of using it all for cache) and also set > the stack size smaller: > > # vi /etc/security/limits.conf > > * soft stack 512 > * hard stack 512 > > (use 512kb of mem per thread) > or > > * soft stack 1024 > * hard stack 1024 > > (use 1mb of mem per thread)

> I tried 512kb, then logging off and back on, then starting varnish with 800 threads requested. Same result:

356 N worker threads
356 N worker threads created
181 N worker threads not created

Note that 356 + 181 is not 800. It actually did not do this initially; it said 201 worker threads and 840 created (it always does strange things like this when I try creating more threads than the box can handle). And the program that spawns threads still tells me 383 is the max it can make.

-Ray

From rafael.umann at terra.com.br Tue Apr 14 18:19:29 2009 From: rafael.umann at terra.com.br (Rafael Umann) Date: Tue, 14 Apr 2009 15:19:29 -0300 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: <8DF33129-E6ED-4C0F-8CBB-13809421F811@terra.com.br> What about the cache size? Have you decreased it?

Try running varnish with:

# varnishd -f /etc/varnish/default.vcl \
-a 0.0.0.0:80 \
-s file,/var/lib/varnish/varnish_storage.bin,50M \
-T 0.0.0.0:6082 \
-u varnish \
-g varnish \
-w 500,500,120 \
-p lru_interval=900 \
-p thread_pools=1 \
-P /var/run/varnish/varnish.pid \
-F

[]s,

On Apr 13, 2009, at 4:32 PM, Ray Barnes wrote: > On Mon, Apr 13, 2009 at 3:16 PM, Rafael Umann > wrote: > 32 bits restrict you to use more than ~2.5gb of ram.
From tical.net at gmail.com Tue Apr 14 18:31:31 2009 From: tical.net at gmail.com (Ray Barnes) Date: Tue, 14 Apr 2009 14:31:31 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: <8DF33129-E6ED-4C0F-8CBB-13809421F811@terra.com.br> References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> <8DF33129-E6ED-4C0F-8CBB-13809421F811@terra.com.br> Message-ID: Thanks for the reply.
With those settings, same result:

192 N worker threads
284 N worker threads created
21 N worker threads not created

Again, the issue is apparently that the _operating system_ does not let me create more than 300ish threads.

-Ray

On Tue, Apr 14, 2009 at 2:19 PM, Rafael Umann wrote: > > What about the cache size? have you decresead it? > > Try running varnish with: > > # varnishd -f /etc/varnish/default.vcl \ > -a 0.0.0.0:80 \ > -s file,/var/lib/varnish/varnish_storage.bin,50M \ > -T 0.0.0.0:6082 \ > -u varnish \ > -g varnish \ > -w 500,500,120 \ > -p lru_interval=900 \ > -p thread_pools=1 \ > -P /var/run/varnish/varnish.pid \ > -F" > > []s,

From tical.net at gmail.com Tue Apr 14 19:10:48 2009 From: tical.net at gmail.com (Ray Barnes) Date: Tue, 14 Apr 2009 15:10:48 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: <20090414185748.GQ87733@iwin.com> References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> <8DF33129-E6ED-4C0F-8CBB-13809421F811@terra.com.br> <20090414185748.GQ87733@iwin.com> Message-ID: Bret, Thanks for the reply - that appears to put us (perhaps) a little closer. I'm assuming for the moment that when you said "init script" you meant the script I use to bring up varnish, not the script that boots the box per /etc/inittab. When I specify "ulimit -s 1024" it did not change the net result of varnish's inability to create threads:

192 N worker threads
424 N worker threads created
62 N worker threads not created

However, the C program I posted previously in this discussion was able to create 3055 threads. Hope that helps.

-Ray

On Tue, Apr 14, 2009 at 2:57 PM, Bret A. Barker wrote: > Try a "ulimit -s 1024" in your init script. Definitely sounds like a > problem with thread stack size defaulting to 8192. > > -bret > > On Tue, Apr 14, 2009 at 02:31:31PM -0400, Ray Barnes wrote: > > Thanks for the reply. With those settings, same result: > > > > 192 N worker threads > > 284 N worker threads created > > 21 N worker threads not created > > Again, the issue is apparently that the _operating system_ does not let me > > create more than 300ish threads.
> > -Ray

From bret at iwin.com Tue Apr 14 18:57:49 2009 From: bret at iwin.com (Bret A. Barker) Date: Tue, 14 Apr 2009 14:57:49 -0400 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> <8DF33129-E6ED-4C0F-8CBB-13809421F811@terra.com.br> Message-ID: <20090414185748.GQ87733@iwin.com> Try a "ulimit -s 1024" in your init script. Definitely sounds like a problem with thread stack size defaulting to 8192. -bret

From des at des.no Wed Apr 15 01:28:19 2009 From: des at des.no (=?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?=) Date: Wed, 15 Apr 2009 03:28:19 +0200 Subject: Bug? Barage of hits leads to failure creating worker threads / stats tracking In-Reply-To: (Ray Barnes's message of "Mon, 13 Apr 2009 15:32:41 -0400") References: <26961DB5-45E9-4D34-B958-D08A5B070E9B@twitter.com> Message-ID: <86fxgad7t8.fsf@ds4.des.no> Ray Barnes writes: > Rafael Umann writes: > > 32 bits restrict you to use more than ~2.5gb of ram. > I'm sure you know about PAE kernels, so I'm assuming there is some other artificial limit at 2.5GB, like SHM space maybe? PAE is irrelevant. The address space of each process is still limited to somewhere between 2 and 3 GB, depending on the OS. DES -- Dag-Erling Smørgrav - des at des.no

From emil.isberg at gmail.com Mon Apr 20 12:04:21 2009 From: emil.isberg at gmail.com (Emil Isberg) Date: Mon, 20 Apr 2009 14:04:21 +0200 Subject: Patch adding limited Apache LogFormat support to varnishncsa Message-ID: Hi, The future planning notes "Specification of custom formats (like apache's % notation)?", and I needed something along those lines: a format adaptable to my specific Apache configuration. So I added limited support for the parts I needed most. Basically I have just added a string that is parsed for each logged row. I didn't find a way to submit it to trac. Please check the attached diff. Best regards Emil -------------- next part -------------- A non-text attachment was scrubbed...
Name: varnishncsa-logformat.diff Type: application/octet-stream Size: 7385 bytes Desc: not available URL: From yang at knownsec.com Sat Apr 25 05:00:41 2009 From: yang at knownsec.com (jilong yang) Date: Sat, 25 Apr 2009 13:00:41 +0800 Subject: how can I debug varnishd ? In-Reply-To: References: Message-ID: When I use gdb to debug varnishd, after I set a break or hbreak, varnishd receives SIGQUIT and then exits. That means I can't debug it. Why? ubuntu at ubuntu:~$ gdb /opt/varnish/sbin/varnishd 7469 GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-linux-gnu"... Attaching to program: /opt/varnish/sbin/varnishd, process 7469 Error while mapping shared library sections: ./vcl.1P9zoqAU.so: No such file or directory. Reading symbols from /opt/varnish/lib/libvarnish.so.1...done. Loaded symbols for /opt/varnish/lib/libvarnish.so.1 Reading symbols from /lib/tls/i686/cmov/librt.so.1...done. Loaded symbols for /lib/tls/i686/cmov/librt.so.1 Reading symbols from /opt/varnish/lib/libvarnishcompat.so.1...done. Loaded symbols for /opt/varnish/lib/libvarnishcompat.so.1 Reading symbols from /opt/varnish/lib/libvcl.so.1...done. Loaded symbols for /opt/varnish/lib/libvcl.so.1 Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libdl.so.2 Reading symbols from /lib/tls/i686/cmov/libpthread.so.0...done.
[Thread debugging using libthread_db enabled] [New Thread 0xb7d3c6b0 (LWP 7469)] [New Thread 0x2f8fab90 (LWP 7476)] [New Thread 0x300fbb90 (LWP 7475)] [New Thread 0x308fcb90 (LWP 7474)] [New Thread 0xb13ebb90 (LWP 7473)] [New Thread 0xb1becb90 (LWP 7472)] [New Thread 0xb23edb90 (LWP 7471)] [New Thread 0xb2beeb90 (LWP 7470)] Loaded symbols for /lib/tls/i686/cmov/libpthread.so.0 Reading symbols from /lib/tls/i686/cmov/libnsl.so.1...done. Loaded symbols for /lib/tls/i686/cmov/libnsl.so.1 Reading symbols from /lib/tls/i686/cmov/libm.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libm.so.6 Reading symbols from /lib/tls/i686/cmov/libc.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libc.so.6 Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /lib/tls/i686/cmov/libnss_compat.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_compat.so.2 Reading symbols from /lib/tls/i686/cmov/libnss_nis.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_nis.so.2 Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2 Symbol file not found for ./vcl.1P9zoqAU.so 0xb7f24410 in __kernel_vsyscall () (gdb) break printf <----------here I set break Breakpoint 1 at 0xb7d84334 (gdb) c <--------- I continue Continuing. Program received signal SIGQUIT, Quit. <-----------then the sig quit [Switching to Thread 0xb2beeb90 (LWP 7470)] 0xb7f24410 in __kernel_vsyscall () (gdb) c Continuing. Program terminated with signal SIGQUIT, Quit. The program no longer exists. (gdb) c The program is not being run. (gdb) -------------- next part -------------- An HTML attachment was scrubbed... URL: From yang at knownsec.com Sat Apr 25 07:38:36 2009 From: yang at knownsec.com (jilong yang) Date: Sat, 25 Apr 2009 15:38:36 +0800 Subject: how can I debug varnishd ? 
In-Reply-To: <43685D7B-2493-4501-A974-12E77038AB91@mosso.com>
References: <43685D7B-2493-4501-A974-12E77038AB91@mosso.com>
Message-ID: 

Thanks very much!

I want to check the query string, and for GET requests I can do that in
VCL. I would like to check POST data as well. How can I do that? I see
that the POST data follows the HTTP headers in sp->http0->ws, and that
it is also in htc->rxbuf->b. Can I do the POST data check in HTC_Rx?

2009/4/25 Adrian Otto 

> Jilong Yang,
>
> Set ping_interval to 0 to disable it, or set it to a really high number.
> When you stop the child process, the parent process no longer gets ping
> responses from it (it checks by default every 3 seconds), so it tries to
> help by killing off the child and making a new one. It does not know you
> have it in a debugger.
>
> Regards,
>
> Adrian
>
> On Apr 24, 2009, at 10:00 PM, jilong yang wrote:
>
> > When I use gdb to debug varnishd, after I set a break or hbreak,
> > varnishd gets SIGQUIT and then exits, so I can't debug it. Why?
> >
> > [...]
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at projects.linpro.no
> http://projects.linpro.no/mailman/listinfo/varnish-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
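
[Editor's note: the workaround Adrian describes can be sketched as a short
shell session. This is an illustrative sketch only: the install prefix,
VCL path, listen and admin addresses, and storage settings are
placeholders, not values taken from this thread; ping_interval is the
parameter Adrian names, and raising it to a large value works as an
alternative to disabling it.]

```
# Start varnishd with the management process's CLI ping disabled, so it
# will not kill a child that is sitting at a breakpoint under gdb.
# (All paths, addresses, and sizes below are placeholders.)
/opt/varnish/sbin/varnishd \
    -f /etc/varnish/default.vcl \
    -a :8080 \
    -T 127.0.0.1:6082 \
    -s file,/var/lib/varnish/storage.bin,1G \
    -p ping_interval=0

# varnishd runs as a management (parent) and a worker (child) process.
# Request handling happens in the child, so attach gdb to the child's
# PID -- the varnishd whose parent PID is the other varnishd:
ps -o pid,ppid,cmd -C varnishd
gdb /opt/varnish/sbin/varnishd <child-pid>
```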