From phk at phk.freebsd.dk Thu Feb 9 08:08:35 2006
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Thu, 09 Feb 2006 08:08:35 +0000
Subject: My random thoughts
Message-ID: <3697.1139472515@critter.freebsd.dk>

Here are my random thoughts on Varnish until now. Some of it mirrors
what we talked about in the meeting, some of it is more detailed or
reaches further into speculation.

Poul-Henning

Notes on Varnish
----------------

Philosophy
----------

It is not enough to deliver a technically superior piece of software
if it is not possible for people to deploy it usefully, in a sensible
way and in a timely fashion.

Deployment scenarios
--------------------

There are two fundamental usage scenarios for Varnish: when the first
machine is brought up to offload a struggling backend, and when a
subsequent machine is brought online to help handle the load.

The first (layer of) Varnish
----------------------------

Somebody's webserver is struggling and they decide to try Varnish.
Often this will be a skunkworks operation with some random PC
purloined from wherever it wasn't being used and the Varnish "HOWTO"
in one hand.

If they do it in an orderly fashion before things reach panic
proportions, a sensible model is to set up the Varnish box, test it
out from your own browser and see that it answers correctly. Test it
some more, then add the IP# to the DNS records so that it takes 50%
of the load off the backend.

If it happens as firefighting at 3AM, the backend will be moved to
another IP, the Varnish box given the main IP, and things had better
work really well, really fast.

In both cases, it would be ideal if all that is necessary to tell
Varnish are two pieces of information:

    Storage location
	Alternatively we can offer an "auto" setting that makes
	Varnish discover what is available and use what it finds.

    DNS or IP# of backend.
	IP# is useful when the DNS settings are not quite certain
	or when split DNS horizon setups are used.
Ideally this can be done on the commandline so that there is no
configuration file to edit to get going, just

    varnish -d /home/varnish -s backend.example.dom

and you're off and running.

A text, curses or HTML based facility to give some instant feedback
and stats is necessary.

If circumstances are not conducive to a structured approach, it
should be possible to repeat this process and set up N independent
Varnish boxes and get some sort of relief without having to read any
further documentation.

The subsequent (layers of) Varnish
----------------------------------

This is what happens once everybody has caught their breath, and
where we start to talk about Varnish clusters. We can assume that at
this point the already installed Varnish machines have been
configured more precisely and that people have studied Varnish
configuration in some level of detail.

When Varnish machines are put in a cluster, the administrator should
be able to consider the cluster as a unit and not have to think about
and interact with the individual nodes.

Some sort of central management node or facility must exist, and it
would be preferable if this was not a physical but a logical entity
so that it can follow the admin to the beach. Ideally it would give
basic functionality in any browser, even mobile phones.

The focus here is scalability; we want to avoid per-machine
configuration if at all possible. Ideally, preconfigured hardware can
be plugged into power and net, find an address with DHCP, contact a
preconfigured management node, get a configuration and start working.

But we also need to think about how we prevent a site of Varnish
machines from acting like a stampeding horde when power or
connectivity is brought back after a disruption. Some sort of slow
start ("warm-up" ?) must be implemented to prevent them from hitting
the backend with full force.
An important aspect of cluster operations is giving a statistically
meaningful judgement of the cluster size, in particular answering the
question "would adding another machine help ?" precisely.

We should have a facility that allows the administrator to type in a
REGEXP/URL and have all the nodes answer with a checksum, age and
expiry timer for any documents they have which match. The results
should be grouped by URL and checksum.

Technical concepts
------------------

We want the central Varnish process to be just that, one process, and
we want to keep it small and efficient at all cost.

Code that will not be used for the central functionality should not
be part of the central process. For instance, code to parse, validate
and interpret the (possibly) complex configuration file should be a
separate program.

Depending on the situation, the Varnish process can either invoke
this program via a pipe or receive the ready-to-use data structures
via a network connection.

Exported data from the Varnish process should be made as cheap as
possible, likely shared memory. That will allow us to deploy separate
processes for log-grabbing, statistics monitoring and similar
"off-duty" tasks and let the central process get on with the
important job.

Backend interaction
-------------------

We need a way to tune the backend interaction further than what the
HTTP protocol offers out of the box.

We can assume that all documents we get from the backend have an
expiry timer; if not, we will set a default timer (configurable of
course).

But we need further policy than that. Amongst the questions we have
to ask are:

    For how long after expiry can we serve a cached copy of this
    document while we have reason to believe the backend can supply
    us with an update ?

    For how long after expiry can we serve a cached copy of this
    document if the backend does not reply or is unreachable ?
    If we cannot serve this document out of cache and the backend
    cannot inform us, what do we serve instead (404 ? A default
    document of some sort ?)

    Should we just not serve this page at all if we are in a
    bandwidth crush (DoS/stampede) situation ?

It may also make sense to have an "emergency detector" which triggers
when the backend is overloaded and offers a scaling factor for all
timeouts while in such an emergency state. Something like "If the
average response time of the backend rises above 10 seconds, multiply
all expiry timers by two".

It probably also makes sense to have a bandwidth/request traffic
shaper for backend traffic to prevent any one Varnish machine from
pummeling the backend in case of attacks or misconfigured expiry
headers.

Startup/consistency
-------------------

We need to decide what to do about the cache when the Varnish process
starts. There may be a difference between it starting for the first
time after the machine booted and it being subsequently (re)started.

By far the easiest thing to do is to disregard the cache; that saves
a lot of code for locating and validating the contents, but it
carries a penalty in backend or cluster fetches whenever a node comes
up. Let's call this the "transient cache model".

The alternative is to allow persistently cached contents to be used
according to configured criteria:

    Can expired contents be served if we can't contact the
    backend ? (dangerous...)

    Can unexpired contents be served if we can't contact the
    backend ? If so, how much past the expiry ?

It is a very good question how big a fraction of the persistent cache
would be usable after typical downtimes:

    After a Varnish process restart: nearly all.

    After a power failure ? Probably at least half, but probably not
    the half that contains the most busy pages.

And we need to take into consideration whether validating the format
and contents of the cache might take more resources and time than
getting the content from the backend.
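The "emergency detector" rule described above is simple enough to sketch in a few lines. This is only an illustration of the idea (the class name, the EWMA smoothing and the exact thresholds are invented for the example, not taken from any Varnish code):

```python
# Sketch of the "emergency detector" idea: if the average backend
# response time exceeds a threshold, all expiry timers are scaled up.
# Names, smoothing and thresholds are illustrative only.

class EmergencyDetector:
    def __init__(self, threshold=10.0, scale=2.0, alpha=0.1):
        self.threshold = threshold  # seconds of average backend response time
        self.scale = scale          # multiply expiry timers by this in an emergency
        self.alpha = alpha          # EWMA smoothing factor
        self.avg = 0.0

    def record(self, response_time):
        # Exponentially weighted moving average of backend response times.
        self.avg = self.alpha * response_time + (1 - self.alpha) * self.avg

    def effective_ttl(self, ttl):
        # Stretch expiry timers while the backend is overloaded.
        return ttl * self.scale if self.avg > self.threshold else ttl

det = EmergencyDetector()
for t in [0.2, 0.3, 0.25]:          # backend healthy
    det.record(t)
assert det.effective_ttl(60) == 60  # TTL unchanged
for t in [30.0] * 50:               # backend slows down badly
    det.record(t)
assert det.effective_ttl(60) == 120 # emergency: timers doubled
```

The EWMA keeps the detector from flapping on a single slow request, which matters because flipping the scale factor on and off would itself cause load swings on the backend.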
Off the top of my head, I would prefer the transient model any day
because of the simplicity and lack of potential consistency problems,
but if the load on the backend is intolerable this may not be
practically feasible. The best way to decide is to carefully analyze
a number of cold starts and cache content replacement traces.

The choice we make does affect the storage management part of
Varnish, but I see that as being modular in any instance, so it may
merely be that some storage modules come up clean on any start while
others will come up with existing objects cached.

Clustering
----------

I'm somewhat torn on clustering for traffic purposes. For admin and
management: yes, certainly; but starting to pass objects from one
machine in a cluster to another is likely to just be a waste of time
and code.

Today one can trivially fit 1TB into a 1U machine, so the
partitioning argument for cache clusters doesn't sound particularly
urgent to me.

If all machines in the cluster have sufficient cache capacity, the
other remaining argument is backend offloading, and that would likely
be better mitigated by implementing a 1:10 style two-layer cluster
with the second-level node possibly having twice the storage of the
front-row nodes.

The coordination necessary for keeping track of, or discovering in
real time, who has a given object can easily turn into a traffic and
CPU load nightmare. And from a performance point of view it only
reduces quality: first we send out a discovery multicast, then we
wait some amount of time to see if a response arrives, and only then
should we start to ask the backend for the object. With a two-level
cluster we can ask the layer-two node right away, and if it doesn't
have the object it can ask the backend right away; no timeout is
involved in that.

Finally, consider the impact on a cluster of a "must get" object like
an IMG tag with a misspelled URL. Every hit on the front page results
in one GET of the wrong URL.
One machine in the cluster asks everybody else in the cluster "do you
have this URL" every time somebody gets the front page. If we
implement a negative feedback protocol ("No I don't"), then each hit
on the wrong URL will result in N+1 packets (assuming multicast). If
we use a silent negative protocol the result is less severe for the
machine that got the request, but still everybody wakes up to find
out that no, we didn't have that URL. Negative caching can mitigate
this to some extent.

Privacy
-------

Configuration data and instructions passed back and forth should be
encrypted and signed if so configured. Using PGP keys is a very
tempting and simple solution which would pave the way for
administrators typing a short ascii-encoded pgp-signed message into
an SMS from their Bahamas beach vacation...

Implementation ideas
--------------------

The simplest storage method mmap(2)'s a disk or file and puts objects
into the virtual memory on page-aligned boundaries, using a small
struct for metadata. Data is not persistent across reboots. Object
free is incredibly cheap. Object allocation should reuse recently
freed space if at all possible. "First free hole" is probably a good
allocation strategy. Sendfile can be used if file-backed. If nothing
else, disks can be used by making a 1-file filesystem on them.

More complex storage methods are the object-per-file and
object-in-database models. They are relatively trivial and well
understood. May offer persistence.

Read-only storage methods may make sense for getting hold of static
emergency contents from CD-ROM etc.

Treat each disk arm as a separate storage unit and keep track of
service time (if possible) to decide storage scheduling.

Avoid regular expressions at runtime. If the config file contains
regexps, compile them into executable code and dlopen() it into the
Varnish process. Use versioning and refcounts to do memory management
on such segments.
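The "first free hole" strategy mentioned above can be sketched as a free-list over a flat arena. This is a minimal illustration under assumed structures (offsets in pages, a sorted hole list); the real thing would mmap(2) a file and hand out page-aligned regions:

```python
# Minimal "first free hole" allocator over a flat arena of pages.
# All names here are invented for illustration, not taken from Varnish.

class Arena:
    def __init__(self, pages):
        self.holes = [(0, pages)]  # sorted list of (offset, length) free runs

    def alloc(self, length):
        # First free hole: take the first run big enough.
        for i, (off, ln) in enumerate(self.holes):
            if ln >= length:
                if ln == length:
                    del self.holes[i]
                else:
                    self.holes[i] = (off + length, ln - length)
                return off
        return None  # arena full

    def free(self, off, length):
        # Reinsert the run and coalesce with adjacent holes.
        self.holes.append((off, length))
        self.holes.sort()
        merged = []
        for off, ln in self.holes:
            if merged and merged[-1][0] + merged[-1][1] == off:
                merged[-1] = (merged[-1][0], merged[-1][1] + ln)
            else:
                merged.append((off, ln))
        self.holes = merged

a = Arena(10)
x = a.alloc(4)          # pages 0-3
y = a.alloc(4)          # pages 4-7
a.free(x, 4)            # a hole opens at the front again
assert a.alloc(3) == 0  # "first free hole" reuses the freed space
```

Object free really is just putting a run back on the list, which is why it is "incredibly cheap"; the cost is concentrated in allocation and coalescing.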
Avoid committing transmit buffer space until we have a bandwidth
estimate for the client. One possible way: send the HTTP header and
time the ACKs coming back, then calculate the transmit buffer size
and send the object. This makes DoS attacks more harmless and
mitigates traffic stampedes.

Kill all TCP connections after N seconds; nobody waits an hour for a
web page to load.

Abuse mitigation interface to firewall/traffic shaping: allow the
central node to put an IP/net into, or take it out of, traffic
shaping firewall rules. A monitor/interface process (not the main
Varnish process) calls a script to configure the firewalling.

"Warm-up" instructions can take a number of forms and we don't know
what is the most efficient or most usable. Here are some ideas:

    Start at these URLs, then...
	... follow all links down to N levels.
	... follow all links that match REGEXP no deeper than N levels down.
	... follow N random links no deeper than M levels down.
	... load N objects by following random links no deeper than M levels down.
    But...
	... never follow any links that match REGEXP
	... never pick up objects larger than N bytes
	... never pick up objects older than T seconds

It makes a lot of sense to not actually implement this in the main
Varnish process, but rather to supply a template perl or python
script that primes the cache by requesting the objects through
Varnish. (That would require us to listen separately on 127.0.0.1 so
the script can get in touch with Varnish during warm-up.)

One interesting but quite likely overengineered option in the cluster
case is to have the central monitor track a fraction of the requests
through the logs of the running machines in the cluster, spot the hot
objects and tell the warming-up Varnish what objects to get and from
where.

In the cluster configuration, it is probably best to run the cluster
interaction in a separate process rather than in the main Varnish
process.
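The warm-up ideas listed above amount to a depth-limited crawl with an exclude pattern. A template priming script might look like the following sketch; the fetch function is pluggable (a real script would fetch each URL through the Varnish listener, e.g. on 127.0.0.1, so the cache fills up), and the tiny in-memory "site" below stands in for a backend:

```python
# Sketch of a cache-priming "warm-up" script: breadth-first link
# following with a depth limit and an exclude pattern, as described
# in the ideas above. All names are invented for illustration.
import re
from collections import deque

LINK = re.compile(r'href="([^"]+)"')

def warm_up(start_urls, fetch, max_depth=2, exclude=None):
    seen = set()
    queue = deque((u, 0) for u in start_urls)
    while queue:
        url, depth = queue.popleft()
        if url in seen or (exclude and exclude.search(url)):
            continue
        seen.add(url)
        body = fetch(url)  # requesting through Varnish is what primes the cache
        if depth < max_depth:
            for link in LINK.findall(body):
                queue.append((link, depth + 1))
    return seen

# Tiny in-memory "site" standing in for the backend:
site = {
    "/":         '<a href="/a">a</a> <a href="/skip.cgi">s</a>',
    "/a":        '<a href="/b">b</a>',
    "/b":        '<a href="/c">c</a>',
    "/skip.cgi": '',
}
got = warm_up(["/"], lambda u: site.get(u, ""),
              max_depth=2, exclude=re.compile(r'\.cgi$'))
assert got == {"/", "/a", "/b"}  # /c is deeper than 2 levels; /skip.cgi excluded
```

The "never pick up objects larger than N bytes / older than T seconds" rules would slot in next to the exclude check, using the response headers.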
From Varnish to the cluster, info would go through the shared memory,
but we don't want to implement locking in the shmem, so some sort of
back-channel (UNIX domain or UDP socket ?) is necessary.

If we have such a "supervisor" process, it could also be tasked with
restarting the varnish process if vital signs fail: a timestamp in
the shmem or kill -0 $pid.

It may even make sense to run the "supervisor" process in stand-alone
mode as well; there it can offer an HTML-based interface to the
Varnish process (via shmem).

For cluster use the user would probably just pass an extra argument
when starting up Varnish:

    varnish -c $cluster_args $other_args

vs

    varnish $other_args

and a "varnish" shell script will Do The Right Thing.

Shared memory
-------------

The shared memory layout needs to be thought about somewhat. On one
hand we want it to be stable enough to allow people to write programs
or scripts that inspect it; on the other hand doing it entirely in
ascii is both slow and prone to race conditions.

The various data types in the shared memory can either be put into
one single segment (= 1 file) or into individual segments
(= multiple files). I don't think the number of small data types is
big enough to make the latter impractical.

Storing the "big overview" data in shmem in ASCII or HTML would allow
one to point cat(1) or a browser directly at the mmaped file with no
interpretation necessary, a big plus in my book. Similarly, if we
don't update them too often, statistics could be stored in shared
memory in a perl/awk-friendly ascii format.

But the logfile will have to be (one or more) FIFO logs, probably at
least three in fact: good requests, bad requests, and exception
messages. If we decide to make log entries fixed length, we could
make them ascii so that a simple "sort -n /tmp/shmem.log" would put
them in order after a leading numeric timestamp, but it is probably
better to provide a utility to cat/tail -f the log and keep the log
in a bytestring FIFO format.
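The fixed-length ascii log-entry idea above is easy to demonstrate: if every record starts with a numeric timestamp, plain "sort -n" recovers chronological order with no special tooling. A sketch (the field widths are invented for illustration):

```python
# Fixed-length ascii log entries with a leading numeric timestamp, so
# that "sort -n" (emulated here with Python's sort) recovers time order.
# The record length and field widths are illustrative, not a real format.
REC_LEN = 48  # fixed record length, padded with spaces

def log_entry(ts, tag, msg):
    return f"{ts:010d} {tag:4s} {msg}".ljust(REC_LEN)[:REC_LEN]

entries = [
    log_entry(1139472600, "req", "GET /index.html 200"),
    log_entry(1139472515, "req", "GET / 200"),
    log_entry(1139472580, "err", "backend timeout"),
]
# All records are the same length, and sorting by the leading number
# (what "sort -n" would do) puts them in time order.
assert all(len(e) == REC_LEN for e in entries)
ordered = sorted(entries, key=lambda e: int(e.split()[0]))
assert [e.split()[0] for e in ordered] == [
    "1139472515", "1139472580", "1139472600"]
```

The trade-off mentioned in the text is visible here too: fixed-length ascii truncates long messages (`[:REC_LEN]`), which is one reason a length-prefixed bytestring FIFO plus a cat/tail utility may be the better format.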
Overruns should be marked in the output.

*END*

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From des at linpro.no Thu Feb 9 13:51:11 2006
From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=)
Date: Thu, 09 Feb 2006 14:51:11 +0100
Subject: My random thoughts
References: 
Message-ID: 

Poul-Henning Kamp writes:
> Here are my random thoughts on Varnish until now.

Thank you. I will try to take the time to read them and comment
tomorrow; I am currently busy preparing for a trade show early next
week.

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From des at linpro.no Fri Feb 10 12:59:21 2006
From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=)
Date: Fri, 10 Feb 2006 13:59:21 +0100
Subject: r2 - trunk
References: 
Message-ID: 

des at projects.linpro.no writes:
> Added:
>   trunk/LICENSE
> Log:
> Two-clause BSD license. Assign copyright to Linpro for now.

As discussed with Anders on the phone, the assignment of copyright to
Linpro is temporary until we figure out the legal situation. I just
talked to our CEO (just back from two weeks of jury duty); he is open
to the idea of setting up a foundation or some similar non-profit
legal entity, and will consult with our lawyer.

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From des at linpro.no Fri Feb 10 18:09:27 2006
From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=)
Date: Fri, 10 Feb 2006 19:09:27 +0100
Subject: My random thoughts
References: 
Message-ID: 

Poul-Henning Kamp writes:
> It is not enough to deliver a technically superior piece of software,
> if it is not possible for people to deploy it usefully in a sensible
> way and timely fashion.

I tend to favor usability over performance. I believe you tend to
favor performance over usability.
Hopefully, our opposing tendencies will combine and the result will
be a perfect balance ;)

> In both cases, it would be ideal if all that is necessary to tell
> Varnish are two pieces of information:
>
>     Storage location
>	  Alternatively we can offer an "auto" setting that makes
>	  Varnish discover what is available and use what it finds.

I want Varnish to support multiple storage backends:

 - quick and dirty squid-like hashed directories, to begin with
 - fancy block storage straight to disk (or to a large preallocated
   file) like you suggested
 - memcached

> Ideally this can be done on the commandline so that there is no
> configuration file to edit to get going, just
>
>     varnish -d /home/varnish -s backend.example.dom

This would use hashed directories if /home/varnish is a directory,
and block storage if it's a file or device node.

> We need to decide what to do about the cache when the Varnish
> process starts. There may be a difference between it starting for
> the first time after the machine booted and it being subsequently
> (re)started.

This might vary depending on which storage backend is used. With
memcached, for instance, there is a possibility that varnish
restarted but memcached is still running and still has a warm cache;
and if memcached also restarted, it will transparently obtain any
cached object from its peers. The disadvantage with memcached is that
we can't sendfile() from it.

> By far the easiest thing to do is to disregard the cache; that saves
> a lot of code for locating and validating the contents, but it
> carries a penalty in backend or cluster fetches whenever a node
> comes up. Let's call this the "transient cache model".

Another issue is that a persistent cache must store both data and
metadata on disk, rather than just store data on disk and metadata in
memory. This complicates not only the logic but also the storage
format.

> Can expired contents be served if we can't contact the
> backend ? (dangerous...)
Dangerous, but highly desirable in certain circumstances. I need to
locate the architecture notes I wrote last fall and place them
online; I spent quite some time thinking about and describing how
this could / should be done.

> It is a very good question how big a fraction of the persistent
> cache would be usable after typical downtimes:
>
>     After a Varnish process restart: nearly all.
>
>     After a power failure ? Probably at least half, but probably
>     not the half that contains the most busy pages.

When using direct-to-disk storage, we can (fairly) easily design the
storage format in such a way that updates are atomic, and make
liberal use of fsync() or similar to ensure (to the extent possible)
that the cache is in a consistent state after a power failure.

> Off the top of my head, I would prefer the transient model any day
> because of the simplicity and lack of potential consistency
> problems, but if the load on the backend is intolerable this may
> not be practically feasible.

How about this: we start with the transient model, and add
persistence later.

> If all machines in the cluster have sufficient cache capacity, the
> other remaining argument is backend offloading, and that would
> likely be better mitigated by implementing a 1:10 style two-layer
> cluster with the second-level node possibly having twice the
> storage of the front-row nodes.

Multiple cache layers may give rise to undesirable and possibly
unpredictable interaction (compare this to tunneling TCP/IP over TCP,
with both TCP layers battling each other's congestion control).

> Finally, consider the impact on a cluster of a "must get" object
> like an IMG tag with a misspelled URL. Every hit on the front page
> results in one GET of the wrong URL. One machine in the cluster
> asks everybody else in the cluster "do you have this URL" every
> time somebody gets the front page.
Not if we implement negative caching, which we have to anyway -
otherwise all those requests go to the backend, which gets bogged
down sending out 404s.

> If we implement a negative feedback protocol ("No I don't"), then
> each hit on the wrong URL will result in N+1 packets (assuming
> multicast).

Or we can just ignore queries for documents which we don't have; the
requesting node will simply request the document from the backend if
no reply arrives within a short timeout (~1s).

> Configuration data and instructions passed back and forth should
> be encrypted and signed if so configured. Using PGP keys is
> a very tempting and simple solution which would pave the way for
> administrators typing a short ascii-encoded pgp-signed message
> into an SMS from their Bahamas beach vacation...

Unfortunately, PGP is very slow, so it should only be used to
communicate with some kind of configuration server, not with the
cache itself.

> The simplest storage method mmap(2)'s a disk or file and puts
> objects into the virtual memory on page-aligned boundaries,
> using a small struct for metadata. Data is not persistent
> across reboots. Object free is incredibly cheap. Object
> allocation should reuse recently freed space if at all possible.
> "First free hole" is probably a good allocation strategy.
> Sendfile can be used if file-backed. If nothing else, disks
> can be used by making a 1-file filesystem on them.

hmm, I believe you can sendfile() /dev/zero if you use that trick to
get a private mmap()ed arena.

> Avoid regular expressions at runtime. If the config file contains
> regexps, compile them into executable code and dlopen() it
> into the Varnish process. Use versioning and refcounts to
> do memory management on such segments.

unlike regexps, globs can be evaluated very efficiently.
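The glob-vs-regexp point can be made concrete: a shell-style glob can be matched directly with a simple two-pointer scan and no compilation step at all. The following sketch implements the classic greedy-backtracking matcher (the stdlib fnmatch is used only to cross-check it; the function itself is illustrative):

```python
# A shell-style glob can be matched with a direct two-pointer scan,
# no compilation step needed. (Python's fnmatch does the same job by
# translating the glob to a regexp first.)
from fnmatch import fnmatch

def glob_match(pat, s):
    # Supports '*' (any run) and '?' (any one char); enough for URL filters.
    pi = si = 0
    star = star_si = -1
    while si < len(s):
        if pi < len(pat) and (pat[pi] == '?' or pat[pi] == s[si]):
            pi += 1; si += 1
        elif pi < len(pat) and pat[pi] == '*':
            star, star_si = pi, si   # remember the star; try matching empty
            pi += 1
        elif star != -1:             # backtrack: let the star eat one more char
            pi = star + 1
            star_si += 1
            si = star_si
        else:
            return False
    while pi < len(pat) and pat[pi] == '*':
        pi += 1
    return pi == len(pat)

for pat, url, want in [("/img/*.gif", "/img/logo.gif", True),
                       ("/img/*.gif", "/css/site.css", False),
                       ("/cgi-bin/*", "/cgi-bin/search", True)]:
    assert glob_match(pat, url) == want
    assert fnmatch(url, pat) == want   # agrees with the stdlib matcher
```

Whether compiling such patterns to C buys anything beyond this, as PHK suggests later in the thread, is exactly the open question the two are debating.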
> It makes a lot of sense to not actually implement this in the main
> Varnish process, but rather to supply a template perl or python
> script that primes the cache by requesting the objects through
> Varnish. (That would require us to listen separately on 127.0.0.1
> so the script can get in touch with Varnish during warm-up.)

This can easily be done with existing software like w3mir.

> One interesting but quite likely overengineered option in the
> cluster case is to have the central monitor track a fraction of the
> requests through the logs of the running machines in the cluster,
> spot the hot objects and tell the warming-up Varnish what objects
> to get and from where.

You can probably do this in ~50 lines of Perl using Net::HTTP.

> In the cluster configuration, it is probably best to run the
> cluster interaction in a separate process rather than in the main
> Varnish process. From Varnish to the cluster, info would go through
> the shared memory, but we don't want to implement locking in the
> shmem, so some sort of back-channel (UNIX domain or UDP socket ?)
> is necessary.

Distributed lock managers are *hard*... but we don't need locking for
simple stuff like reading logs out of shmem.

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From phk at phk.freebsd.dk Fri Feb 10 18:42:24 2006
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Fri, 10 Feb 2006 18:42:24 +0000
Subject: My random thoughts
In-Reply-To: Your message of "Fri, 10 Feb 2006 19:09:27 +0100."
Message-ID: <5868.1139596944@critter.freebsd.dk>

In message , Dag-Erling Smørgrav writes:

>Poul-Henning Kamp writes:
>> In both cases, it would be ideal if all that is necessary to tell
>> Varnish are two pieces of information:
>>
>>     Storage location
>>	   Alternatively we can offer an "auto" setting that makes
>>	   Varnish discover what is available and use what it finds.
>
>I want Varnish to support multiple storage backends:
>
> - quick and dirty squid-like hashed directories, to begin with

That's actually slow and dirty. So I'd prefer to wait with this one
until we know we need it (ie: persistence).

> - fancy block storage straight to disk (or to a large preallocated
>   file) like you suggested

This is actually the simpler one to implement: make one file, mmap
it, sendfile from it.

I don't see any advantage to memcached right off the bat, but I may
become wiser later on. Memcached is intended for when your app needs
a shared memory interface, which is then simulated using the network.
Our app is network oriented and we know a lot more about our data
than memcached would, so we can do the networking more efficiently
ourselves.

>> By far the easiest thing to do is to disregard the cache; that
>> saves a lot of code for locating and validating the contents, but
>> it carries a penalty in backend or cluster fetches whenever a node
>> comes up. Let's call this the "transient cache model".
>
>Another issue is that a persistent cache must store both data and
>metadata on disk, rather than just store data on disk and metadata
>in memory. This complicates not only the logic but also the storage
>format.

Yes, although we can get pretty far with mmap on this too.

>> It is a very good question how big a fraction of the persistent
>> cache would be usable after typical downtimes:
>>
>>     After a Varnish process restart: nearly all.
>>
>>     After a power failure ? Probably at least half, but probably
>>     not the half that contains the most busy pages.
>
>When using direct-to-disk storage, we can (fairly) easily design the
>storage format in such a way that updates are atomic, and make
>liberal use of fsync() or similar to ensure (to the extent possible)
>that the cache is in a consistent state after a power failure.

I meant "usable" as in "will be asked for", ie: usable for improving
the hitrate.
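The "make one file, mmap it, sendfile from it" recipe is short enough to sketch. Python can't show the zero-copy sendfile(2) serving path itself, so this only demonstrates the file-backed mapping with a page-aligned object slot; the path, arena size and offset are all invented for the example:

```python
# Sketch of the one-file store: preallocate a file, mmap it, and place
# objects at page-aligned offsets. sendfile(2) from the same file would
# then serve them without copying. All names/sizes are illustrative.
import mmap, os, tempfile

PAGE = mmap.PAGESIZE
path = os.path.join(tempfile.mkdtemp(), "varnish.store")
with open(path, "wb") as f:
    f.truncate(16 * PAGE)             # preallocate the arena

f = open(path, "r+b")
store = mmap.mmap(f.fileno(), 0)      # map the whole arena

body = b"<html>hello</html>"
off = 3 * PAGE                        # page-aligned slot, as the allocator would pick
store[off:off + len(body)] = body     # "storing an object" is just a memcpy
store.flush()

# Anything else mapping or reading the same file (or sendfile from it)
# sees the object at that offset:
with open(path, "rb") as g:
    g.seek(off)
    assert g.read(len(body)) == body
```

A small metadata struct per object (offset, length, expiry, checksum) kept in memory is all that is needed on top of this for the transient cache model; persistence is what forces the metadata onto disk too.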
>How about this: we start with the transient model, and add
>persistence later.

My idea exactly :-) Since I expect the storage to be pluggable, this
should be pretty straightforward.

>> If all machines in the cluster have sufficient cache capacity, the
>> other remaining argument is backend offloading, and that would
>> likely be better mitigated by implementing a 1:10 style two-layer
>> cluster with the second-level node possibly having twice the
>> storage of the front-row nodes.
>
>Multiple cache layers may give rise to undesirable and possibly
>unpredictable interaction (compare this to tunneling TCP/IP over
>TCP, with both TCP layers battling each other's congestion control)

I doubt it. The front-end Varnish fetches from the backend into its
store, and from there another thread will serve the users, so the two
TCP connections are not interacting directly.

>Or we can just ignore queries for documents which we don't have; the
>requesting node will simply request the document from the backend if
>no reply arrives within a short timeout (~1s).

I want to avoid any kind of timeouts like that. One slight bulge in
your load and everybody times out and hits the backend.

>Unfortunately, PGP is very slow, so it should only be used to
>communicate with some kind of configuration server, not with the
>cache itself.

Absolutely. My plan was to have the "management process" do that.

>unlike regexps, globs can be evaluated very efficiently.

But more efficiently still if compiled into C code.

>> It makes a lot of sense to not actually implement this in the main
>> Varnish process, but rather to supply a template perl or python
>> script that primes the cache by requesting the objects through
>> Varnish.
>This can easily be done with existing software like w3mir.
>[...]
>You can probably do this in ~50 lines of Perl using Net::HTTP.

Sounds like you just won this bite :-)

>Distributed lock managers are *hard*...

Nobody is talking about distributed lock managers.
The shared memory is strictly local to the machine and r/o by
everybody else than the main Varnish process.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From des at linpro.no Sat Feb 11 20:23:10 2006
From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=)
Date: Sat, 11 Feb 2006 21:23:10 +0100
Subject: My random thoughts
In-Reply-To: <5868.1139596944@critter.freebsd.dk> (Poul-Henning Kamp's
	message of "Fri, 10 Feb 2006 18:42:24 +0000")
References: <5868.1139596944@critter.freebsd.dk>
Message-ID: 

"Poul-Henning Kamp" writes:
> "Dag-Erling Smørgrav" writes:
> > Multiple cache layers may give rise to undesirable and possibly
> > unpredictable interaction (compare this to tunneling TCP/IP over
> > TCP, with both TCP layers battling each other's congestion
> > control)
> I doubt it. The front-end Varnish fetches from the backend into its
> store, and from there another thread will serve the users, so the
> two TCP connections are not interacting directly.

You took me a little too literally. What I meant is that we may see
undesirable interaction between the two layers, for instance in the
area of expiry handling (what will the front layer think when the
rear layer sends it expired documents?).

> > Unfortunately, PGP is very slow, so it should only be used to
> > communicate with some kind of configuration server, not with the
> > cache itself.
> Absolutely. My plan was to have the "management process" do that.

Hmm, we might as well go right ahead and call it a FEP :) (see
http://www.jargon.net/jargonfile/b/box.html if you didn't catch the
reference)

> > unlike regexps, globs can be evaluated very efficiently.
> But more efficiently still if compiled into C code.

I don't think so, but I may have overlooked something.
DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From andersb at vgnett.no Sun Feb 12 21:54:00 2006
From: andersb at vgnett.no (Anders Berg)
Date: Sun, 12 Feb 2006 22:54:00 +0100 (CET)
Subject: My random thoughts
Message-ID: <61614.193.213.34.102.1139781240.squirrel@denise.vg.no>

Good work guys. I had a great time reading the notes. Here comes the
sys.adm approach.

P.S. The sys.adm approach can easily be seen as an overengineered
solution; don't treat my approach as a must-have, more as a
nice-to-have.

>Notes on Varnish
>----------------
>
>Philosophy
>----------
>
>It is not enough to deliver a technically superior piece of software,
>if it is not possible for people to deploy it usefully in a sensible
>way and timely fashion.
>[...]
>If circumstances are not conducive to a structured approach, it
>should be possible to repeat this process and set up N independent
>Varnish boxes and get some sort of relief without having to read any
>further documentation.

I think these are reasonable scenarios and solutions.

>The subsequent (layers of) Varnish
>----------------------------------
>
>[...]
>When Varnish machines are put in a cluster, the administrator should
>be able to consider the cluster as a unit and not have to think
>about and interact with the individual nodes.

That would be great. Imho far too little software acts like this.
There could be a good reason for that, but I wouldn't know.

>Some sort of central management node or facility must exist and
>it would be preferable if this was not a physical but a logical
>entity so that it can follow the admin to the beach. Ideally it
>would give basic functionality in any browser, even mobile phones.

A web-browser interface and a CLI should cover 99% of use. An easy
protocol/API would make it possible for anybody to write their own
interface to the central management node.

>The focus here is scalability; we want to avoid per-machine
>configuration if at all possible.
Ideally, preconfigured hardware >can be plugged into power and net, find an address with DHCP, contact >preconfigured management node, get a configuration and start working. This would ease many things. If one makes an image of some sort, one does not have to change/make a new image for every config change (if that happens more often than software updates). >But we also need to think about how we avoid a site of Varnish >machines from acting like a stampeding horde when the power or >connectivity is brought back after a disruption. Some sort of >slow starting ("warm-up" ?) must be implemented to prevent them >from hitting the backend with full force. Yes. As you said in Oslo Poul, this could be a killer-app feature for some sites. >An important aspect of cluster operations is giving a statistically >meaningful judgement of the cluster size, in particular answering >the question "would adding another machine help ?" precisely. Is this possible? It would involve knowing how the backend is doing with added load. One thing is to measure how it's doing right now (response time), but to predict added load is hard. My guess is also that the only reason somebody would ask "would adding another machine help ?" was if the CPU or bandwidth was exhausted on the accelerator(s) in place, and one really needed to do something anyway. The only other reason I can think of is response time from the accelerator, and then we have the predict-load problem. >We should have a facility that allows the administrator to type >in a REGEXP/URL and have all the nodes answer with a checksum, age >and expiry timer for any documents they have which match. The >results should be grouped by URL and checksum. Not only the admin needs this. It's great when programmers/implementors need to debug how "good" the new/old application caches. In a world of rapid development, little or no time is often given to make/check the "cacheability" of the app.
A "check www.rapiddev.com/newapp/*" after a couple of clicks on the app could save developers a huge amount of time, and reduce backend load immensely. > >Technical concepts >------------------ > >We want the central Varnish process to be that, just one process, and >we want to keep it small and efficient at all cost. Yes. When you say 1 process, you mean 1 process per CPU/Core? >Code that will not be used for the central functionality should not >be part of the central process. For instance code to parse, validate >and interpret the (possibly) complex configuration file should be a >separate program. Let's list possible processes: 1. Varnish main. 2. Disk/storage process. 3. Config process/program. 4. Management process. 5. Logger/stats. >Depending on the situation, the Varnish process can either invoke >this program via a pipe or receive the ready to use data structures >via a network connection. > >Exported data from the Varnish process should be made as cheap as >possible, likely shared memory. That will allow us to deploy separate >processes for log-grabbing, statistics monitoring and similar >"off-duty" tasks and let the central process get on with the >important job. Sounds great. > >Backend interaction >------------------- > >We need a way to tune the backend interaction further than what the >HTTP protocol offers out of the box. > >We can assume that all documents we get from the backend have an >expiry timer; if not we will set a default timer (configurable of >course). > >But we need further policy than that. Amongst the questions we have >to ask are: > > How long after expiry can we serve a cached copy > of this document while we have reason to believe the backend > can supply us with an update ? > > How long after expiry can we serve a cached copy > of this document if the backend does not reply or is > unreachable ? > > If we cannot serve this document out of cache and the backend > cannot inform us, what do we serve instead (404 ?
A default > document of some sort ?) > > Should we just not serve this page at all if we are in a > bandwidth crush (DoS/stampede) situation ? You are correct. Did you mean to ask the user, or are these questions to answer in a specification? I think the best approach is to ask the user, and let him answer in the config. I can see as many answers to these questions (and more) as there are websites :) Also a site might answer differently in different scenarios. >It may also make sense to have an "emergency detector" which triggers >when the backend is overloaded and offer a scaling factor for all >timeouts for when in such an emergency state. Something like "If >the average response time of the backend rises above 10 seconds, >multiply all expiry timers by two". Good idea. Once again I opt for a config choice on that one. >It probably also makes sense to have a bandwidth/request traffic >shaper for backend traffic to prevent any one Varnish machine from >pummeling the backend in case of attacks or misconfigured >expiry headers. Good idea, but this one I am unsure about. The reason: one more thing that can make the accelerator behave in a way you don't understand. You are delivering stale documents from the accelerator. You start "debugging". "Hmm, most of the requests are served from the backend in a timely fashion..." You debug more and start examining the headers. I can see myself going through loads of different stuff, and then: "Ahh, the traffic shaper..." As I said, I like the idea, but too many backoff rules will make the sys.admin scratch his head even more. Can we come up with a way for Varnish to tell the sys.adm. "Hey, you are delivering stales here. Because ..." Or is this overengineering? > >Startup/consistency >------------------- > >We need to decide what to do about the cache when the Varnish >process starts. There may be a difference between it starting >for the first time after the machine booted and when it is subsequently >(re)started.
> >By far the easiest thing to do is to disregard the cache, that saves >a lot of code for locating and validating the contents, but this >carries a penalty in backend or cluster fetches whenever a node >comes up. Let's call this the "transient cache model" I agree with Dag here. Let's start with the "transient cache model" and add more later. We will discuss some scenarios at spec writing, and maybe come up with some models for later implementation. Better dig out those architecture notes Dag :) >The alternative is to allow persistently cached contents to be used >according to configured criteria: >[...] >The choice we make does affect the storage management part of Varnish, >but I see that as being modular in any instance, so it may merely be >that some storage modules come up clean on any start while others >will come up with existing objects cached. Ironically, at VG the stuff that can be cached long (JPGs, GIFs etc.) is cheap to generate, while the costly stuff, the documents that take CPU to make, cannot be cached long. I would not be surprised if it's like that many places. > >Clustering >---------- > >I'm somewhat torn on clustering for traffic purposes. For admin >and management: Yes, certainly, but starting to pass objects from >one machine in a cluster to another is likely to be just a waste >of time and code. > >Today one can trivially fit 1TB into a 1U machine so the partitioning >argument for cache clusters doesn't sound particularly urgent to me. > >If all machines in the cluster have sufficient cache capacity, the >other remaining argument is backend offloading, that would likely >be better mitigated by implementing a 1:10 style two-layer cluster >with the second level node possibly having twice the storage of >the front row nodes. I am also torn here. A part of me says: Hey, there is ICP v2 and such, let's use it, it's good economy.
Another part is thinking that ICP works at its best when you have many accelerators, and if Varnish can deliver what we hope, not many frontends are needed for most sites in the world :) At that level, you can for sure deliver the extra content ICP and such would save you from. I know that in saying that I am sacrificing design because of implementation, but there it is. >The coordination necessary for keeping track of, or discovering in >real-time, who has a given object can easily turn into a traffic >and cpu load nightmare. > >And from a performance point of view, it only reduces quality: >First we send out a discovery multicast, then we wait some amount >of time to see if a response arrives; only then should we start >to ask the backend for the object. With a two-level cluster >we can ask the layer-two node right away and if it doesn't have >the object it can ask the back-end right away, no timeout is >involved in that. A note. One of the reasons to be wary of two-level clusters in my opinion is that if you cache a document from the backend at the lowest level for say 2 min., and the level above comes and gets it 1 min. into those 2 min., looks up in its config and finds out this is a 2-min. cache document, the document will be 1 min. stale before a refresh. This could of course be solved with Expires tags, but it makes sys.adm's wary. Dag also noted problems with this when we have a two-layer approach and the first layer is in backoff mode. >Finally consider the impact on a cluster of a "must get" object >like an IMG tag with a misspelled URL. Every hit on the front page >results in one get of the wrong URL. One machine in the cluster >asks everybody else in the cluster "do you have this URL" every >time somebody gets the frontpage. >[...] >Negative caching can mitigate this to some extent. > > >Privacy >------- > >Configuration data and instructions passed forth and back should >be encrypted and signed if so configured.
Using PGP keys is >a very tempting and simple solution which would pave the way for >administrators typing a short ascii encoded pgp signed message >into a SMS from their Bahamas beach vacation... Bahamas? Vacation? :) > >Implementation ideas >-------------------- > >The simplest storage method mmap(2)'s a disk or file and puts >objects into the virtual memory on page aligned boundaries, >using a small struct for metadata. Data is not persistent >across reboots. Object free is incredibly cheap. Object >allocation should reuse recently freed space if at all possible. >"First free hole" is probably a good allocation strategy. >Sendfile can be used if filebacked. If nothing else disks >can be used by making a 1-file filesystem on them. > >More complex storage methods are object per file and object >in database models. They are relatively trivial and well >understood. May offer persistence. Dag says: >- quick and dirty squid-like hashed directories, to begin with > > - fancy block storage straight to disk (or to a large preallocated > file) like you suggested > > - memcached as Poul later comments, squid is slow and dirty. Let's try to avoid it. I am fine with fancy block storage, and I am tempted to suggest: Berkeley DB I have always pictured Varnish with a Berkeley DB backend. Why? I _think_ it is fast (only website info to go on here). http://www.sleepycat.com/products/bdb.html its block storage, and wildcard purge could potentially be as easy as: delete from table where URL like '%bye-bye%'; Another thing I am just gonna base on my wildest fantasies: could we use the Berkeley DB replication to make a cache up-to-date after downtime? Would be fun, wouldn't it? :) I also like memcached, and I am excited to hear Poul suggest that we build a "better" approach.
When I read that, I must admit that my first thought was that it would be really nice if this is a daemon/shm process that one can build a php (or whatever) interface against. This is out of scope, but imagine you have full access to the cache-data in php, if only in RO mode. That means you can build php apps with a superquick backend with loads of metadata. :) >Read-Only storage methods may make sense for getting hold >of static emergency contents from CD-ROM etc. Nice feature. >Treat each disk arm as a separate storage unit and keep track of >service time (if possible) to decide storage scheduling. > >Avoid regular expressions at runtime. If the config file contains >regexps, compile them into executable code and dlopen() it >into the Varnish process. Use versioning and refcounts to >do memory management on such segments. I smell a glob vs. compiled regexp showdown. Hehe. My only contrib here would be: don't do it in Java regexps :) >Avoid committing transmit buffer space until we have a bandwidth >estimate for the client. One possible way: Send HTTP header >and time ACKs getting back, then calculate transmit buffer size >and send object. This makes DoS attacks more harmless and >mitigates traffic stampedes. Yes. Are you thinking of writing a FreeBSD kernel module (accept_filter) for this? Like accf_http. >Kill all TCP connections after N seconds, nobody waits an hour >for a web-page to load. > >Abuse mitigation interface to firewall/traffic shaping: Allow >the central node to put an IP/Net into traffic shaping or take >it out of traffic shaping firewall rules. Monitor/interface >process (not main Varnish process) calls script to config >firewalling. This sounds like a really good feature. Hope it can be solved in Linux as well. Not sure they have the fancy IPFW filters etc. >"Warm-up" instructions can take a number of forms and we don't know >what is the most efficient or most usable. Here are some ideas: >[...]
> >One interesting but quite likely overengineered option in the >cluster case is if the central monitor tracks a fraction of the >requests through the logs of the running machines in the cluster, >spots the hot objects and tells the warming-up Varnish what objects >to get and from where. >>This can easily be done with existing software like w3mir. >>[...] >>You can probably do this in ~50 lines of Perl using Net::HTTP. >>>Sounds like you just won this bite :-) Nice :) But I am not sure this is as "easy" as it sounds at first. >In the cluster configuration, it is probably best to run the cluster >interaction in a separate process rather than the main Varnish >process. From Varnish to cluster, info would go through the shared >memory, but we don't want to implement locking in the shmem so >some sort of back-channel (UNIX domain or UDP socket ?) is necessary. > >If we have such a "supervisor" process, it could also be tasked >with restarting the varnish process if vital signs fail: A time >stamp in the shmem or kill -0 $pid. You have got to like programs that keep themselves alive. >It may even make sense to run the "supervisor" process in >standalone mode as well, where it can offer an HTML-based interface >to the Varnish process (via shmem). > >For cluster use the user would probably just pass an extra argument >when he starts up Varnish: > > varnish -c $cluster_args $other_args >vs > > varnish $other_args > >and a "varnish" shell script will Do The Right Thing. That's what we should aim at. >Shared memory >------------- > >The shared memory layout needs to be thought about somewhat. On one >hand we want it to be stable enough to allow people to write programs >or scripts that inspect it, on the other hand doing it entirely in >ascii is both slow and prone to race conditions. > >The various different data types in the shared memory can either be >put into one single segment (= 1 file) or into individual segments >(= multiple files).
I don't think the number of small data types will >be big enough to make the latter impractical. > >Storing the "big overview" data in shmem in ASCII or HTML would >allow one to point cat(1) or a browser directly at the mmaped file >with no interpretation necessary, a big plus in my book. > >Similarly, if we don't update them too often, statistics could be stored >in shared memory in perl/awk friendly ascii format. That would be a big plus, with the stats either in HTML or in ASCII at least. >But the logfile will have to be (one or more) FIFO logs, probably at least >three in fact: Good requests, Bad requests, and exception messages. And a debug log. The squid model is not too bad there, only poorly documented. In short it's a "binary configuration": 1=some part a, 4=some part b, ..., 128=some part i. Debug=133=a,b and i. I mentioned at the meeting some URLs that would provide some relevant reading: http://www.web-cache.com/ is old but good. It lists all relevant protocols: http://www.web-cache.com/Writings/protocols-standards.html and other written things: http://www.web-cache.com/writings.html Here is also the Hypertext Caching Protocol, an alternative and improvement to ICP, which I referred to as WCCP at the last meeting. Another RFC to take a look at might be the Web Cache Invalidation Protocol (WCIP). Here is what ESI.org has to say about WCIP: http://www.esi.org/tfaq.html#q8 And here is their approach: http://www.esi.org/invalidation_protocol_1-0.html Sorry about all the text :) P.S. I was not on the list when Poul wrote the first post, so I don't have the ID either. My post will come as a separate one. Anders Berg From andersb at vgnett.no Thu Feb 16 00:45:54 2006 From: andersb at vgnett.no (Anders Berg) Date: Thu, 16 Feb 2006 01:45:54 +0100 (CET) Subject: [Fwd: Re: My random thoughts] In-Reply-To: <2493.1139994855@critter.freebsd.dk> References: Your message of "Mon, 13 Feb 2006 10:10:23 +0100."
<52801.129.240.201.175.1139821823.squirrel@denise.vg.no> <2493.1139994855@critter.freebsd.dk> Message-ID: <65058.193.213.34.102.1140050754.squirrel@denise.vg.no> Thanks for the reply Poul. One thought that keeps coming back to me all the time is the need for a really well documented/well discussed/tested HTTP header strategy. It is crucial, and I believe we will spend much of our time on it next week and much more later. I do not think it is possible to cover all aspects in the spec alone. This is maybe stating the obvious, but I would rather state it so we all have time to ponder it. >>as Poul later comments, squid is slow and dirty. Let's try to avoid it. I >>am fine with fancy block storage, and I am tempted to suggest: Berkeley >> DB >>I have always pictured Varnish with a Berkeley DB backend. Why? I _think_ >>it is fast (only website info to go on here). > > We may want to use DB to hash urls into object identity, but I doubt we > will be putting the objects themselves into DB. Yes. Objects _could_ work fine for a website with ASCII text HTML pages and small JPEGs and GIFs, but anybody delivering "large" files and binaries would curse it. So I see the usage rather limited for objects. >>its block storage, and wildcard purge could potentially be as easy as: >>delete from table where URL like '%bye-bye%'; >>Another thing I am just gonna base on my wildest fantasies, could we use >>the Berkeley DB replication to make a cache up-to-date after downtime? >>Would be fun, wouldn't it? :) > > I fear it would be expensive. Considering that objects would be kept outside, this could work if the database held some more data, like how "hot" the object is; then parse ("select id from table order by hotness limit 200") it and fetch. But I see that it may be a lot more "effective" to do it the w3mir way Dag suggested. Hotness would be inserted from aggregated shm data? I note w3mir could maybe give us a license problem? Anyway, spec week is coming up and I am excited.
:) Anders Berg From phk at phk.freebsd.dk Thu Feb 16 10:09:17 2006 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Thu, 16 Feb 2006 10:09:17 +0000 Subject: [Fwd: Re: My random thoughts] In-Reply-To: Your message of "Thu, 16 Feb 2006 01:45:54 +0100." <65058.193.213.34.102.1140050754.squirrel@denise.vg.no> Message-ID: <4156.1140084557@critter.freebsd.dk> In message <65058.193.213.34.102.1140050754.squirrel at denise.vg.no>, "Anders Berg" writes: Let me just try to see if I can express the overall threading strategy I have formed without using a whiteboard: The [...] is which thread we're in. [acceptor] Incoming connections are handled by acceptfilters in a single thread, or, if acceptfilters are not available, with a single-threaded poll loop. [acceptor] Once a full HTTP request has been gathered, the URL is hashed and looked up to see if we have a hit or not. [acceptor] If we have a hit, and the object is in a "ready" state, a thread is pulled off the "sender" queue and given the request to complete. [sender] The object will be shipped out according to its state (it may still be arriving from the backend) and the HTTP headers. sendfile will be used if at all possible. Once done, the fd will be sent back to the acceptor if not closed {can we engage acceptfilters again ?} {We may ($config) engage in compression here and in such case we would embellish the object with the compressed version (up front) so it can be reused by other senders.} [acceptor] If we have a hit, but the object is not in a "ready" state (for instance we are trying to get the object from the backend, but haven't received any of it yet), the request is parked on the object. [acceptor] If we have no hit, the header needs to be analyzed (URL cleanup, rewriting, negative lookup etc etc). We could use a "sender" thread to do this, but I would rather not, in order to limit the amount of potentially expensive work we do here.
My initial thought therefore is to put the request into a queue to be dealt with by the "backend" threads. [backend] These threads will look for two kinds of work in order of priority: requests that need analysing and objects nearing expiration. [backend] Requests needing analysis are chewed upon according to the configured rules and one of four outcomes is possible: [backend] Invalid request. Grab a "sender" and ship out a static error-object. [backend] Rematched request (after analysis it matches an existing object): treat like the acceptor would for a hash hit. If configuration allows: add a new hash entry to put this URL on the fast track in the future. [backend] Unmatched request, cacheable (glob/regexp matching). Create object, queue request on it. Add hash entry. Initiate fetch from backend. When the HTTP header arrives, set expiry on the object accordingly. Once some data has arrived, grab a sender and pass it the object (NB: not the request). Receive full object. [backend] Unmatched request, uncacheable (glob/regexp matching). Create (transient) object. Initiate fetch from backend. Once some data has arrived, grab a sender thread and pass it the object. Receive full object. [backend] Near-expiry objects: Once an object nears expiry (defined by config) it is eligible for refresh. A backend thread will determine if the object is important enough (defined by config) compared to current backend responsiveness to be refreshed. If it is, a GET request is sent to the backend. (I'm not sure optimizing with a HEAD is worth much here; maybe a hybrid strategy: If the object has been refreshed before and a GET was necessary more often than not, then do GET, otherwise try HEAD first). [sender] When passed an object: If only one request is queued on the object, behave as if passed that request. If more than one request is queued, grab a sender for each and pass it that request. [sender] On transient object: Destroy object after transmission.
[any] If on attempting to pull a sender off the queue, none is available, the request or object is queued instead. [overseer] Monitor the number of sender threads and create/destroy them as appropriate. Sender threads go back to the front of the queue (for cache-efficiency reasons) and if they linger in the tail of the queue doing nothing for more than $config seconds, they get killed off. [overseer] Monitor backend responsiveness based on backend thread statistics. Switch between various policy states accordingly. [master] handle requests coming in via $channel from the janitor process. ... or something like that. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From des at linpro.no Fri Feb 17 12:47:26 2006 From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=) Date: Fri, 17 Feb 2006 13:47:26 +0100 Subject: My random thoughts References: <61614.193.213.34.102.1139781240.squirrel@denise.vg.no> Message-ID: "Anders Berg" writes: > "Dag-Erling Smørgrav" writes: > > - quick and dirty squid-like hashed directories, to begin with > as Poul later comments, squid is slow and dirty. Let's try to avoid it. I just mentioned it as a way of getting a storage backend up and running quickly so we can concentrate on other stuff. > I am fine with fancy block storage, and I am tempted to suggest: > Berkeley DB I have always pictured Varnish with a Berkeley DB > backend. Why? I _think_ it is fast (only website info to go on > here). > > http://www.sleepycat.com/products/bdb.html > > its block storage, and wildcard purge could potentially be as easy as: > delete from table where URL like '%bye-bye%'; Berkeley DB does not have an SQL interface or any kind of query engine.
> "Poul-Henning Kamp" writes: > > Abuse mitigation interface to firewall/traffic shaping: Allow > > the central node to put an IP/Net into traffic shaping or take > > it out of traffic shaping firewall rules. Monitor/interface > > process (not main Varnish process) calls script to config > > firewalling. > This sounds like a really good feature. Hope it can be solved in > Linux as well. Not sure they have the fancy IPFW filters etc. They have iptables and other equivalents. DES -- Dag-Erling Sm?rgrav Senior Software Developer Linpro AS - www.linpro.no From des at linpro.no Fri Feb 17 12:51:49 2006 From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=) Date: Fri, 17 Feb 2006 13:51:49 +0100 Subject: [Fwd: Re: My random thoughts] References: <4156.1140084557@critter.freebsd.dk> Message-ID: "Poul-Henning Kamp" writes: > [acceptor] If we have no hit, the header needs to be analyzed (URL > cleanup, rewriting, negative lookup etc etc). We could use a > "sender" thread to do this, but I would rather in order to limit > the amount of potentially expensive work we do here. My initial > thought therefore is to put the request into a queue to be dealt > with by the "backend" threads. The header always needs to be analyzed, as it may contain stuff like If-Modified-Since, Range, etc. DES -- Dag-Erling Sm?rgrav Senior Software Developer Linpro AS - www.linpro.no From phk at phk.freebsd.dk Fri Feb 17 13:26:58 2006 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri, 17 Feb 2006 13:26:58 +0000 Subject: [Fwd: Re: My random thoughts] In-Reply-To: Your message of "Fri, 17 Feb 2006 13:51:49 +0100." Message-ID: <15373.1140182818@critter.freebsd.dk> In message , Dag-Erling =?iso-8859-1?q?Sm=F8rgra v?= writes: >"Poul-Henning Kamp" writes: >> [acceptor] If we have no hit, the header needs to be analyzed (URL >> cleanup, rewriting, negative lookup etc etc). 
We could use a >> "sender" thread to do this, but I would rather in order to limit >> the amount of potentially expensive work we do here. My initial >> thought therefore is to put the request into a queue to be dealt >> with by the "backend" threads. > >The header always needs to be analyzed, as it may contain stuff like >If-Modified-Since, Range, etc. While those headers are relevant, they are of no use until we have the object in question, so we don't need to look at them until in the sender or backend thread. Since we only have one frontend thread, I want to minimize the amount of work it does to the absolute minimum. The number of sender and backend threads are variable and can/will be adjusted to fit the load. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From andersb at vgnett.no Fri Feb 17 17:11:45 2006 From: andersb at vgnett.no (Anders Berg) Date: Fri, 17 Feb 2006 18:11:45 +0100 (CET) Subject: My random thoughts In-Reply-To: References: <61614.193.213.34.102.1139781240.squirrel@denise.vg.no> Message-ID: <4263.195.139.5.194.1140196305.squirrel@denise.vg.no> > "Dag-Erling Sm?rgrav" writes: >> I am fine with fancy block storage, and I am tempted to suggest: >> Berkeley DB I have always pictured Varnish with a Berkley DB >> backend. Why? I _think_ it is fast (only website info to go on >> here). >> >> http://www.sleepycat.com/products/bdb.html and >> http://www.sleepycat.com/products/bdb.html >> >> its block storage, and wildcard purge could potentially be as easy as: >> delete from table where URL like '%bye-bye%'; > > Berkeley DB does not have an SQL interface or any kind of query > engine. Okay, I knew it did not have a SQL interface, but not that it did not deliver a query engine of some sort. 
Anyway Berkeley DB (now Oracle owned :)) does say this on their homepage: "Berkeley DB is the ideal choice for static queries over dynamic data, while traditional relational databases are well suited for dynamic queries over static data." I did not paste this in to argue that you and Berkeley have a different definition of queries :) But rather that the "queries" we are gonna use for this are the same, and the data dynamic. So at first glance it looks to be right for us if it's _fast_. But no fear, I can kill darlings :) >> "Poul-Henning Kamp" writes: >> > Abuse mitigation interface to firewall/traffic shaping: Allow >> > the central node to put an IP/Net into traffic shaping or take >> > it out of traffic shaping firewall rules. Monitor/interface >> > process (not main Varnish process) calls script to config >> > firewalling. >> This sounds like a really good feature. Hope it can be solved in >> Linux as well. Not sure they have the fancy IPFW filters etc. > > They have iptables and other equivalents. Brilliant. Now let's pray they work the way they should, and are dynamic :) Anders Berg From des at linpro.no Mon Feb 20 15:59:05 2006 From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=) Date: Mon, 20 Feb 2006 16:59:05 +0100 Subject: draft spec Message-ID: Here's a dump of what we've written down so far: http://varnish.projects.linpro.no/wiki/VarnishSpecDraft Please let me know if there are any glaring mistakes or if something seems to be headed in the wrong direction. I'd like to make one comment regarding the Components section - the various components are not necessarily separate threads, they're just distinct functional units, some of which may be implemented as threads with message passing or work queues while others are simply APIs. Oh, and we haven't started talking about management or logging, so this is all in the main Varnish process.
What I'd like to do tomorrow and on Wednesday is try to cover as much ground as possible without going into too much detail; we can save that for when Poul-Henning is here. DES -- Dag-Erling Sm?rgrav Senior Software Developer Linpro AS - www.linpro.no From des at linpro.no Wed Feb 22 18:55:32 2006 From: des at linpro.no (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=) Date: Wed, 22 Feb 2006 19:55:32 +0100 Subject: r16 - trunk/varnish-doc/share References: <20060222184402.061351ED520@projects.linpro.no> Message-ID: des at projects.linpro.no writes: > Modified: > trunk/varnish-doc/share/docbook-xml.css > Log: > Set correct mime-type. Since mod_dav_svn uses svn:mime-type as Content-Type, it is now possible to read the draft spec online, straight from the repo: http://varnish.projects.linpro.no/svn/trunk/varnish-doc/en/varnish-specification/article.xml It's not perfect - no TOC, no links, no bibliography - but it's enough to be able to read the text without first having to check out the DocBook source and run it through xsltproc. In the medium term, I will look into creating a lightweight XSL stylesheet for DocBook which will allow a web browser to transform DocBook to XHTML on the fly (the full DocBook XSL stylesheets are not well suited for that purpose) DES -- Dag-Erling Sm?rgrav Senior Software Developer Linpro AS - www.linpro.no From phk at phk.freebsd.dk Fri Feb 24 12:53:08 2006 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Fri, 24 Feb 2006 12:53:08 +0000 Subject: notes1 Message-ID: <20060224125308.27584BC6D@phk.freebsd.dk> Notes on Varnish ---------------- Collected 2006-02-24 to 2006-02-.. Poul-Henning Kamp ----------------------------------------------------------------------- Policy Configuration Policy is configured in a simple unidirectional (no loops, no goto) programming language which is compiled into 'C' and from there binary modules which are dlopen'ed by the main Varnish process. 
The dl object contains one exported symbol: a pointer to a structure which contains a reference count, a number of function pointers and a couple of string variables with identifying information. All access into the config is protected by the reference counts.

Multiple policy configurations can be loaded at the same time, but only one is the "active configuration". Loading, switching and unloading of policy configurations happen via the management process.

A global config sequence number is incremented on each switch, and policy-modified object attributes (ttl, cache/nocache) are all qualified by the config sequence under which they were calculated, and are invalid if a different policy is now in effect.

-----------------------------------------------------------------------
Configuration Language

XXX: include lines.

BNF:

  program:              function | program function
  function:             "sub" function_name compound_statement
  compound_statement:   "{" statements "}"
  statements:           /* empty */ | statement | statements statement
  statement:            if_statement | call_statement | "finish"
                        | assignment_statement | action_statement
  if_statement:         "if" condition compound_statement elif_parts else_part
  elif_parts:           /* empty */ | elif_part | elif_parts elif_part
  elif_part:            "elseif" condition compound_statement
                        | "elsif" condition compound_statement
                        | "else if" condition compound_statement
  else_part:            /* empty */ | "else" compound_statement
  call_statement:       "call" function_name
  assignment_statement: field "=" value
  field:                object | field "." variable
  action_statement:     action arguments
  arguments:            /* empty */ | arguments argument

-----------------------------------------------------------------------
Sample request policy program

sub request_policy {
    if (client.ip in 10.0.0.0/8) {
        no-cache
        finish
    }
    if (req.url.host ~ "cnn.no$") {
        rewrite s/cnn.no$/vg.no/
    }
    if (req.url.path ~ "cgi-bin") {
        no-cache
    }
    if (req.useragent ~ "spider") {
        no-new-cache
    }
    if (backend.response_time > 0.8s) {
        set req.ttlfactor = 1.5
    } elseif (backend.response_time > 1.5s) {
        set req.ttlfactor = 2.0
    } elseif (backend.response_time > 2.5s) {
        set req.ttlfactor = 5.0
    }
    /*
     * the program contains no references to
     * maxage, s-maxage and expires, so the
     * default handling (RFC2616) applies
     */
}

-----------------------------------------------------------------------
Sample fetch policy program

sub backends {
    set backend.vg.ip = {...}
    set backend.ads.ip = {...}
    set backend.chat.ip = {...}
    set backend.chat.timeout = 10s
    set backend.chat.bandwidth = 2000 MB/s
    set backend.other.ip = {...}
}

sub vg_backend {
    set backend.ip = {10.0.0.1-5}
    set backend.timeout = 4s
    set backend.bandwidth = 2000Mb/s
}

sub fetch_policy {

    if (req.url.host ~ "/vg.no$/") {
        set req.backend = vg
        call vg_backend
    } else {
        /* XXX: specify 404 page url ? */
        error 404
    }

    if (backend.response_time > 2.0s) {
        if (req.url.path ~ "/landbrugspriser/") {
            error 504
        }
    }

    fetch

    if (backend.down) {
        if (obj.exist) {
            set obj.ttl += 10m
            finish
        }
        switch_config ohhshit
    }
    if (obj.result == 404) {
        error 300 "http://www.vg.no"
    }
    if (obj.result != 200) {
        finish
    }
    if (obj.size > 256k) {
        no-cache
    } else if (obj.size > 32k && obj.ttl < 2m) {
        obj.ttl = 5m
    }
    if (backend.response_time > 2.0s) {
        set ttl *= 2.0
    }
}

sub prefetch_policy {
    if (obj.usage < 10 && obj.ttl < 5m) {
        fetch
    }
}

-----------------------------------------------------------------------
Purging

When a purge request comes in, the regexp is tagged with the next generation number and added to the tail of the list of purge regexps.
Before a sender transmits an object, it is checked against any purge regexps which have a higher generation number than the object; if one matches, the request is sent to a fetcher and the object purged.

If there were purge regexps with a higher generation to match, but they didn't match, the object is tagged with the current generation number and moved to the tail of the list. Otherwise, the object does not change generation number and is not moved on the generation list.

New objects are tagged with the current generation number and put at the tail of the list. Objects are removed from the generation list when deleted.

When a purge object has a lower generation number than the first object on the generation list, the purge object has been completed and will be removed. A log entry is written with the number of compares and the number of hits.

-----------------------------------------------------------------------
Random notes

Swap-backed storage.

Slowstart by config-flipping: the start-config has peer servers as backend; once the hitrate goes above a limit, the management process flips config to the 'real' config.

stat-object is always a URL, not a regexp.

Management + varnish process in one binary, comms via pipe.

Change from a config with long expiry to one with short expiry - how does the ttl drop? (The config sequence number invalidates all calculated/modified attributes.)

Mgt process holds a copy of the acceptor socket -> restart without lost client requests.

BW limit per client IP: create a shortlived object (<4 sec) to hold status. Enforce limits by delaying responses.
-----------------------------------------------------------------------
Source structure

libvarnish    library with interface facilities, for instance
              functions to open & read the shmem log
varnish       varnish sources in three classes

-----------------------------------------------------------------------
protocol cluster/mgt/varnish

object_query url         -> TTL, size, checksum
{purge,invalidate} regexp
object_status url        -> object metadata
load_config filename
switch_config configname
list_configs
unload_config
freeze                   # stop the clock, freezes the object store
thaw
suspend                  # stop acceptor accepting new requests
resume
stop                     # forced stop (exits) varnish process
start
restart                  = "stop;start"
ping $utc_time -> pong $utc_time    # cluster only
config_contents filename $inline -> compilation messages
stats [-mr] -> $data
zero stats
help

-----------------------------------------------------------------------
CLI (local)

import protocol from above
telnet localhost someport
authentication: password $secret
secret stored in {/usr/local}/etc/varnish.secret (400 root:wheel)

-----------------------------------------------------------------------
HTML (local)

php/cgi-bin
thttpd ? (alternatively direct from C-code.)
Everything the CLI can do + stats
popen("rrdtool");
log view

-----------------------------------------------------------------------
CLI (cluster)

import protocol from above, prefix machine/all
compound stats
accept / deny machine (?)
curses if you set termtype

-----------------------------------------------------------------------
HTML (cluster)

ditto
ditto
http://clustercontrol/purge?regexp=fslkdjfslkfdj
POST with list of regexp
authentication ? (IP access list)

-----------------------------------------------------------------------
Mail (cluster)

pgp signed emails with CLI commands

-----------------------------------------------------------------------
connection varnish -> cluster controller

Encryption: SSL
Authentication (?): IP number checks.
varnish -c clusterid -C mycluster_ctrl.vg.no

-----------------------------------------------------------------------
Files

/usr/local/sbin/varnish
    Contains mgt + varnish process.
    If -C argument, open SSL to cluster controller.
    Arguments:
        -p portnumber
        -c clusterid at cluster_controller
        -f config_file
        -m memory_limit
        -s kind[,storage-options]
        -l logfile,logsize
        -b backend ip...
        -d debug
        -u uid
        -a CLI_port
    KILL SIGTERM -> suspend, stop

/usr/local/sbin/varnish_cluster
    Cluster controller. Use syslog.
    Arguments:
        -f config file
        -d debug
        -u uid (?)

/usr/local/sbin/varnish_logger
    Logfile processor.
    -i shmemfile
    -e regexp  -o "/var/log/varnish.%Y%m%d.traffic"
    -e regexp2 -n "/var/log/varnish.%Y%m%d.exception" (NCSA format)
    -e regexp3 -s syslog_level,syslogfacility
    -r host:port    send via TCP, prefix hostname
    SIGHUP: reopen all files.

/usr/local/bin/varnish_cli
    Command line tool.

/usr/local/share/varnish/etc/varnish.conf
    default request + fetch + backend scripts

/usr/local/share/varnish/etc/rfc2616.conf
    RFC2616 compliant handling function

/usr/local/etc/varnish.conf
    (optional) request + fetch + backend scripts

/usr/local/share/varnish/etc/varnish.startup
    default startup sequence

/usr/local/etc/varnish.startup
    (optional) startup sequence

/usr/local/etc/varnish_cluster.conf
    XXX

{/usr/local}/etc/varnish.secret
    CLI password file.

-----------------------------------------------------------------------
varnish.startup

    load config /foo/bar startup_conf
    switch config startup_conf
    !mypreloadscript
    load config /foo/real real_conf
    switch config real_conf
    resume
    *eof*

From andersb at vgnett.no Fri Feb 24 13:55:00 2006
From: andersb at vgnett.no (Anders Berg)
Date: Fri, 24 Feb 2006 14:55:00 +0100 (CET)
Subject: Module-map Varnish
Message-ID: <2158.193.69.165.4.1140789300.squirrel@denise.vg.no>

A non-text attachment was scrubbed...
Name: varnish_prosesser.pdf
Type: application/pdf
Size: 26872 bytes
Desc: not available
URL:

From des at linpro.no Fri Feb 24 14:54:28 2006
From: des at linpro.no (Dag-Erling Smørgrav)
Date: Fri, 24 Feb 2006 15:54:28 +0100
Subject: r24 - in trunk/varnish-cache: . bin bin/varnishd include lib lib/libvarnish lib/libvarnishapi
References: <20060224143556.09D131ED51F@projects.linpro.no>
Message-ID:

des at projects.linpro.no writes:
> Added:
>    trunk/varnish-cache/Makefile.am
>    trunk/varnish-cache/autogen.sh
>    trunk/varnish-cache/bin/
>    trunk/varnish-cache/bin/Makefile.am
>    trunk/varnish-cache/bin/varnishd/
>    trunk/varnish-cache/bin/varnishd/Makefile.am
>    trunk/varnish-cache/bin/varnishd/varnishd.c
>    trunk/varnish-cache/configure.ac
>    trunk/varnish-cache/include/
>    trunk/varnish-cache/include/Makefile.am
>    trunk/varnish-cache/include/varnishapi.h
>    trunk/varnish-cache/lib/
>    trunk/varnish-cache/lib/Makefile.am
>    trunk/varnish-cache/lib/libvarnish/
>    trunk/varnish-cache/lib/libvarnish/Makefile.am
>    trunk/varnish-cache/lib/libvarnishapi/
>    trunk/varnish-cache/lib/libvarnishapi/Makefile.am
> Log:
> Source tree structure as agreed.

To build, first make sure you have GNU autotools installed (FreeBSD: devel/gnu-autoconf, devel/gnu-automake, devel/gnu-libtool). Check out the sources, run autogen.sh to generate the configure script, then configure && make && make install as usual.

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From des at linpro.no Mon Feb 27 10:01:57 2006
From: des at linpro.no (Dag-Erling Smørgrav)
Date: Mon, 27 Feb 2006 11:01:57 +0100
Subject: RFC: namespaces
Message-ID:

Varnish is going to consist of quite a bit of code, and I'd like to keep separate modules in separate namespaces. I'd like to suggest the following convention:

- All external symbols get a three-letter prefix followed by an underscore.
- The first letter is always v for Varnish.
- The next two letters identify the module the symbol belongs to. Each module gets a unique two-letter mnemonic code.

For instance, we could assign the two-letter code "lo" to the logger; logging functions would be named e.g. vlo_emit(), and log-related preprocessor macros would be named e.g. VLO_LEVEL_DEBUG.

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From phk at phk.freebsd.dk Mon Feb 27 11:55:01 2006
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Mon, 27 Feb 2006 11:55:01 +0000
Subject: RFC: namespaces
In-Reply-To: Your message of "Mon, 27 Feb 2006 11:01:57 +0100."
Message-ID: <903.1141041301@critter.freebsd.dk>

In message , Dag-Erling Smørgrav writes:
>For instance we could assign the two-letter code "lo" to the logger;
>logging functions would be named e.g. vlo_emit(), and log-related
>preprocessor macros would be named e.g. VLO_LEVEL_DEBUG.

Sounds a bit like overkill to me, but if it makes you happy I can live with it.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From des at linpro.no Mon Feb 27 13:56:58 2006
From: des at linpro.no (Dag-Erling Smørgrav)
Date: Mon, 27 Feb 2006 14:56:58 +0100
Subject: RFC: namespaces
References: <903.1141041301@critter.freebsd.dk>
Message-ID:

"Poul-Henning Kamp" writes:
> "Dag-Erling Smørgrav" writes:
> > For instance we could assign the two-letter code "lo" to the logger;
> > logging functions would be named e.g. vlo_emit(), and log-related
> > preprocessor macros would be named e.g. VLO_LEVEL_DEBUG.
> Sounds a bit like overkill to me, but if it makes you happy I can
> live with it.
Well, the alternatives are worse IMHO:

open_log(const char *);          /* conflicts with libfoobar */
varnish_open_log(const char *);  /* too long */
open(const char *);              /* oops */
vlo_open(const char *);          /* that's better! */

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From des at linpro.no Mon Feb 27 14:18:01 2006
From: des at linpro.no (Dag-Erling Smørgrav)
Date: Mon, 27 Feb 2006 15:18:01 +0100
Subject: r24 - in trunk/varnish-cache: . bin bin/varnishd include lib lib/libvarnish lib/libvarnishapi
References: <20060224143556.09D131ED51F@projects.linpro.no>
Message-ID:

des at linpro.no (Dag-Erling Smørgrav) writes:
> To build, first make sure you have GNU autotools installed (FreeBSD:
> devel/gnu-autoconf, devel/gnu-automake, devel/gnu-libtool). Check out
> the sources, run autogen.sh to generate the configure script, then
> configure && make && make install as usual.

I forgot to add that the recommended configure command line for developers is the following:

$ ./configure --enable-pedantic --enable-wall --enable-werror \
      --enable-dependency-tracking

I need to get this all written down in the wiki...

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no

From phk at phk.freebsd.dk Mon Feb 27 17:29:07 2006
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Mon, 27 Feb 2006 17:29:07 +0000
Subject: RFC: namespaces
In-Reply-To: Your message of "Mon, 27 Feb 2006 14:56:58 +0100."
Message-ID: <2100.1141061347@critter.freebsd.dk>

In message , Dag-Erling Smørgrav writes:
>Well, the alternatives are worse IMHO:
>
>open_log(const char *);          /* conflicts with libfoobar */
>varnish_open_log(const char *);  /* too long */
>open(const char *);              /* oops */
>vlo_open(const char *);          /* that's better! */

Well, I would tend to consider the varnish process + mgt_process namespace "private" and therefore not in need of a lot of prefix, whereas for the public API I fully agree a prefix is in order.
But as I said: I can live with it.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.